oam/knowledge base/dblab engine.md
2025-08-18 23:31:39 +02:00


DBLab engine

Creates instant, full-size clones of PostgreSQL databases.
Mainly used to test database migrations, optimize SQL, or deploy full-size staging apps.

Can be self-hosted.
The website hosts the SaaS version.

  1. TL;DR
  2. Further readings
    1. Sources

TL;DR

DBLab Engine leverages thin clones to provide full-sized database environments in seconds, regardless of the source database's size.
It relies on copy-on-write (CoW) filesystem technologies (currently ZFS or LVM) to provide efficient storage and provisioning for database clones.

It relies on Docker containers to isolate and run a PostgreSQL instance for each clone.
Each clone gets its own network port.
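
The range ports are drawn from can be set in the engine's configuration. A hedged sketch of the relevant server.yml fragment (key names follow recent DLE versions and may differ in yours):

```yaml
# Illustrative fragment of server.yml; key names are assumptions, check your DLE version.
provision:
  portPool:
    # Each clone's PostgreSQL container is published on a port from this range.
    from: 6000
    to: 6100
```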

The Retrieval Service acquires data from source PostgreSQL databases and prepares it for cloning.
It supports:

  • Physical retrieval, by using physical backup methods like pg_basebackup, WAL-G, or pgBackRest to copy the entire PGDATA directory.
  • Logical retrieval, by using logical dump and restore tools like pg_dump and pg_restore to copy database objects and data.

Important

Managed PostgreSQL databases in cloud environments (e.g., AWS RDS) support only logical synchronization.
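
In server.yml, retrieval is configured as a list of jobs plus a per-job spec. A hedged sketch for logical mode (job and option names are assumptions based on recent DLE versions; the host is a hypothetical placeholder):

```yaml
# Illustrative retrieval section of server.yml for logical mode.
retrieval:
  jobs:
    - logicalDump
    - logicalRestore
  spec:
    logicalDump:
      options:
        source:
          type: remote
          connection:
            host: source.example.com  # hypothetical source host
            port: 5432
            dbname: postgres
            username: postgres
```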

The Pool Manager manages storage pools and filesystem operations.
It abstracts the underlying filesystem (ZFS or LVM) and provides a consistent interface for snapshot and clone operations.
It supports different pools, each with its own independent configuration and filesystem manager.
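
Pool layout on disk is also driven by server.yml. A hedged sketch of the relevant section (key names are assumptions based on recent DLE versions):

```yaml
# Illustrative poolManager section of server.yml.
poolManager:
  # Directory under which storage pools are mounted; each pool gets a subdirectory.
  mountDir: /var/lib/dblab
  # Subdirectory of each pool that holds the PostgreSQL data directory.
  dataSubDir: data
```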

The Provisioner manages the resources needed to run database clones and handles their lifecycle.
It creates and manages PostgreSQL instances by:

  • allocating network ports to them from a pool,
  • creating and managing the containers they run in,
  • mounting filesystem clones for them to use,
  • configuring them.

The Cloning Service orchestrates the overall process of creating and managing database clones by coordinating the Provisioner and Pool Manager to fulfill cloning requests from clients.

The API Server exposes RESTful HTTP endpoints that allow creating and managing clones, viewing snapshots, and monitoring the system's status.
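
Access to the API is gated by a token set in server.yml. A hedged sketch (key names are assumptions based on recent DLE versions; the token value is a placeholder):

```yaml
# Illustrative server section of server.yml.
server:
  # Token clients must present with API requests
  # (commonly sent in a Verification-Token header).
  verificationToken: "secret_token"
  # Port the HTTP API listens on.
  port: 2345
```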

Database Lab Engine uses a YAML-based configuration file, which is loaded at startup and can be reloaded at runtime.
It is located at ~/.dblab/engine/configs/server.yml by default.

Metadata files are located at ~/.dblab/engine/meta by default.
The metadata folder must be writable.
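
A quick sanity check that the default metadata folder exists and is writable (the path is the default mentioned above):

```shell
#!/bin/sh
# Verify the DBLab metadata folder exists and is writable.
META_DIR="${HOME}/.dblab/engine/meta"
mkdir -p "${META_DIR}"
if [ -w "${META_DIR}" ]; then
  echo "metadata folder is writable"
else
  echo "metadata folder is NOT writable" >&2
  exit 1
fi
```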

# Reload the configuration without downtime.
docker exec -it 'dblab_server' kill -SIGHUP 1

# Follow logs.
docker logs --since '1m' -f 'dblab_server'
docker logs --since '2024-05-01' -f 'dblab_server'
docker logs --since '2024-08-01T23:11:35' -f 'dblab_server'

Before the Database Lab Engine (DLE) can create thin clones, it must first obtain a full copy of the source database.
The initial data retrieval process is also referred to as thick cloning, and is typically a one-time or a scheduled operation.

Each clone runs in its own PostgreSQL container, and its configuration can be customized.
The starting point for clone databases' configuration is ~/.dblab/postgres_conf/postgresql.conf.
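
A hedged example of what that starting-point file might contain; the parameters are standard PostgreSQL settings, but the values are illustrative, not recommendations:

```
# ~/.dblab/postgres_conf/postgresql.conf -- applied to each new clone.
shared_buffers = '1GB'
work_mem = '64MB'
maintenance_work_mem = '256MB'
# Log statements slower than 500 ms, useful when testing query optimizations.
log_min_duration_statement = 500
```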

Database clones come as thick or thin clones.

Thick clones work as a normal replica would, continuously synchronizing with their source database.

Thin clones:

  1. Trigger the creation of a dedicated filesystem snapshot.
  2. Spin up a local database container that mounts that snapshot as a volume.

The creation speed of thin clones does not depend on the database's size.

When thin clones are involved, DLE periodically creates a new snapshot from the source database, and maintains a set of them.
When requesting a new clone, users choose which snapshot to use as its base.
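
The cadence of snapshot creation is driven by the engine configuration. A hedged sketch (key names are assumptions based on recent DLE versions):

```yaml
# Illustrative refresh schedule in server.yml.
retrieval:
  refresh:
    # Cron-style schedule: take a fresh snapshot from the source weekly.
    timetable: "0 0 * * 1"
```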

Container images for the Community edition are available at https://gitlab.com/postgres-ai/custom-images.
Specialized images for the Standard and Enterprise editions only are available at https://gitlab.com/postgres-ai/se-images/container_registry/.

Further readings

Sources