chore(dblab-engine): expand notes

Michele Cereda
2025-08-18 23:31:39 +02:00
parent 9089c3c981
commit f72db3ce28
9 changed files with 243 additions and 103 deletions


@@ -30,8 +30,10 @@
1. [Resource tagging](#resource-tagging)
1. [API](#api)
1. [Python](#python)
1. [Container images](#container-images)
1. [Amazon Linux](#amazon-linux)
1. [Further readings](#further-readings)
1. [Sources](#sources)
## TL;DR
@@ -868,6 +870,21 @@ machine if not.
</details>
## Container images
### Amazon Linux
Refer to [Pulling the Amazon Linux container image].
Amazon Linux container images are **infamous** for having issues when connecting to their package repositories from
**outside** of AWS' network.<br/>
While they can connect to them _sometimes™_ when running locally, one gets much easier and more consistent results by
just running them from **inside** AWS.
When running the container locally, disconnect from the VPN, start the container, and reconnect to the VPN before
installing packages.<br/>
If possible, prefer building the image from an EC2 instance.
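The local workaround above can be sketched as two shell helpers. This is a minimal sketch and not an official
procedure: the function names, the container name, and the `DOCKER` override are hypothetical, and the image reference
assumes the Amazon Linux 2023 image published on Amazon ECR Public. Toggling the VPN is left as a manual step, since it
depends on the VPN client in use.

```sh
# Hypothetical helpers. Set DOCKER='echo' to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"
IMAGE='public.ecr.aws/amazonlinux/amazonlinux:2023'
CONTAINER='al2023-sketch'

start_container() {
  # Run this while DISCONNECTED from the VPN.
  # 'sleep infinity' only keeps the container alive for the later 'exec'.
  "$DOCKER" run --detach --name "$CONTAINER" "$IMAGE" sleep infinity
}

install_packages() {
  # Run this after RECONNECTING to the VPN.
  "$DOCKER" exec "$CONTAINER" dnf --assumeyes install "$@"
}

# Usage:
#   start_container                  # off the VPN
#   install_packages 'tar' 'gzip'    # back on the VPN
```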
## Further readings
- [Learn AWS]
@@ -1001,11 +1018,13 @@ machine if not.
[what is amazon vpc?]: https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html
[what is aws config?]: https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html
[what is aws global accelerator?]: https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html
[Pulling the Amazon Linux container image]: https://docs.aws.amazon.com/AmazonECR/latest/userguide/amazon_linux_container_image.html
<!-- Others -->
[a guide to tagging resources in aws]: https://medium.com/@staxmarketing/a-guide-to-tagging-resources-in-aws-8f4311afeb46
[automating dns-challenge based letsencrypt certificates with aws route 53]: https://johnrix.medium.com/automating-dns-challenge-based-letsencrypt-certificates-with-aws-route-53-8ba799dd207b
[aws config tutorial by stephane maarek]: https://www.youtube.com/watch?v=qHdFoYSrUvk
[AWS Fundamentals Blog]: https://awsfundamentals.com/blog
[aws savings plans vs. reserved instances: when to use each]: https://www.cloudzero.com/blog/savings-plans-vs-reserved-instances/
[date & time policy conditions at aws - 1-minute iam lesson]: https://www.youtube.com/watch?v=4wpKP1HLEXg
[difference in boto3 between resource, client, and session?]: https://stackoverflow.com/questions/42809096/difference-in-boto3-between-resource-client-and-session
@@ -1017,4 +1036,3 @@ machine if not.
[using aws kms via the cli with a symmetric key]: https://nsmith.net/aws-kms-cli
[VPC Endpoints: Secure and Direct Access to AWS Services]: https://awsfundamentals.com/blog/vpc-endpoints
[What Is OIDC and Why Do We Need It?]: https://awsfundamentals.com/blog/oidc-introduction
[AWS Fundamentals Blog]: https://awsfundamentals.com/blog


@@ -1,93 +0,0 @@
# Database Lab
Database Lab Engine is an open-source platform developed by Postgres.ai to create instant, full-size clones of
production databases.<br/>
Use cases of the clones are to test database migrations, optimize SQL, or deploy full-size staging apps.
The website <https://Postgres.ai/> hosts the SaaS version of the Database Lab Engine.
Configuration file examples are available at <https://gitlab.com/postgres-ai/database-lab/-/tree/v3.0.0/configs>.
1. [Engine](#engine)
1. [Clones](#clones)
1. [Further readings](#further-readings)
1. [Sources](#sources)
## Engine
Config file in YAML format, at `~/.dblab/engine/configs/server.yml` by default.
Metadata files at `~/.dblab/engine/meta` by default. The metadata folder **must be writable**.
```sh
# Reload the configuration without downtime.
docker exec -it 'dblab_server' kill -SIGHUP 1
# Follow logs.
docker logs --since '1m' -f 'dblab_server'
docker logs --since '2024-05-01' -f 'dblab_server'
docker logs --since '2024-08-01T23:11:35' -f 'dblab_server'
```
Images for the _Standard_ and _Enterprise_ editions are available at
<https://gitlab.com/postgres-ai/se-images/container_registry/>.<br/>
Images for the _Community_ edition are available at <https://gitlab.com/postgres-ai/custom-images>.
## Clones
Database clones comes in two flavours:
- _Thick_ cloning: the regular way to copy data.<br/>
It is also how data is copied to Database Lab the first time a source is added.
Thick clones can be:
- _Logical_: do a regular dump and restore using `pg_dump` and `pg_restore`.
- _Physical_: done using `pg_basebackup` or restoring data from physical archives created by backup tools such as
WAL-E/WAL-G, Barman, pgBackRest, or pg_probackup.
> Managed PostgreSQL databases in cloud environments (e.g.: AWS RDS) support only the logical clone type.
The Engine supports continuous synchronization with the source databases.<br/>
Achieved by repeating the thick cloning method one initially used for the source.
- _Thin_ cloning: local containerized database clones based on CoW (Copy-on-Write) spin up in few seconds.<br/>
They share most of the data blocks, but logically they look fully independent.<br/>
The speed of thin cloning does **not** depend on the database size.
As of 2024-06, Database Lab Engine supports ZFS and LVM for thin cloning.<br/>
With ZFS, the Engine periodically creates a new snapshot of the data directory and maintains a set of snapshots. When
requesting a new clone, users choose which snapshot to use as base.
Clone DBs configuration starting point is at `~/.dblab/postgres_conf/postgresql.conf`.
## Further readings
- [Website]
- [Main repository]
- [Documentation]
- [`dblab`][dblab]
- [Installation guide for DBLab Community Edition][how to install dblab manually]
### Sources
- [Database Lab Engine configuration reference]
<!--
Reference
═╬═Time══
-->
<!-- In-article sections -->
<!-- Knowledge base -->
[dblab]: dblab.md
<!-- Files -->
<!-- Upstream -->
[database lab engine configuration reference]: https://postgres.ai/docs/reference-guides/database-lab-engine-configuration-reference
[documentation]: https://postgres.ai/docs/
[how to install dblab manually]: https://postgres.ai/docs/how-to-guides/administration/install-dle-manually
[main repository]: https://gitlab.com/postgres-ai/database-lab
[website]: https://postgres.ai/
<!-- Others -->


@@ -0,0 +1,123 @@
# DBLab engine
Creates **instant**, **full-size** clones of PostgreSQL databases.<br/>
Mainly used to test database migrations, optimize SQL, or deploy full-size staging apps.
Can be self-hosted.<br/>
The [website] hosts the SaaS version.
1. [TL;DR](#tldr)
1. [Further readings](#further-readings)
1. [Sources](#sources)
## TL;DR
It leverages thin clones to provide full-sized database environments in seconds, regardless of the source database's
size.<br/>
It relies on copy-on-write (CoW) filesystem technologies (currently ZFS or LVM) to provide efficient storage and
provisioning for database clones.
Relies on Docker containers to isolate and run PostgreSQL instances for each clone.<br/>
Each clone gets its own network port.
The _Retrieval Service_ acquires data from source PostgreSQL databases and prepares it for cloning.<br/>
It supports:
- **Physical** retrieval, by using physical backup methods like `pg_basebackup`, WAL-G, or `pgBackRest` to copy the
entire `PGDATA` directory.
- **Logical** retrieval, by using logical dump and restore tools like `pg_dump` and `pg_restore` to copy database
objects and data.
> [!important]
> Managed PostgreSQL databases in cloud environments (e.g.: AWS RDS) support only logical synchronization.
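To make the difference between the two retrieval modes concrete, the following sketch runs the underlying PostgreSQL
tools by hand. It is an illustration only, and does **not** show what the Retrieval Service executes internally: the
helper names, hosts, users, and paths are hypothetical, and commands are only printed unless `DRY_RUN` is set to `0`.

```sh
# Illustrative only: hosts, users, database names, and paths are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode (the default), execute it otherwise.
  if [ "$DRY_RUN" = '1' ]; then echo "$*"; else "$@"; fi
}

physical_copy() {
  # Args: source host, replication user, target directory.
  # Copies the whole PGDATA directory of a running server.
  run pg_basebackup --host="$1" --username="$2" --pgdata="$3" --wal-method='stream'
}

logical_copy() {
  # Args: source host, user, database, dump directory, target host.
  # Dumps a single database in directory format, then restores it.
  # The target database must already exist on the target server.
  run pg_dump --host="$1" --username="$2" --format='directory' --file="$4" "$3"
  run pg_restore --host="$5" --username="$2" --dbname="$3" "$4"
}

# Usage:
#   physical_copy 'db.example.org' 'replicator' '/tmp/pgdata'
#   logical_copy 'db.example.org' 'postgres' 'app' '/tmp/app.dump' 'localhost'
```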
The _Pool Manager_ manages storage pools and filesystem operations.<br/>
It abstracts the underlying filesystem (ZFS or LVM) and provides a consistent interface for snapshot and clone
operations.<br/>
It supports different pools, each with its own **independent** configuration and filesystem manager.
The _Provisioner_ manages the resources needed to run database clones, and handles the clones' lifecycle.<br/>
It creates and manages PostgreSQL instances by allocating network ports to them from a pool, creating and managing the
containers they run on, mounting filesystem clones for them to use, and configuring them.
The _Cloning Service_ orchestrates the overall process of creating and managing database clones by coordinating the
Provisioner and Pool Manager to fulfill cloning requests from clients.
The _API Server_ exposes RESTful HTTP endpoints that allow creating and managing clones, viewing snapshots, and
monitoring the system's status.
Database Lab Engine uses a YAML-based configuration file, which is loaded at startup and **can be reloaded at
runtime**.<br/>
It is located at `~/.dblab/engine/configs/server.yml` by default.
Metadata files are located at `~/.dblab/engine/meta` by default.<br/>
The metadata's folder **must be writable**.
```sh
# Reload the configuration without downtime.
docker exec -it 'dblab_server' kill -SIGHUP 1
# Follow logs.
docker logs --since '1m' -f 'dblab_server'
docker logs --since '2024-05-01' -f 'dblab_server'
docker logs --since '2024-08-01T23:11:35' -f 'dblab_server'
```
Before DLE (Database Lab Engine) can create thin clones, it must first obtain a **full** copy of the source
database.<br/>
The initial data retrieval process is also referred to as _thick cloning_, and is typically a one-time or a scheduled
operation.
Each clone runs in its own PostgreSQL container, and its configuration can be customized.<br/>
The starting point for the clones' configuration is at `~/.dblab/postgres_conf/postgresql.conf`.
Database clones come as _thick_ or _thin_ clones.
Thick clones work as a normal replica would, **continuously** synchronizing with their source database.
Thin clones:
1. Prompt the creation of a dedicated filesystem snapshot.
1. Spin up a local database container that mounts that snapshot as a volume.
The creation speed of thin clones does **not** depend on the database's size.
When thin clones are involved, DLE **periodically** creates a new snapshot from the source database, and maintains a
set of them.<br/>
When requesting a new clone, users choose which snapshot to use as its base.
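The thin cloning flow described above can be approximated by hand on ZFS. This is a rough sketch under stated
assumptions, and **not** the Engine's actual implementation: the helper names, pool and dataset names, container name,
and image tag are hypothetical. It also assumes that the clone is mounted at ZFS' default mount point (`/<dataset>`)
and that the image's PostgreSQL major version matches the copied data. Commands are only printed unless `DRY_RUN` is
set to `0`.

```sh
# Illustrative only: NOT how DLE is implemented. All names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode (the default), execute it otherwise.
  if [ "$DRY_RUN" = '1' ]; then echo "$*"; else "$@"; fi
}

thin_clone() {
  # Args: source dataset, snapshot name, clone dataset, host port.
  # 1. Take a read-only, point-in-time snapshot of the dataset holding PGDATA.
  run zfs snapshot "$1@$2"
  # 2. Create a writable copy-on-write clone of that snapshot.
  #    It shares unchanged blocks with the snapshot, so it is created quickly regardless of size.
  run zfs clone "$1@$2" "$3"
  # 3. Start a PostgreSQL container on top of the clone, published on its own port.
  run docker run --detach --name "clone_$4" --publish "$4:5432" \
    --volume "/$3:/var/lib/postgresql/data" 'postgres:16'
}

# Usage:
#   thin_clone 'dblab_pool/data' 'snap1' 'dblab_pool/clone1' '6000'
```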
Container images for the _Community_ edition are available at <https://gitlab.com/postgres-ai/custom-images>.<br/>
Specialized images, available only for the _Standard_ and _Enterprise_ editions, are at
<https://gitlab.com/postgres-ai/se-images/container_registry/>.
## Further readings
- [Website]
- [Codebase]
- [Documentation]
- [`dblab`][dblab]
### Sources
- [DeepWiki][deepwiki postgres-ai/database-lab-engine]
- [Database Lab Engine configuration reference]
- [Installation guide for DBLab Community Edition][how to install dblab manually]
<!--
Reference
═╬═Time══
-->
<!-- In-article sections -->
<!-- Knowledge base -->
[dblab]: dblab.md
<!-- Files -->
<!-- Upstream -->
[database lab engine configuration reference]: https://postgres.ai/docs/reference-guides/database-lab-engine-configuration-reference
[Documentation]: https://postgres.ai/docs/
[how to install dblab manually]: https://postgres.ai/docs/how-to-guides/administration/install-dle-manually
[Codebase]: https://gitlab.com/postgres-ai/database-lab
[Website]: https://postgres.ai/
<!-- Others -->
[DeepWiki postgres-ai/database-lab-engine]: https://deepwiki.com/postgres-ai/database-lab-engine


@@ -1,6 +1,6 @@
# `dblab`
Database Lab Engine client CLI.
DBLab Engine's CLI client.
1. [TL;DR](#tldr)
1. [Further readings](#further-readings)
@@ -91,7 +91,7 @@ curl -X 'DELETE' 'https://dblab.company.com:1234/api/clone/smth' \
## Further readings
- [Database Lab]
- [DBLab engine]
- [Database Lab Client CLI reference (dblab)]
- [API reference]
@@ -107,11 +107,11 @@ curl -X 'DELETE' 'https://dblab.company.com:1234/api/clone/smth' \
<!-- In-article sections -->
<!-- Knowledge base -->
[database lab]: database%20lab.md
[DBLab engine]: dblab%20engine.md
<!-- Files -->
<!-- Upstream -->
[api reference]: https://dblab.readme.io/reference/
[API reference]: https://dblab.readme.io/reference/
[database lab client cli reference (dblab)]: https://postgres.ai/docs/reference-guides/dblab-client-cli-reference
[how to install and initialize database lab cli]: https://postgres.ai/docs/how-to-guides/cli/cli-install-init
[How to refresh data when working in the "logical" mode]: https://postgres.ai/docs/how-to-guides/administration/logical-full-refresh