diff --git a/knowledge base/cloud computing/aws/README.md b/knowledge base/cloud computing/aws/README.md index 426b60c..12320bf 100644 --- a/knowledge base/cloud computing/aws/README.md +++ b/knowledge base/cloud computing/aws/README.md @@ -30,8 +30,10 @@ 1. [Resource tagging](#resource-tagging) 1. [API](#api) 1. [Python](#python) +1. [Container images](#container-images) + 1. [Amazon Linux](#amazon-linux) 1. [Further readings](#further-readings) - 1. [Sources](#sources) + 1. [Sources](#sources) ## TL;DR @@ -868,6 +870,21 @@ machine if not. +## Container images + +### Amazon Linux + +Refer [Pulling the Amazon Linux container image]. + +Amazon Linux container images are **infamous** for having issues when connecting to their package repositories from +**outside** of AWS' network.
+While the container can _sometimes™_ connect to them when running locally, one gets much easier and more consistent results by +just running it from **inside** AWS. + +When running the container locally, disconnect from the VPN, start the container, and reconnect to the VPN before +installing packages.
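A sketch of that local dance, assuming Docker and a generic VPN client (the `vpn` commands and the image tag are placeholders, not real tooling):

```sh
# Placeholder commands: substitute one's actual VPN client.
vpn disconnect

# Start the container while off the VPN, so it can reach the package repositories.
docker run --detach --name 'al2023' 'amazonlinux:2023' sleep infinity

# Reconnecting is safe once the container is up…
vpn connect

# …then install packages from inside it.
docker exec 'al2023' dnf install -y 'findutils'
```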
+If one can, prefer building the image from an EC2 instance. + ## Further readings - [Learn AWS] @@ -1001,11 +1018,13 @@ machine if not. [what is amazon vpc?]: https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html [what is aws config?]: https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html [what is aws global accelerator?]: https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html +[Pulling the Amazon Linux container image]: https://docs.aws.amazon.com/AmazonECR/latest/userguide/amazon_linux_container_image.html [a guide to tagging resources in aws]: https://medium.com/@staxmarketing/a-guide-to-tagging-resources-in-aws-8f4311afeb46 [automating dns-challenge based letsencrypt certificates with aws route 53]: https://johnrix.medium.com/automating-dns-challenge-based-letsencrypt-certificates-with-aws-route-53-8ba799dd207b [aws config tutorial by stephane maarek]: https://www.youtube.com/watch?v=qHdFoYSrUvk +[AWS Fundamentals Blog]: https://awsfundamentals.com/blog [aws savings plans vs. reserved instances: when to use each]: https://www.cloudzero.com/blog/savings-plans-vs-reserved-instances/ [date & time policy conditions at aws - 1-minute iam lesson]: https://www.youtube.com/watch?v=4wpKP1HLEXg [difference in boto3 between resource, client, and session?]: https://stackoverflow.com/questions/42809096/difference-in-boto3-between-resource-client-and-session @@ -1017,4 +1036,3 @@ machine if not. 
[using aws kms via the cli with a symmetric key]: https://nsmith.net/aws-kms-cli [VPC Endpoints: Secure and Direct Access to AWS Services]: https://awsfundamentals.com/blog/vpc-endpoints [What Is OIDC and Why Do We Need It?]: https://awsfundamentals.com/blog/oidc-introduction -[AWS Fundamentals Blog]: https://awsfundamentals.com/blog diff --git a/knowledge base/database lab.md b/knowledge base/database lab.md deleted file mode 100644 index 9bae7b2..0000000 --- a/knowledge base/database lab.md +++ /dev/null @@ -1,93 +0,0 @@ -# Database Lab - -Database Lab Engine is an open-source platform developed by Postgres.ai to create instant, full-size clones of -production databases.
-Use cases of the clones are to test database migrations, optimize SQL, or deploy full-size staging apps. - -The website hosts the SaaS version of the Database Lab Engine. - -Configuration file examples are available at . - -1. [Engine](#engine) -1. [Clones](#clones) -1. [Further readings](#further-readings) - 1. [Sources](#sources) - -## Engine - -Config file in YAML format, at `~/.dblab/engine/configs/server.yml` by default. - -Metadata files at `~/.dblab/engine/meta` by default. The metadata folder **must be writable**. - -```sh -# Reload the configuration without downtime. -docker exec -it 'dblab_server' kill -SIGHUP 1 - -# Follow logs. -docker logs --since '1m' -f 'dblab_server' -docker logs --since '2024-05-01' -f 'dblab_server' -docker logs --since '2024-08-01T23:11:35' -f 'dblab_server' -``` - -Images for the _Standard_ and _Enterprise_ editions are available at -.
-Images for the _Community_ edition are available at . - -## Clones - -Database clones comes in two flavours: - -- _Thick_ cloning: the regular way to copy data.
- It is also how data is copied to Database Lab the first time a source is added. - - Thick clones can be: - - - _Logical_: do a regular dump and restore using `pg_dump` and `pg_restore`. - - _Physical_: done using `pg_basebackup` or restoring data from physical archives created by backup tools such as - WAL-E/WAL-G, Barman, pgBackRest, or pg_probackup. - - > Managed PostgreSQL databases in cloud environments (e.g.: AWS RDS) support only the logical clone type. - - The Engine supports continuous synchronization with the source databases.
- Achieved by repeating the thick cloning method one initially used for the source. - -- _Thin_ cloning: local containerized database clones based on CoW (Copy-on-Write) spin up in few seconds.
- They share most of the data blocks, but logically they look fully independent.
- The speed of thin cloning does **not** depend on the database size. - - As of 2024-06, Database Lab Engine supports ZFS and LVM for thin cloning.
- With ZFS, the Engine periodically creates a new snapshot of the data directory and maintains a set of snapshots. When - requesting a new clone, users choose which snapshot to use as base. - -Clone DBs configuration starting point is at `~/.dblab/postgres_conf/postgresql.conf`. - -## Further readings - -- [Website] -- [Main repository] -- [Documentation] -- [`dblab`][dblab] -- [Installation guide for DBLab Community Edition][how to install dblab manually] - -### Sources - -- [Database Lab Engine configuration reference] - - - - - -[dblab]: dblab.md - - - -[database lab engine configuration reference]: https://postgres.ai/docs/reference-guides/database-lab-engine-configuration-reference -[documentation]: https://postgres.ai/docs/ -[how to install dblab manually]: https://postgres.ai/docs/how-to-guides/administration/install-dle-manually -[main repository]: https://gitlab.com/postgres-ai/database-lab -[website]: https://postgres.ai/ - - diff --git a/knowledge base/dblab engine.md b/knowledge base/dblab engine.md new file mode 100644 index 0000000..204b397 --- /dev/null +++ b/knowledge base/dblab engine.md @@ -0,0 +1,123 @@ +# DBLab engine + +Creates **instant**, **full-size** clones of PostgreSQL databases.
+Mainly used to test database migrations, optimize SQL, or deploy full-size staging apps. + +Can be self-hosted.
+The [website] hosts the SaaS version. + +1. [TL;DR](#tldr) +1. [Further readings](#further-readings) + 1. [Sources](#sources) + +## TL;DR + +It leverages thin clones to provide full-sized database environments in seconds, regardless of the source database's +size.
+It relies on copy-on-write (CoW) filesystem technologies (currently ZFS or LVM) to provide efficient storage and +provisioning for database clones. + +Relies on Docker containers to isolate and run PostgreSQL instances for each clone.
+Each clone gets its own network port. + +The _Retrieval Service_ acquires data from source PostgreSQL databases and prepares it for cloning.
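Reaching two clones served by the same engine thus only differs in the port; for example (host, ports, and credentials below are illustrative):

```sh
# Each clone is a full PostgreSQL instance listening on its own port.
psql 'host=dblab.example.org port=6000 user=tester dbname=app' -c 'SELECT 1;'
psql 'host=dblab.example.org port=6001 user=tester dbname=app' -c 'SELECT 1;'
```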
+It supports: + +- **Physical** retrieval, by using physical backup methods like `pg_basebackup`, WAL-G, or `pgBackRest` to copy the + entire `PGDATA` directory. +- **Logical** retrieval, by using logical dump and restore tools like `pg_dump` and `pg_restore` to copy database + objects and data. + +> [!important] +> Managed PostgreSQL databases in cloud environments (e.g.: AWS RDS) support only logical synchronization. + +The _Pool Manager_ manages storage pools and filesystem operations.
+It abstracts the underlying filesystem (ZFS or LVM) and provides a consistent interface for snapshot and clone +operations.
+It supports different pools, each with its own **independent** configuration and filesystem manager. + +The _Provisioner_ manages the resources needed to run database clones and handles their lifecycle.
+It creates and manages PostgreSQL instances by allocating network ports to them from a pool, creating and managing the +containers they run on, mounting filesystem clones for them to use, and configuring them. + +The _Cloning Service_ orchestrates the overall process of creating and managing database clones by coordinating the +Provisioner and Pool Manager to fulfill cloning requests from clients. + +The _API Server_ exposes HTTP endpoints for interactions by providing RESTful APIs that allow creating and managing +clones, viewing snapshots, and monitoring systems' status. + +Database Lab Engine uses a YAML-based configuration file, which is loaded at startup and **can be reloaded at +runtime**.
+It is located at `~/.dblab/engine/configs/server.yml` by default. + +Metadata files are located at `~/.dblab/engine/meta` by default.
+The metadata folder **must be writable**. + +```sh +# Reload the configuration without downtime. +docker exec -it 'dblab_server' kill -SIGHUP 1 + +# Follow logs. +docker logs --since '1m' -f 'dblab_server' +docker logs --since '2024-05-01' -f 'dblab_server' +docker logs --since '2024-08-01T23:11:35' -f 'dblab_server' +``` + +Before DLE can create thin clones, it must first obtain a **full** copy of the source database.
+The initial data retrieval process is also referred to as _thick cloning_, and is typically a one-time or a scheduled +operation. + +Each clone runs in its own PostgreSQL container, and its configuration can be customized.
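In the logical mode, that first full copy boils down to a dump-and-restore cycle along these lines (hosts, users, and database names are placeholders):

```sh
# Dump the source database in the custom format…
pg_dump --host 'source.example.org' --username 'postgres' --format 'custom' \
  --file '/tmp/app.dump' 'app'

# …then restore it into the engine's PostgreSQL instance.
pg_restore --host 'localhost' --username 'postgres' --dbname 'app' \
  --clean --if-exists '/tmp/app.dump'
```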
+The starting point for clones' database configuration is `~/.dblab/postgres_conf/postgresql.conf`. + +Database clones come as _thick_ or _thin_ clones. + +Thick clones work as a normal replica would, **continuously** synchronizing with their source database. + +Thin clones: + +1. Prompt the creation of a dedicated filesystem snapshot. +1. Spin up a local database container that mounts that snapshot as a volume. + +The creation speed of thin clones does **not** depend on the database's size. + +When thin clones are involved, DLE **periodically** creates a new snapshot from the source database, and maintains a +set of them.
+When requesting a new clone, users choose which snapshot to use as its base. + +Container images for the _Community_ edition are available at .
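With the `dblab` CLI this looks something like the following (IDs and credentials are illustrative; check the client's reference for the exact flags):

```sh
# List the snapshots the engine maintains.
dblab snapshot list

# Create a clone based on a specific snapshot.
dblab clone create --username 'tester' --password 'secret' \
  --id 'my-clone' --snapshot-id 'dblab_pool@snapshot_20240801120000'
```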
+Specialized images for only the _Standard_ and _Enterprise_ editions are available at +. + +## Further readings + +- [Website] +- [Codebase] +- [Documentation] +- [`dblab`][dblab] + +### Sources + +- [DeepWiki][deepwiki postgres-ai/database-lab-engine] +- [Database Lab Engine configuration reference] +- [Installation guide for DBLab Community Edition][how to install dblab manually] + + + + + +[dblab]: dblab.md + + + +[database lab engine configuration reference]: https://postgres.ai/docs/reference-guides/database-lab-engine-configuration-reference +[Documentation]: https://postgres.ai/docs/ +[how to install dblab manually]: https://postgres.ai/docs/how-to-guides/administration/install-dle-manually +[Codebase]: https://gitlab.com/postgres-ai/database-lab +[Website]: https://postgres.ai/ + + +[DeepWiki postgres-ai/database-lab-engine]: https://deepwiki.com/postgres-ai/database-lab-engine diff --git a/knowledge base/dblab.md b/knowledge base/dblab.md index 4f51b2f..df42ee2 100644 --- a/knowledge base/dblab.md +++ b/knowledge base/dblab.md @@ -1,6 +1,6 @@ # `dblab` -Database Lab Engine client CLI. +DBLab Engine's CLI client. 1. [TL;DR](#tldr) 1. 
[Further readings](#further-readings) @@ -91,7 +91,7 @@ curl -X 'DELETE' 'https://dblab.company.com:1234/api/clone/smth' \ ## Further readings -- [Database Lab] +- [DBLab engine] - [Database Lab Client CLI reference (dblab)] - [API reference] @@ -107,11 +107,11 @@ curl -X 'DELETE' 'https://dblab.company.com:1234/api/clone/smth' \ -[database lab]: database%20lab.md +[DBLab engine]: dblab%20engine.md -[api reference]: https://dblab.readme.io/reference/ +[API reference]: https://dblab.readme.io/reference/ [database lab client cli reference (dblab)]: https://postgres.ai/docs/reference-guides/dblab-client-cli-reference [how to install and initialize database lab cli]: https://postgres.ai/docs/how-to-guides/cli/cli-install-init [How to refresh data when working in the "logical" mode]: https://postgres.ai/docs/how-to-guides/administration/logical-full-refresh diff --git a/snippets/ansible/tasks.yml b/snippets/ansible/tasks.yml index 51bf971..c9731a0 100644 --- a/snippets/ansible/tasks.yml +++ b/snippets/ansible/tasks.yml @@ -805,7 +805,7 @@ | reject('match', '^CREATE ROLE ' + master_username) | reject('match', '.*rdsadmin.*') | reject('match', '^(CREATE|ALTER) ROLE rds_') - | map('regex_replace', '(NO)(SUPERUSER|REPLICATION)\s?', '') + | map('regex_replace', '(\s+(NO)?(SUPERUSER|REPLICATION))?', '') }} - name: Wait for pending changes to be applied amazon.aws.rds_instance_info: diff --git a/snippets/ansible/tasks/manipulate data.yml b/snippets/ansible/tasks/manipulate data.yml index fd43ca1..fa27da7 100644 --- a/snippets/ansible/tasks/manipulate data.yml +++ b/snippets/ansible/tasks/manipulate data.yml @@ -256,7 +256,7 @@ | reject('match', '^CREATE ROLE ' + master_username) | reject('match', '.*rdsadmin.*') | reject('match', '^(CREATE|ALTER) ROLE rds_') - | map('regex_replace', '(NO)(SUPERUSER|REPLICATION)\s?', '') + | map('regex_replace', '(\s+(NO)?(SUPERUSER|REPLICATION))?', '') }} - name: Manipulate numbers diff --git a/snippets/dblab.fish b/snippets/dblab.fish index 
a9aff24..9582165 100644 --- a/snippets/dblab.fish +++ b/snippets/dblab.fish @@ -4,9 +4,11 @@ dblab --url 'http://dblab.example.org:1234/' --token "$(gopass show -o 'dblab')" … # Check logs +# Only available from the server hosting the engine docker logs --since '5m' -f 'dblab_server' # Reload the configuration +# Only available from the server hosting the engine docker exec -it 'dblab_server' kill -SIGHUP '1' # Check the running container's version @@ -83,7 +85,7 @@ curl 'https://dblab.example.org:1234/clone/some-clone' -H "Verification-Token: $ curl 'https://dblab.example.org:1234/api/clone/some-clone' -H "Verification-Token: $(gopass show -o 'dblab')" # Restart clones -# Only doable from the instance +# Only available from the server hosting the engine docker restart 'dblab_clone_6000' # Reset clones @@ -111,7 +113,14 @@ curl -X 'PATCH' 'https://dblab.example.org:1234/api/clone/some-clone' \ # Delete clones dblab clone destroy 'some-clone' -curl -X 'DELETE' 'https://dblab.example.org:1234/api/clone/some-clone' -H "Verification-Token: $(gopass show -o 'dblab')" +curl -X 'DELETE' 'https://dblab.example.org:1234/api/clone/some-clone' \ + -H "Verification-Token: $(gopass show -o 'dblab')" # Get admin config in YAML format curl 'https://dblab.example.org:1234/api/admin/config.yaml' -H "Verification-Token: $(gopass show -o 'dblab')" + +# Display the engine's status +dblab instance status + +# Display the engine's version +dblab instance version diff --git a/snippets/postgres/primer.sql b/snippets/postgres/primer.sql index bcb2dd9..f7a43c1 100644 --- a/snippets/postgres/primer.sql +++ b/snippets/postgres/primer.sql @@ -56,6 +56,10 @@ ALTER DATABASE reviser SET pgaudit.log TO none; \c sales \connect vendor +-- Get databases' size +SELECT pg_database_size('postgres'); +SELECT pg_size_pretty(pg_database_size('postgres')); + -- List schemas \dn @@ -91,6 +95,10 @@ CREATE TABLE people ( \d+ clients SELECT column_name, data_type, character_maximum_length FROM 
information_schema.columns WHERE table_name = 'vendors'; +-- Get tables' size +SELECT pg_relation_size('vendors'); +SELECT pg_size_pretty(pg_relation_size('vendors')); + +-- Insert data INSERT INTO people(id, first_name, last_name, phone) diff --git a/snippets/zfs.fish b/snippets/zfs.fish index 8bdc5c6..f72c77d 100644 --- a/snippets/zfs.fish +++ b/snippets/zfs.fish @@ -1,5 +1,63 @@ #!/usr/bin/env fish +### +# Pools +# -------------------------------------- +### + +# List available pools +zpool list + +# Show pools' I/O statistics +zpool iostat + +# Show pools' configuration and status +zpool status + +# List all pools available for import +zpool import + +# Import pools +zpool import -a +zpool import -d '/dev/disk/by-id' +zpool import 'vault' +zpool import 'tank' -N +zpool import 'encrypted_pool_name' -l + +# Get pools' properties +zpool get all 'vault' + +# Set pools' properties +zpool set 'autoexpand=on' 'tank' + +# Get info about pools' features +man zpool-features + +# Show the history of a pool's operations +zpool history 'tank' + +# Check pools for errors +# Very CPU *and* disk intensive +zpool scrub 'tank' + +# Export pools +# Unmounts *all* filesystems in any given pool +zpool export 'vault' +zpool export -f 'vault' + +# Destroy pools +zpool destroy 'tank' + +# Restore destroyed pools +# Pools can only be reimported right after the destroy command has been issued +zpool import -D + +# Check pool configuration +zdb -C 'vault' + +# Display the predicted effect of enabling deduplication +zdb -S 'rpool' + ### # File Systems # -------------------------------------- @@ -9,6 +67,23 @@ # List available datasets zfs list +# Mount all available datasets, loading encryption keys as needed +# Find a dataset's mountpoint's root path via `zfs get mountpoint 'pool_name'` +zfs mount -alv + +# Unmount datasets +zfs unmount 'tank/media' + +# Create filesystems and volumes +zfs create 'tank/docs' +zfs create -V '1gb' 'vault/good_memories' + # List snapshots zfs list -t 'all' zfs list -t 'snapshot,volume,bookmark' + +# 
Create snapshots +zfs snapshot 'vault/good_memories@2024-12-31' + +# Check key parameters are fine +zfs get -r checksum,compression,readonly,canmount 'tank'
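# Snapshots can also be promoted to writable datasets or rolled back to;
# this is the CoW mechanism thin-cloning tools build upon.

# Create writable clones from snapshots
zfs clone 'vault/good_memories@2024-12-31' 'vault/memories_copy'

# Roll datasets back to a snapshot
# Discards all changes made after that snapshot
zfs rollback 'vault/good_memories@2024-12-31'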