Mirror of https://gitea.com/mcereda/oam.git, synced 2026-02-08 21:34:25 +00:00
docs(peerdb): expand mirrors
@@ -8,6 +8,8 @@ MD013: # line-length
 tables: false
 code_blocks: false
 severity: warning
+MD028: # no-blanks-blockquote
+false
 MD033: # no-inline-html
 allowed_elements:
 - b
@@ -108,8 +108,35 @@ Mirrors can be in the following states:
 | Terminated | `STATUS_TERMINATED` | The mirror has been deleted/terminated |
 | Unknown | `STATUS_UNKNOWN` | The mirror is not found in PeerDB's catalog, or its status cannot be obtained due to some other issue |

+Existing mirrors can be edited, but **must** be paused beforehand.<br/>
+There are checks in place that will make operations that would edit a mirror fail, if the mirror is not in the
+`STATUS_PAUSED` state.
+
+> [!warning]
+> Once a mirror is created, one **will not** be able to change some of the mirror's settings, like whether to do an
+> initial snapshot or not.
+>
+> Some other parameters, like the number of tables or max workers for initial snapshots, will **only** be configurable
+> via the API (and **not** via the UI).
+
 Mirrors using _PostgreSQL_ peers as sources create [replication slots] in the source DB to get changes from.

+During mirrors' initial snapshots, PeerDB creates at least one worker per table (`snapshot_max_parallel_workers` times
+`snapshot_num_tables_in_parallel`).
+
+> [!caution]
+> Newly created mirrors **will start replication right away**.\
+> This _usually_ means taking a snapshot of the tables from the source. While in this state, a mirror **cannot be
+> paused**.
+
+> [!note]
+> It looks like partitions are allocated to workers at the start of the process, which results in a slow worker lagging
+> behind while the rest of the workers for that table already finished.
+
+> [!tip]
+> When dealing with lots of data, prefer starting by adding tables one at a time (with the bigger tables first), then
+> add them in bigger and bigger batches.
+
 Operations:

 <details style="padding: 0 0 0 1rem">
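Review note on the hunk above: the added text gives the snapshot worker count as `snapshot_max_parallel_workers` times `snapshot_num_tables_in_parallel`, and suggests adding tables biggest first in growing batches. The sketch below only illustrates that arithmetic and ordering. It reads the product as an upper bound on concurrent snapshot workers, which is an assumption. `SnapshotSettings`, `plan_batches`, the `growth` factor and the sample table names are hypothetical and not part of PeerDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotSettings:
    # Defaults copied from the field table in this commit.
    snapshot_max_parallel_workers: int = 4
    snapshot_num_tables_in_parallel: int = 1

    def concurrent_workers(self) -> int:
        """Assumed upper bound: workers per table times tables snapshotted in parallel."""
        return self.snapshot_max_parallel_workers * self.snapshot_num_tables_in_parallel


def plan_batches(table_sizes: dict[str, int], growth: int = 2) -> list[list[str]]:
    """Order tables biggest first, then group them into batches of growing size.

    The first batch holds a single table; each later batch is `growth` times
    larger than the previous one (hypothetical policy, not a PeerDB feature).
    """
    if growth < 1:
        raise ValueError("growth must be at least 1")
    # Sort by size descending, breaking ties by name for a stable plan.
    ordered = sorted(table_sizes, key=lambda name: (-table_sizes[name], name))
    batches: list[list[str]] = []
    size, start = 1, 0
    while start < len(ordered):
        batches.append(ordered[start:start + size])
        start += size
        size *= growth
    return batches


if __name__ == "__main__":
    print(SnapshotSettings().concurrent_workers())
    sizes = {"public.events": 900, "public.users": 50, "public.orders": 400, "public.tags": 5}
    for batch in plan_batches(sizes):
        print(batch)
```

Each batch would then be added to the mirror in turn, which requires the mirror to be paused first, as the hunk states.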
@@ -124,30 +151,6 @@ GET /api/v1/mirrors/list
 <details style="padding: 0 0 0 1rem">
 <summary>Create</summary>

-| Field | Type | Required | Default | Notes |
-| --------------------------------------------- | --------------- | -------- | -------------------- | ------------------------------------------------ |
-| `flow_job_name` | string | yes | | name of the mirror |
-| `source_name` | string | yes | | name of the source peer |
-| `destination_name` | string | yes | | name of the destination peer |
-| `table_mappings` | array | yes | | |
-| `table_mappings.source_table_identifier` | string | yes | | source schema and table |
-| `table_mappings.destination_table_identifier` | string | yes | | destination schema and table |
-| `table_mappings.exclude` | list of strings | no | [] | columns excluded from the sync |
-| `table_mappings.columns` | list of objects | no | [] | ordering setting; for ClickHouse only |
-| `table_mappings.columns.name` | string | yes | | name of the column |
-| `table_mappings.columns.ordering` | number | yes | | rank of the column |
-| `idle_timeout_seconds` | number | no | 60 | |
-| `publication_name` | string | no | | will be created if not provided |
-| `max_batch_size` | number | no | 1000000 | |
-| `do_initial_snapshot` | boolean | yes | | |
-| `snapshot_num_rows_per_partition` | number | no | 1000000 | only used for the initial snapshot |
-| `snapshot_max_parallel_workers` | number | no | 4 | only used for the initial snapshot |
-| `snapshot_num_tables_in_parallel` | number | no | 1 | only used for the initial snapshot |
-| `resync` | boolean | no | false | the mirror **must be dropped** before re-syncing |
-| `initial_snapshot_only` | boolean | no | false | |
-| `soft_delete_col_name` | string | no | `_PEERDB_IS_DELETED` | |
-| `synced_at_col_name` | string | no | `_PEERDB_SYNCED_AT` | |
-
 ```sql
 CREATE MIRROR IF NOT EXISTS some_cdc_mirror
 FROM main_pg TO snowflake_prod -- FROM source_peer TO target_peer
@@ -192,6 +195,30 @@ POST /api/v1/flows/cdc/create
 }'
 ```

+| Field | Type | Required | Default | Notes |
+| --------------------------------------------- | --------------- | -------- | -------------------- | ------------------------------------------------ |
+| `flow_job_name` | string | yes | | name of the mirror |
+| `source_name` | string | yes | | name of the source peer |
+| `destination_name` | string | yes | | name of the destination peer |
+| `table_mappings` | array | yes | | |
+| `table_mappings.source_table_identifier` | string | yes | | source schema and table |
+| `table_mappings.destination_table_identifier` | string | yes | | destination schema and table |
+| `table_mappings.exclude` | list of strings | no | [] | columns excluded from the sync |
+| `table_mappings.columns` | list of objects | no | [] | ordering setting; for ClickHouse only |
+| `table_mappings.columns.name` | string | yes | | name of the column |
+| `table_mappings.columns.ordering` | number | yes | | rank of the column |
+| `idle_timeout_seconds` | number | no | 60 | |
+| `publication_name` | string | no | | will be created if not provided |
+| `max_batch_size` | number | no | 1000000 | |
+| `do_initial_snapshot` | boolean | yes | | |
+| `snapshot_num_rows_per_partition` | number | no | 1000000 | only used for the initial snapshot |
+| `snapshot_max_parallel_workers` | number | no | 4 | only used for the initial snapshot |
+| `snapshot_num_tables_in_parallel` | number | no | 1 | only used for the initial snapshot |
+| `resync` | boolean | no | false | the mirror **must be dropped** before re-syncing |
+| `initial_snapshot_only` | boolean | no | false | |
+| `soft_delete_col_name` | string | no | `_PEERDB_IS_DELETED` | |
+| `synced_at_col_name` | string | no | `_PEERDB_SYNCED_AT` | |
+
 </details>

 <details style="padding: 0 0 0 1rem">
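Review note on the hunk above: the relocated field table lists which settings are required and which have defaults. A minimal sketch of applying those rules before calling the API might look like the following. The helper name `build_mirror_settings` and the sample values are hypothetical. How the resulting object is wrapped in the `POST /api/v1/flows/cdc/create` request body is not shown in this excerpt, so the sketch stops at the flat settings dictionary.

```python
import json

# Top-level fields marked "yes" in the Required column of the table.
REQUIRED = (
    "flow_job_name",
    "source_name",
    "destination_name",
    "table_mappings",
    "do_initial_snapshot",
)

# Fields with a concrete value in the Default column of the table.
DEFAULTS = {
    "idle_timeout_seconds": 60,
    "max_batch_size": 1000000,
    "snapshot_num_rows_per_partition": 1000000,
    "snapshot_max_parallel_workers": 4,
    "snapshot_num_tables_in_parallel": 1,
    "resync": False,
    "initial_snapshot_only": False,
    "soft_delete_col_name": "_PEERDB_IS_DELETED",
    "synced_at_col_name": "_PEERDB_SYNCED_AT",
}


def build_mirror_settings(**fields):
    """Check required fields, then layer the caller's values over the defaults."""
    missing = [name for name in REQUIRED if name not in fields]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    for mapping in fields["table_mappings"]:
        for key in ("source_table_identifier", "destination_table_identifier"):
            if not mapping.get(key):
                raise ValueError(f"table mapping is missing {key}")
    return {**DEFAULTS, **fields}


if __name__ == "__main__":
    settings = build_mirror_settings(
        flow_job_name="some_cdc_mirror",
        source_name="main_pg",
        destination_name="snowflake_prod",
        table_mappings=[
            {
                "source_table_identifier": "public.users",
                "destination_table_identifier": "public.users",
            }
        ],
        do_initial_snapshot=True,
        snapshot_num_tables_in_parallel=2,
    )
    print(json.dumps(settings, indent=2))
```

Checking `do_initial_snapshot` by presence rather than truthiness matters here, because `false` is a valid value for a required boolean.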
@@ -347,6 +374,7 @@ WITH (
 | Kafka | `9` | `kafka_config` |
 | PostgreSQL | `3` or `'POSTGRES'` | `postgres_config` |

+> [!note]
 > The optional `"allow_update": true` attribute in the API seems to do **absolutely nothing** as of the time of writing.

 ```plaintext