CDC Configuration
Install
See the Install guide for the full setup, including Windows PowerShell.
curl -fsSL https://install.skippr.io/install.sh | sh
Installing Skippr means accepting the Skippr EULA.
CDC is configured via a top-level cdc: block in skippr.yaml, alongside cdc_enabled: true on the source connector. The cdc.default contract applies to tables discovered at runtime, and cdc.namespaces can override it for tables that use different business keys.
This page describes configuration. For the semantic contract, see CDC Guarantees.
The cdc: block
cdc:
  default:
    business_key_columns:
      - id
  namespaces:
    public.orders:
      business_key_columns:
        - order_id

| Field | Required | Description |
|---|---|---|
| default.business_key_columns | Yes for final-state CDC | Default columns that uniquely identify a row for dynamically discovered namespaces/tables. |
| namespaces.<name>.business_key_columns | No | Override business keys for one namespace/table when it differs from the default. |
| namespaces.<name>.null_key_policy | No | Controls how null business keys are handled for that namespace. Final-state sinks reject null keys unless explicitly documented otherwise. |
Precedence is deterministic: a namespace/table override is used first, and cdc.default applies only when no override exists. This lets Skippr discover namespaces on the fly while still supporting per-table contracts.
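The precedence rule can be sketched in a few lines of Python. This is only an illustration of the documented lookup order, not Skippr's implementation: `resolve_business_keys` is a hypothetical helper, and the `cdc:` block is represented as a plain dict.

```python
from typing import Any


def resolve_business_keys(cdc: dict[str, Any], namespace: str) -> list[str]:
    """Return the business key columns for one namespace/table.

    Hypothetical helper: namespace override first, then cdc.default.
    """
    override = (cdc.get("namespaces") or {}).get(namespace) or {}
    if override.get("business_key_columns"):
        return list(override["business_key_columns"])
    default = cdc.get("default") or {}
    return list(default.get("business_key_columns") or [])


# The cdc: block from this page, written as a plain dict.
cdc_block = {
    "default": {"business_key_columns": ["id"]},
    "namespaces": {"public.orders": {"business_key_columns": ["order_id"]}},
}

print(resolve_business_keys(cdc_block, "public.orders"))     # ['order_id']
print(resolve_business_keys(cdc_block, "public.customers"))  # ['id']
```

A table discovered at runtime with no entry under namespaces falls through to the default, which is why the default contract matters for dynamically discovered tables.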
Automatic guarantee inference
Skippr automatically determines the most complete CDC contract your source and destination pair supports. You do not need to specify a guarantee level manually; the system derives it at startup and applies it throughout the run.
Exactly-once final state
When both the source and sink support full CDC reconciliation, Skippr uses exactly-once final-state semantics:
- Inserts, updates, and deletes are applied via MERGE with order-token guards
- Stale writes are rejected
- Deletes are tracked in tombstone tables to prevent ghost resurrections
- cdc.default.business_key_columns or a namespace override is required: Skippr will error at startup if final-state CDC has no business key contract
This guarantee is about the final table state after replay and restart. Internal retries can still happen inside the pipeline.
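To see why order-token guards and tombstones yield a stable final state under replay, consider this in-memory sketch. It is a simplified model, not the MERGE statement Skippr issues: the `FinalStateTable` class is hypothetical, order tokens are assumed to be non-negative integers that increase with source order, and the mutation kind strings are illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FinalStateTable:
    """In-memory stand-in for a sink table plus its tombstone table."""

    rows: dict = field(default_factory=dict)        # key -> (order_token, row)
    tombstones: dict = field(default_factory=dict)  # key -> order_token of the delete

    def apply(self, kind: str, key: tuple, token: int,
              row: Optional[dict] = None) -> bool:
        """Apply one change event; return False when it is rejected as stale."""
        row_token = self.rows[key][0] if key in self.rows else -1
        last_seen = max(row_token, self.tombstones.get(key, -1))
        if token <= last_seen:
            return False  # stale or replayed write: leave the state untouched
        if kind == "delete":
            self.rows.pop(key, None)
            self.tombstones[key] = token  # remember the delete to block resurrection
        else:
            self.rows[key] = (token, row)
            self.tombstones.pop(key, None)  # a newer write supersedes the tombstone
        return True


table = FinalStateTable()
table.apply("insert", (1,), 10, {"id": 1, "status": "new"})
table.apply("delete", (1,), 20)
# A late update with an older token must not bring the row back.
resurrected = table.apply("update", (1,), 15, {"id": 1, "status": "paid"})
print(resurrected, table.rows)  # False {}
```

Because events with a token at or below the last one seen are ignored, replaying the same batch after a restart leaves the table unchanged, which matches the point that retries can occur internally without affecting the final state.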
CDC-encoded
When the destination cannot perform full final-state reconciliation but can faithfully land CDC payloads, Skippr writes events with their mutation metadata (_skippr_mutation_kind, _skippr_order_token) as an append-only change log. Downstream consumers can process this log independently.
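A downstream consumer could reduce such a change log to current state along these lines. This is a sketch under assumptions: `fold_change_log` is a hypothetical function, the mutation kind values (`insert`, `update`, `delete`) and integer order tokens are illustrative, and delete events are assumed to carry the key columns. Only the two metadata column names come from this page.

```python
def fold_change_log(events, key_columns):
    """Reduce an append-only change log to the latest live row per key."""
    latest = {}
    for event in events:
        key = tuple(event[col] for col in key_columns)
        current = latest.get(key)
        # Keep only the event with the highest order token for each key.
        if current is None or event["_skippr_order_token"] > current["_skippr_order_token"]:
            latest[key] = event
    # Drop keys whose latest event is a delete, and strip the metadata columns.
    return {
        key: {k: v for k, v in event.items() if not k.startswith("_skippr_")}
        for key, event in latest.items()
        if event["_skippr_mutation_kind"] != "delete"
    }


log = [
    {"order_id": 7, "status": "new", "_skippr_mutation_kind": "insert", "_skippr_order_token": 1},
    {"order_id": 7, "status": "paid", "_skippr_mutation_kind": "update", "_skippr_order_token": 3},
    {"order_id": 7, "status": "pending", "_skippr_mutation_kind": "update", "_skippr_order_token": 2},
    {"order_id": 8, "status": "new", "_skippr_mutation_kind": "insert", "_skippr_order_token": 4},
    {"order_id": 8, "_skippr_mutation_kind": "delete", "_skippr_order_token": 5},
]

state = fold_change_log(log, ["order_id"])
print(state)  # {(7,): {'order_id': 7, 'status': 'paid'}}
```

The out-of-order event with token 2 does not overwrite the row written at token 3, and order 8 is absent because its latest event is a delete.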
Validation at startup
Skippr performs the following checks before starting a CDC pipeline:
- The source connector must support CDC (cdc_enabled: true is accepted)
- The destination connector must be capable of accepting CDC payloads
- If the pair supports exactly-once final state, the resolved business key columns must be non-empty
- Column names in the resolved business key contract must exist in the source schema/table
If any check fails, the pipeline exits with a descriptive error message before any data is read.
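The checks above can be sketched as a single validation pass that collects every failure before any data is read. The function name, parameters, and error wording here are hypothetical and chosen for illustration; Skippr's actual messages and internals may differ.

```python
def validate_cdc_pipeline(*, source_supports_cdc, sink_accepts_cdc,
                          final_state, business_keys, source_columns):
    """Return a list of error messages; an empty list means the checks passed."""
    errors = []
    if not source_supports_cdc:
        errors.append("source connector does not support CDC")
    if not sink_accepts_cdc:
        errors.append("destination connector cannot accept CDC payloads")
    if final_state and not business_keys:
        errors.append("final-state CDC requires non-empty business key columns")
    missing = [col for col in business_keys if col not in source_columns]
    if missing:
        errors.append(f"business key columns missing from source table: {missing}")
    return errors


# A default key of "id" resolved against a table that only has "order_id".
problems = validate_cdc_pipeline(
    source_supports_cdc=True,
    sink_accepts_cdc=True,
    final_state=True,
    business_keys=["id"],
    source_columns={"order_id", "status"},
)
for message in problems:
    print(message)
```

This case is the one a namespace override is meant to solve: declaring order_id as the business key for that table makes the check pass.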
Source/destination compatibility
All CDC-capable sources work with all warehouse destinations:
| Source | Destinations |
|---|---|
| PostgreSQL | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| MySQL | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| MongoDB | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| DynamoDB | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| Kafka (Debezium) | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
Non-CDC sources (e.g. S3, SFTP, HTTP) cannot set cdc_enabled: true. Skippr validates source/destination compatibility at startup and returns a clear error if the combination is unsupported.
Complete example
PostgreSQL CDC to Snowflake (exactly-once final state is inferred automatically):
project: pg_cdc_to_snowflake
source:
  kind: postgres
  host: db.example.com
  port: 5432
  user: replicator
  password: ${POSTGRES_PASSWORD}
  database: production
  cdc_enabled: true
warehouse:
  kind: snowflake
  database: ANALYTICS
  schema: RAW
  warehouse: COMPUTE_WH
  role: SKIPPR_ROLE
cdc:
  default:
    business_key_columns:
      - id

MySQL CDC to BigQuery:
project: mysql_cdc_to_bq
source:
  kind: mysql
  connection_string: mysql://replicator:${MYSQL_PASSWORD}@host:3306/ecommerce
  cdc_enabled: true
warehouse:
  kind: bigquery
  project: my-gcp-project
  dataset: raw
  location: US
cdc:
  default:
    business_key_columns:
      - id
  namespaces:
    ecommerce.orders:
      business_key_columns:
        - order_id

Kafka Debezium CDC to Redshift:
project: kafka_cdc_to_redshift
source:
  kind: kafka
  brokers: "kafka.example.com:9092"
  topic: dbserver1.public.orders
  cdc_enabled: true
warehouse:
  kind: redshift
  cluster_identifier: my-cluster
  database: analytics
  db_user: admin
  region: us-east-1
cdc:
  default:
    business_key_columns:
      - id
