CDC Configuration

Install

See the Install guide for the full setup, including Windows PowerShell.

curl -fsSL https://install.skippr.io/install.sh | sh

Installing Skippr means accepting the Skippr EULA.

CDC is configured via a top-level cdc: block in skippr.yaml, alongside cdc_enabled: true on the source connector. The cdc.default contract applies to tables discovered at runtime, and cdc.namespaces can override it for tables that use different business keys.

This page describes configuration. For the semantic contract, see CDC Guarantees.

The cdc: block

yaml
cdc:
  default:
    business_key_columns:
      - id
  namespaces:
    public.orders:
      business_key_columns:
        - order_id

| Field | Required | Description |
| --- | --- | --- |
| default.business_key_columns | Yes for final-state CDC | Default columns that uniquely identify a row for dynamically discovered namespaces/tables. |
| namespaces.<name>.business_key_columns | No | Override business keys for one namespace/table when it differs from the default. |
| namespaces.<name>.null_key_policy | No | Controls how null business keys are handled for that namespace. Final-state sinks reject null keys unless explicitly documented otherwise. |

Precedence is deterministic: a namespace/table override is applied first, and cdc.default is used when no override exists for that namespace. This lets Skippr discover namespaces at runtime while still supporting per-table contracts.
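
The precedence rule can be illustrated with a short Python sketch. This is not Skippr's implementation; the function name `resolve_business_keys` and the dictionary layout (a parsed copy of the cdc: block) are assumptions made for illustration.

```python
from typing import Any


def resolve_business_keys(cdc: dict[str, Any], namespace: str) -> list[str]:
    """Return the business key columns that apply to `namespace`.

    Hypothetical helper: a namespace/table override wins, otherwise the
    cdc.default contract is used.
    """
    override = (cdc.get("namespaces") or {}).get(namespace) or {}
    if "business_key_columns" in override:
        return list(override["business_key_columns"])
    default = cdc.get("default") or {}
    return list(default.get("business_key_columns", []))


# Parsed equivalent of the cdc: block shown above.
cdc_block = {
    "default": {"business_key_columns": ["id"]},
    "namespaces": {"public.orders": {"business_key_columns": ["order_id"]}},
}

print(resolve_business_keys(cdc_block, "public.orders"))     # ['order_id']
print(resolve_business_keys(cdc_block, "public.customers"))  # ['id']
```

A table discovered at runtime, such as public.customers here, falls through to the default without needing its own entry.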

Automatic guarantee inference

Skippr automatically determines the most complete CDC contract your source and destination pair supports. You do not need to specify a guarantee level manually; the system derives it at startup and applies it throughout the run.
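
As a rough mental model, the inference can be pictured as a function of connector capabilities that picks one of the two levels described in the sections below. The capability flags and the `Guarantee` enum in this Python sketch are hypothetical names, not Skippr settings or APIs.

```python
from enum import Enum


class Guarantee(Enum):
    EXACTLY_ONCE_FINAL_STATE = "exactly-once final state"
    CDC_ENCODED = "CDC-encoded"


def infer_guarantee(
    source_reconciles: bool,
    sink_reconciles: bool,
    sink_lands_cdc_payloads: bool,
) -> Guarantee:
    """Pick the most complete contract the pair supports (illustrative only)."""
    if source_reconciles and sink_reconciles:
        return Guarantee.EXACTLY_ONCE_FINAL_STATE
    if sink_lands_cdc_payloads:
        return Guarantee.CDC_ENCODED
    raise ValueError("destination cannot accept CDC payloads")


print(infer_guarantee(True, True, True).value)   # exactly-once final state
print(infer_guarantee(True, False, True).value)  # CDC-encoded
```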

Exactly-once final state

When both the source and sink support full CDC reconciliation, Skippr uses exactly-once final-state semantics:

  • Inserts, updates, and deletes are applied via MERGE with order-token guards
  • Stale writes are rejected
  • Deletes are tracked in tombstone tables to prevent ghost resurrections
  • cdc.default.business_key_columns or a namespace override is required: Skippr errors at startup if final-state CDC has no business key contract

This guarantee is about the final table state after replay and restart. Internal retries can still happen inside the pipeline.
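
To make the bullet points above concrete, here is a minimal in-memory Python model of an order-token guard with tombstones. It is a sketch under stated assumptions, not the SQL Skippr issues: order tokens are assumed to be non-negative integers that increase with source order, and `FinalStateTable` and its `apply` method are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FinalStateTable:
    rows: dict = field(default_factory=dict)        # key -> (order_token, row)
    tombstones: dict = field(default_factory=dict)  # key -> order_token of the delete

    def apply(self, kind: str, key: tuple, order_token: int,
              row: Optional[dict] = None) -> bool:
        """Apply one change event; return False if it is rejected as stale."""
        last_seen = max(
            self.rows.get(key, (-1, None))[0],
            self.tombstones.get(key, -1),
        )
        if order_token <= last_seen:
            return False  # stale or replayed write: leave state untouched
        if kind == "delete":
            self.rows.pop(key, None)
            self.tombstones[key] = order_token  # remember the delete
        else:  # insert or update
            self.rows[key] = (order_token, row)
            self.tombstones.pop(key, None)
        return True


table = FinalStateTable()
print(table.apply("insert", (1,), 10, {"id": 1, "status": "new"}))   # True
print(table.apply("delete", (1,), 20))                               # True
print(table.apply("update", (1,), 15, {"id": 1, "status": "paid"}))  # False
print(table.rows)                                                    # {}
```

The late update with token 15 is rejected because the tombstone records a delete at token 20, so the deleted row does not reappear. Replaying any of these events after a restart leaves the table in the same final state.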

CDC-encoded

When the destination cannot perform full final-state reconciliation but can faithfully land CDC payloads, Skippr writes events with their mutation metadata (_skippr_mutation_kind, _skippr_order_token) as an append-only change log. Downstream consumers can process this log independently.
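
A downstream consumer could fold such a change log into current state roughly as follows. This Python sketch assumes that _skippr_mutation_kind takes the values "insert", "update", and "delete", that _skippr_order_token is a comparable integer, and that delete events still carry the business key columns; none of these details are confirmed by this page, and `fold_change_log` is a hypothetical helper.

```python
def fold_change_log(events, key_columns):
    """Reduce an append-only change log to the latest live row per business key."""
    latest = {}
    for event in events:
        key = tuple(event[col] for col in key_columns)
        current = latest.get(key)
        # Keep only the event with the highest order token for each key.
        if current is None or event["_skippr_order_token"] > current["_skippr_order_token"]:
            latest[key] = event
    # Drop keys whose latest event is a delete, and strip the metadata columns.
    return {
        key: {col: val for col, val in event.items() if not col.startswith("_skippr_")}
        for key, event in latest.items()
        if event["_skippr_mutation_kind"] != "delete"
    }


log = [
    {"id": 1, "status": "new", "_skippr_mutation_kind": "insert", "_skippr_order_token": 1},
    {"id": 2, "status": "new", "_skippr_mutation_kind": "insert", "_skippr_order_token": 2},
    {"id": 1, "status": "paid", "_skippr_mutation_kind": "update", "_skippr_order_token": 3},
    {"id": 2, "status": "new", "_skippr_mutation_kind": "delete", "_skippr_order_token": 4},
    # A replayed copy of the first event; the older token loses.
    {"id": 1, "status": "new", "_skippr_mutation_kind": "insert", "_skippr_order_token": 1},
]

print(fold_change_log(log, ["id"]))  # {(1,): {'id': 1, 'status': 'paid'}}
```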

Validation at startup

Skippr performs the following checks before starting a CDC pipeline:

  1. The source connector must support CDC, that is, it must accept cdc_enabled: true
  2. The destination connector must be capable of accepting CDC payloads
  3. If the pair supports exactly-once final state, the resolved business key columns must be non-empty
  4. Column names in the resolved business key contract must exist in the source schema/table

If any check fails, the pipeline exits with a descriptive error message before any data is read.
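
The four checks can be read as a fail-fast sequence. The Python sketch below mirrors that order; `CdcConfigError`, `validate_cdc_pipeline`, and its parameters are hypothetical names chosen for illustration, and the error messages are not Skippr's actual output.

```python
class CdcConfigError(Exception):
    """Raised before any data is read when a CDC startup check fails."""


def validate_cdc_pipeline(*, source_supports_cdc, sink_accepts_cdc,
                          exactly_once, business_keys, source_columns):
    # 1. The source connector must support CDC.
    if not source_supports_cdc:
        raise CdcConfigError("source connector does not support cdc_enabled: true")
    # 2. The destination must be able to accept CDC payloads.
    if not sink_accepts_cdc:
        raise CdcConfigError("destination connector cannot accept CDC payloads")
    # 3. Exactly-once final state needs a non-empty business key contract.
    if exactly_once and not business_keys:
        raise CdcConfigError("final-state CDC requires business key columns")
    # 4. Every resolved business key column must exist in the source table.
    missing = [col for col in business_keys if col not in source_columns]
    if missing:
        raise CdcConfigError(f"business key columns not in source schema: {missing}")


try:
    validate_cdc_pipeline(
        source_supports_cdc=True,
        sink_accepts_cdc=True,
        exactly_once=True,
        business_keys=["order_id"],
        source_columns={"id", "status"},
    )
except CdcConfigError as err:
    print(err)  # business key columns not in source schema: ['order_id']
```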

Source/destination compatibility

All CDC-capable sources work with all warehouse destinations:

| Source | Destinations |
| --- | --- |
| PostgreSQL | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| MySQL | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| MongoDB | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| DynamoDB | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |
| Kafka (Debezium) | Snowflake, BigQuery, PostgreSQL, Redshift, ClickHouse, Databricks, Synapse, MotherDuck |

Non-CDC sources (e.g. S3, SFTP, HTTP) do not accept cdc_enabled: true. Skippr validates source/destination compatibility at startup and returns a clear error if the combination is unsupported.

Complete example

PostgreSQL CDC to Snowflake (exactly-once final state is inferred automatically):

yaml
project: pg_cdc_to_snowflake

source:
  kind: postgres
  host: db.example.com
  port: 5432
  user: replicator
  password: ${POSTGRES_PASSWORD}
  database: production
  cdc_enabled: true

warehouse:
  kind: snowflake
  database: ANALYTICS
  schema: RAW
  warehouse: COMPUTE_WH
  role: SKIPPR_ROLE

cdc:
  default:
    business_key_columns:
      - id

MySQL CDC to BigQuery:

yaml
project: mysql_cdc_to_bq

source:
  kind: mysql
  connection_string: mysql://replicator:${MYSQL_PASSWORD}@host:3306/ecommerce
  cdc_enabled: true

warehouse:
  kind: bigquery
  project: my-gcp-project
  dataset: raw
  location: US

cdc:
  default:
    business_key_columns:
      - id
  namespaces:
    ecommerce.orders:
      business_key_columns:
        - order_id

Kafka Debezium CDC to Redshift:

yaml
project: kafka_cdc_to_redshift

source:
  kind: kafka
  brokers: "kafka.example.com:9092"
  topic: dbserver1.public.orders
  cdc_enabled: true

warehouse:
  kind: redshift
  cluster_identifier: my-cluster
  database: analytics
  db_user: admin
  region: us-east-1

cdc:
  default:
    business_key_columns:
      - id
