
CDC Recipes

Use these recipes as starting points for final-state CDC pipelines.

Postgres to Iceberg

```yaml
pipelines:
  postgres_iceberg:
    data_source: data_sources.postgres
    data_sink: data_sinks.iceberg
    cdc:
      business_key_columns: [id]
```

MySQL to Iceberg

```yaml
pipelines:
  mysql_iceberg:
    data_source: data_sources.mysql
    data_sink: data_sinks.iceberg
    cdc:
      business_key_columns: [id]
```

DynamoDB to Iceberg

```yaml
pipelines:
  dynamodb_iceberg:
    data_source: data_sources.dynamodb
    data_sink: data_sinks.iceberg
    cdc:
      business_key_columns: [id]
```

Run skippr sync --pipeline <name> --once --output json in CI to produce structured events and fail fast on configuration issues.
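A CI job could wrap that command in a small script. The following is a minimal sketch, not an official integration: it assumes the CLI exits with a non-zero status when a configuration problem is found and that `--output json` writes one JSON event per line to standard output. Neither behavior is confirmed here, and the function names are hypothetical.

```python
import json
import subprocess
import sys


def build_sync_command(pipeline: str) -> list[str]:
    """Build the one-shot sync command using the flags from the recipe text."""
    return ["skippr", "sync", "--pipeline", pipeline, "--once", "--output", "json"]


def parse_events(stdout: str) -> list[dict]:
    """Parse the output as JSON Lines (an assumed format), skipping blank lines."""
    return [json.loads(line) for line in stdout.splitlines() if line.strip()]


def run_check(pipeline: str) -> list[dict]:
    """Run the sync once and stop the CI job if the command fails."""
    result = subprocess.run(
        build_sync_command(pipeline), capture_output=True, text=True
    )
    if result.returncode != 0:  # assumption: non-zero exit signals a failure
        sys.stderr.write(result.stderr)
        sys.exit(result.returncode)
    return parse_events(result.stdout)
```

Calling `run_check("postgres_iceberg")` from a CI step would then either return the parsed events or end the job with the command's exit status.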

Defaults for discovered tables

Use cdc.default when many source tables share the same business key convention.

```yaml
cdc:
  default:
    business_key_columns:
      - id
```

Skippr applies this contract to namespaces discovered during sync, so you do not need to list every table ahead of time.

Override one table

Use cdc.namespaces when one table has a different natural key.

```yaml
cdc:
  default:
    business_key_columns:
      - id
  namespaces:
    public.order_lines:
      business_key_columns:
        - order_id
        - line_id
```
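One way to read the precedence between `cdc.default` and `cdc.namespaces` is as a lookup with a fallback. The sketch below is a conceptual model only, not Skippr's implementation: `resolve_business_key` is a hypothetical helper, and `public.customers` is an invented table name standing in for any namespace without an override.

```python
def resolve_business_key(cdc: dict, namespace: str) -> list[str]:
    """Return the per-namespace key if one is set, otherwise the default key."""
    override = cdc.get("namespaces", {}).get(namespace, {})
    if "business_key_columns" in override:
        return list(override["business_key_columns"])
    return list(cdc["default"]["business_key_columns"])


# The same structure as the YAML recipe, expressed as a Python dict.
cdc = {
    "default": {"business_key_columns": ["id"]},
    "namespaces": {
        "public.order_lines": {"business_key_columns": ["order_id", "line_id"]},
    },
}

print(resolve_business_key(cdc, "public.order_lines"))  # ['order_id', 'line_id']
print(resolve_business_key(cdc, "public.customers"))    # ['id']
```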

S3 JSON envelope CDC

If S3 JSON records share envelope fields such as tenant_id and entity_id, set them as the default business key and override only the exceptions under cdc.namespaces.

```yaml
cdc:
  default:
    business_key_columns:
      - tenant_id
      - entity_id
```
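To see why a composite key matters for envelope records, consider how final-state output could be derived from a stream of JSON records. This is a rough illustration under stated assumptions: records are processed in arrival order, a later record replaces an earlier one with the same key, and the `status` field and `final_state` helper are hypothetical.

```python
def final_state(records: list[dict], key_columns: list[str]) -> dict[tuple, dict]:
    """Keep the last record seen for each business key."""
    latest: dict[tuple, dict] = {}
    for record in records:
        key = tuple(record[column] for column in key_columns)
        latest[key] = record  # a later record replaces the earlier one
    return latest


records = [
    {"tenant_id": "t1", "entity_id": "e1", "status": "new"},
    {"tenant_id": "t2", "entity_id": "e1", "status": "new"},
    {"tenant_id": "t1", "entity_id": "e1", "status": "shipped"},
]

state = final_state(records, ["tenant_id", "entity_id"])
print(len(state))                     # 2
print(state[("t1", "e1")]["status"])  # shipped
```

Keying on `entity_id` alone would merge the two tenants' records into one row, which is why the tenant column is part of the key.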

Iceberg deletes

Iceberg final-state CDC writes data files for inserts and updates, and equality delete files for deletes. A single-row delete therefore does not rewrite the large Parquet file that contains the row; Athena applies the Iceberg delete file when querying the table and filters the deleted row out at read time.
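The read-time behavior of equality deletes can be modeled in a few lines. This is a simplified sketch of the idea, not how Athena or Skippr is implemented: the class and function names are hypothetical. It follows the Iceberg table specification's rule that an equality delete applies only to data files with a lower sequence number, so a row re-inserted after the delete stays visible.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    sequence_number: int
    rows: list[dict]


@dataclass
class EqualityDeleteFile:
    sequence_number: int
    key_columns: list[str]
    deleted_keys: set[tuple]


def scan(data_files: list[DataFile], delete_files: list[EqualityDeleteFile]) -> list[dict]:
    """Return live rows, applying equality deletes while reading."""
    live = []
    for data_file in data_files:
        for row in data_file.rows:
            deleted = any(
                # A delete only affects data written before it.
                data_file.sequence_number < d.sequence_number
                and tuple(row[c] for c in d.key_columns) in d.deleted_keys
                for d in delete_files
            )
            if not deleted:
                live.append(row)
    return live


old_file = DataFile(1, [{"id": 1}, {"id": 2}, {"id": 3}])
delete = EqualityDeleteFile(2, ["id"], {(2,)})
new_file = DataFile(3, [{"id": 2}])  # the same key inserted again later

print(scan([old_file], [delete]))            # [{'id': 1}, {'id': 3}]
print(scan([old_file, new_file], [delete]))  # [{'id': 1}, {'id': 3}, {'id': 2}]
print(len(old_file.rows))                    # 3
```

The original data file still holds all three rows; only the small delete file was added, and the filtering happens when the table is read.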