How It Works

When you run skippr run, the CLI moves through a short public pipeline: discover the source, sync raw data, draft dbt assets, and validate the result.

The pipeline

Your Source
  │  discover ── inspect metadata and shape the destination
  ▼
Discover
  │  sync ── move raw rows into bronze tables
  ▼
Bronze Tables
  │  model ── draft silver/gold dbt assets
  ▼
Reviewable dbt Project
  │  validate ── compile and run against the destination
  ▼
Silver and Gold Models

Each phase runs automatically. You see real-time progress in the terminal UI (or structured logs in CI).

How this maps to the CLI phases

Public step    CLI phases
Discover       Discover
Sync           Sync, Verify
Model          Plan, Author
Validate       Validate, Review
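
The mapping above can be captured as a simple lookup, which is handy when correlating the public step names with phase names that appear in CLI logs. This dict is illustrative only, not part of the skippr package:

```python
# Illustrative mapping of public pipeline steps to internal CLI phases,
# taken from the table above. Not part of skippr's actual API.
PUBLIC_TO_CLI_PHASES = {
    "discover": ["Discover"],
    "sync": ["Sync", "Verify"],
    "model": ["Plan", "Author"],
    "validate": ["Validate", "Review"],
}

def cli_phases_for(step: str) -> list[str]:
    """Return the CLI phases that make up a public pipeline step."""
    return PUBLIC_TO_CLI_PHASES[step.lower()]
```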

What happens at each step

  1. Discover -- reads source metadata such as table names, column names, and types. Destination mapping is determined here, using deterministic logic rather than model output.
  2. Sync -- extracts rows and files from the source and writes them into bronze tables in your destination.
  3. Model -- drafts a dbt project with source definitions, staging models, and business-facing models for review.
  4. Validate -- runs the generated dbt project against the destination to confirm that the output materialises cleanly.
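
The four steps above can be sketched as a sequence of functions, where each phase's output feeds the next. Every function and data-structure name here is hypothetical; skippr's real internals differ:

```python
# A minimal sketch of the four-phase flow described above. All names
# are hypothetical -- this is not skippr's implementation.

def discover(source: dict) -> dict:
    # 1. Read source metadata; destination mapping is deterministic,
    #    derived from metadata rather than model output.
    return {"tables": sorted(source["tables"])}

def sync(source: dict, metadata: dict) -> dict:
    # 2. Extract rows and write them into bronze tables, one per source table.
    return {t: source["rows"].get(t, []) for t in metadata["tables"]}

def model(metadata: dict) -> dict:
    # 3. Draft a reviewable dbt project: one staging model per table.
    return {"staging_models": [f"stg_{t}" for t in metadata["tables"]]}

def validate(project: dict, bronze: dict) -> None:
    # 4. Confirm each generated model materialises against real bronze data.
    for m in project["staging_models"]:
        assert m.removeprefix("stg_") in bronze

def run_pipeline(source: dict) -> dict:
    metadata = discover(source)
    bronze = sync(source, metadata)
    project = model(metadata)
    validate(project, bronze)
    return project
```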

Incremental by default

Re-running skippr run on an existing project doesn't start from scratch:

  • Data sync -- offsets are tracked internally. Only new and changed rows are extracted and loaded.
  • dbt models -- existing models are preserved. The agent updates or adds new models as the source evolves.

This means you can run the same pipeline on a schedule and it behaves like a proper incremental ETL -- no custom state management required.
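
Offset tracking of this kind is commonly implemented as a high-watermark per table: remember the largest cursor value seen, and only extract rows beyond it. A generic sketch, assuming a per-table state dict and an `updated_at` cursor column (neither is skippr's actual state format):

```python
# Generic high-watermark incremental extraction, illustrating the
# behaviour described above. State format is hypothetical.

def extract_incremental(rows, state, table, cursor="updated_at"):
    """Return only rows newer than the stored watermark, then advance it."""
    watermark = state.get(table)
    new_rows = [r for r in rows if watermark is None or r[cursor] > watermark]
    if new_rows:
        state[table] = max(r[cursor] for r in new_rows)
    return new_rows
```

Running the same extraction twice against unchanged source data returns nothing the second time, which is what makes scheduled re-runs cheap.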

Data privacy

Row-level data only ever exists in two places: the machine running skippr, and your warehouse.

  • Source data is read locally and written directly to the warehouse API (Snowflake REST, BigQuery API, Postgres wire protocol, etc.). It is never sent to Skippr or any third party.
  • AI modeling uses only metadata (table names, column names, types) by default. Data samples can optionally be sent to improve model quality but are off by default.
  • The Skippr cloud path handles authentication and control-plane services. It receives metadata needed to operate the service, not row-level source or warehouse data.
  • Credentials live in environment variables, never in config files.
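
Reading credentials from the environment rather than from config files can look like the sketch below. The variable names are illustrative, not the ones skippr actually uses:

```python
import os

# Illustrative only: load warehouse credentials from environment
# variables instead of a config file. Variable names are hypothetical.
def load_credentials() -> dict:
    creds = {
        "user": os.environ.get("WAREHOUSE_USER"),
        "password": os.environ.get("WAREHOUSE_PASSWORD"),
    }
    missing = [k for k, v in creds.items() if not v]
    if missing:
        raise RuntimeError(f"missing credentials: {', '.join(missing)}")
    return creds
```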

Output structure

The pipeline creates schemas in your warehouse using the project name:

Tier     Schema name                                  Contents
Bronze   <warehouse_schema> (e.g. RAW)                Raw extracted data
Silver   <project>_silver (e.g. my_project_silver)    Staged, cleaned, typed
Gold     <project>_gold (e.g. my_project_gold)        Business-ready models
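
The naming convention in the table above is a simple suffix rule on the project name, which a sketch like this captures (the function itself is illustrative, not part of skippr):

```python
# Derive warehouse schema names from the project name, following the
# bronze/silver/gold convention described above. Illustrative only.
def schema_names(project: str, warehouse_schema: str = "RAW") -> dict:
    return {
        "bronze": warehouse_schema,          # raw extracted data
        "silver": f"{project}_silver",       # staged, cleaned, typed
        "gold": f"{project}_gold",           # business-ready models
    }
```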

What you can inspect

  • Bronze, silver, and gold objects in your destination
  • The generated dbt project in your working directory
  • Connector guides for auth, permissions or network requirements, and troubleshooting