Plans012 2026 02 23 Pipeline Workflow Unification
Pipeline Workflow Unification
Overview
Unify the scattered pipeline workflow into a frequency-based command structure: seed (bootstrap), sync (monthly data refresh), publish (dashboard data), and report (generate outputs). Add missing transform and publish stages, implement metadata tracking, and standardize CLI flags.
Goals
- Reorganize CLI commands by usage frequency: `seed`, `sync`, `publish`, `report`
- Add `transform` command as dbt wrapper (`uv run dbt run`)
- Add `publish` command for DuckDB → LibSQL data sync
- Combine `format` + `present` into unified `report` command with `--type` flag
- Support month ranges (`--month 1-3`) for quarterly reports
- Normalize `--entity` and `--unit` to uppercase (case-insensitive input)
- Track pipeline runs in DuckDB `_pipeline_runs` table
- Add staleness check (info-only warning) before report generation
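The uppercase normalization goal can be sketched as a small pure function applied once at parse time. This is a minimal sketch: `ScopeFlags` and `normalizeScope` are hypothetical names, and rejecting an empty `--entity` is an assumption rather than something this plan specifies.

```typescript
// Hypothetical helper: uppercase identifier flags once, at parse time.
// The type and function names are illustrative, not taken from the pipeline package.
interface ScopeFlags {
  entity: string;
  unit?: string;
}

function normalizeScope(raw: { entity: string; unit?: string }): ScopeFlags {
  const entity = raw.entity.trim().toUpperCase();
  // Assumption: an empty entity is treated as a usage error.
  if (entity.length === 0) {
    throw new Error("--entity must not be empty");
  }
  // Only set `unit` when it was supplied, so callers can tell "absent" from "".
  if (raw.unit === undefined) {
    return { entity };
  }
  return { entity, unit: raw.unit.trim().toUpperCase() };
}
```

Normalizing at the parse boundary means every later stage can compare entity and unit values with plain equality.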
Non-Goals
- Yearly report templates (future scope)
- Weekly report type (future scope)
- Interactive prompts for staleness (just warn, allow `--force`)
- Combined flags (`--publish` on sync, `--report` on sync): too magical
- Upsert publish strategy (using delete + insert per month instead)
Phases
- Phase 1: CLI restructure (new commands, flag parsing, case normalization)
- Phase 2: Transform & Publish (implement missing pipeline stages)
- Phase 3: Report unification (combine format + present, add `--type`)
- Phase 4: Metadata tracking (`_pipeline_runs` table and staleness checks)
- Phase 5: Documentation (update AGENTS.md, architecture.md, workflow docs)
Success
- `pnpm seed --entity ions` seeds lookup tables (case-insensitive)
- `pnpm sync --entity IONS --year 2026` runs extract → load → validate → transform
- `pnpm publish --entity IONS --year 2026 --month 2` pushes data to LibSQL
- `pnpm report --entity IONS --year 2026 --month 2 --type monthly` generates report.json + PPTX/PDF
- `pnpm report --entity IONS --year 2026 --month 1-3 --type quarterly` generates Q1 report
- Pipeline runs tracked in `_pipeline_runs` table with timestamps
- Staleness warning shown before report if data is stale
- `--force` flag skips staleness check
- Documentation updated (AGENTS.md, architecture.md)
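The `sync` stage order listed above (extract → load → validate → transform) could be driven by a small sequential runner. The sketch below is illustrative only: `Stage`, `SyncResult`, and `runSync` are hypothetical names, stages are synchronous for brevity (real stages would likely be async), and stopping at the first failure is an assumption.

```typescript
// Stage names come from the plan; the Stage type and runSync helper are hypothetical.
type StageName = "extract" | "load" | "validate" | "transform";

interface Stage {
  name: StageName;
  run: () => void; // throws on failure
}

interface SyncResult {
  completed: StageName[];
  failed?: { stage: StageName; message: string };
}

function runSync(stages: Stage[]): SyncResult {
  const completed: StageName[] = [];
  for (const stage of stages) {
    try {
      stage.run();
      completed.push(stage.name);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Assumption: stop at the first failure so later stages never run on partial input.
      return { completed, failed: { stage: stage.name, message } };
    }
  }
  return { completed };
}
```

Returning the list of completed stages alongside the failure makes it straightforward to record how far a run got.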
Requirements
- Existing `@packages/pipeline` package structure
- DuckDB node API for `_pipeline_runs` table
- `@libsql/client` for publish operations (already a dependency)
- `uv` installed for dbt execution
- Drizzle schema for target LibSQL tables (`commerce_order`, `finance_transaction`, etc.)
Context
Why This Approach
- Frequency-based grouping matches user mental model (bootstrap vs recurring vs on-demand)
- Separate `sync` and `publish` provides control over when dashboard updates
- Delete + insert per month is simpler and handles record deletions automatically
- Metadata in DuckDB keeps pipeline self-contained (no external dependencies)
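To make the delete + insert strategy concrete, the sketch below builds the statement list for one month of one table. It is a sketch under assumptions: the `entity`, `year`, and `month` filter columns, the `Statement` shape, and `buildPublishStatements` are all hypothetical, and the returned list is meant to be executed inside a single transaction by whatever client performs the publish.

```typescript
// All column names and helper names here are hypothetical placeholders.
interface Statement {
  sql: string;
  args: (string | number)[];
}

interface MonthScope {
  entity: string;
  year: number;
  month: number;
}

type Row = Record<string, string | number>;

function buildPublishStatements(table: string, scope: MonthScope, rows: Row[]): Statement[] {
  // No data for the month: emit nothing, so publish is a no-op rather than an error.
  if (rows.length === 0) return [];

  // `table` is interpolated into SQL, so it must come from a fixed list, never user input.
  const statements: Statement[] = [
    {
      sql: `DELETE FROM ${table} WHERE entity = ? AND year = ? AND month = ?`,
      args: [scope.entity, scope.year, scope.month],
    },
  ];
  for (const row of rows) {
    const entries = Object.entries(row);
    const columns = entries.map(([column]) => column).join(", ");
    const placeholders = entries.map(() => "?").join(", ");
    statements.push({
      sql: `INSERT INTO ${table} (${columns}) VALUES (${placeholders})`,
      args: entries.map(([, value]) => value),
    });
  }
  return statements;
}
```

Because the delete always precedes the inserts for the same month, rows removed upstream disappear from the target without any row-by-row comparison.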
Key Constraints
- dbt runs all models (no per-month filtering) — transform is always full refresh
- LibSQL publish requires FK resolution (denormalized dbt output → normalized LibSQL schema)
- Present service uses pptxgenjs constructor interop (existing quirk from Plan 011)
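The FK resolution constraint can be illustrated with a lookup step that swaps a natural key for a surrogate id before insert. Every shape below is hypothetical (`DenormalizedOrder`, `NormalizedOrder`, the `customerName` key, and `resolveCustomerIds`), since this plan does not show the dbt output or the Drizzle schema; it only sketches the general technique.

```typescript
// Hypothetical shapes: the real dbt output and Drizzle schema are not shown in this plan.
interface DenormalizedOrder {
  orderNo: string;
  customerName: string; // natural key carried by the denormalized output
  amount: number;
}

interface NormalizedOrder {
  orderNo: string;
  customerId: number; // foreign key expected by the normalized table
  amount: number;
}

function resolveCustomerIds(
  rows: DenormalizedOrder[],
  customerIdByName: Map<string, number>,
): { resolved: NormalizedOrder[]; unresolved: DenormalizedOrder[] } {
  const resolved: NormalizedOrder[] = [];
  const unresolved: DenormalizedOrder[] = [];
  for (const row of rows) {
    const customerId = customerIdByName.get(row.customerName);
    if (customerId === undefined) {
      // Surface the row instead of inserting a dangling reference.
      unresolved.push(row);
    } else {
      resolved.push({ orderNo: row.orderNo, customerId, amount: row.amount });
    }
  }
  return { resolved, unresolved };
}
```

Returning unresolved rows separately lets the caller decide whether to warn or abort the publish.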
Edge Cases
- Month range `--month 1-3` should validate start <= end
- Quarterly report with missing month data should warn, not fail
- Entity/unit normalization should handle mixed case (`ions`, `IONS`, `Ions`)
- Publish with no data for month should be a no-op (not error)
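The month-range edge cases above can be sketched as a parser plus a helper that lists missing months, so a quarterly report can warn instead of failing. `MonthRange`, `parseMonthRange`, and `missingMonths` are hypothetical names, and the exact error wording is an assumption.

```typescript
interface MonthRange {
  start: number;
  end: number;
}

// Accepts a single month ("2") or an inclusive range ("1-3").
function parseMonthRange(input: string): MonthRange {
  const match = /^(\d{1,2})(?:-(\d{1,2}))?$/.exec(input.trim());
  if (match === null) {
    throw new Error(`Invalid --month value: "${input}"`);
  }
  const start = Number(match[1]);
  const end = match[2] === undefined ? start : Number(match[2]);
  for (const month of [start, end]) {
    if (month < 1 || month > 12) {
      throw new Error(`Month out of range (1-12): ${month}`);
    }
  }
  if (start > end) {
    throw new Error(`Month range start (${start}) must be <= end (${end})`);
  }
  return { start, end };
}

// Months in the range with no data; the caller warns about these and continues.
function missingMonths(range: MonthRange, available: Set<number>): number[] {
  const missing: number[] = [];
  for (let month = range.start; month <= range.end; month++) {
    if (!available.has(month)) missing.push(month);
  }
  return missing;
}
```

For example, `parseMonthRange("1-3")` yields `{ start: 1, end: 3 }`, and `missingMonths` on that range with data for months 1 and 3 returns `[2]`.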
Tradeoffs
- Transform always runs all models (acceptable — dbt is fast, idempotent)
- Delete + insert may briefly show incomplete data (acceptable — operation is fast)
- Staleness check is info-only (user requested no interactive prompts for now)
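An info-only staleness check reduces to a function that returns a warning string or nothing, leaving the caller to print it and continue. In this sketch the `maxAgeDays` threshold, the `StalenessInput` shape, and the message wording are assumptions; the plan only states that the check warns and that `--force` skips it.

```typescript
interface StalenessInput {
  lastSyncedAt: Date | null; // null when no sync run has been recorded
  now: Date;
  maxAgeDays: number; // assumption: the plan does not fix a threshold
  force: boolean; // mirrors the --force flag
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a warning message, or null when nothing should be shown.
function stalenessWarning(input: StalenessInput): string | null {
  if (input.force) return null;
  if (input.lastSyncedAt === null) {
    return "No sync run recorded; report data may be missing or stale.";
  }
  const ageDays = Math.floor((input.now.getTime() - input.lastSyncedAt.getTime()) / MS_PER_DAY);
  if (ageDays <= input.maxAgeDays) return null;
  return `Data was last synced ${ageDays} days ago (threshold: ${input.maxAgeDays} days).`;
}
```

Since the function never throws and never prompts, report generation proceeds regardless of the result.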
Skills
- None required — this is core pipeline TypeScript work
Boundaries
- Always: Run in transaction for publish (atomic delete + insert)
- Always: Normalize entity/unit to uppercase at parse time
- Always: Log pipeline runs to `_pipeline_runs` table
- Ask first: Schema changes to LibSQL tables (may need migrations)
- Ask first: Changes to existing root package.json scripts
- Never: Delete existing commands without deprecation path
- Never: Change dbt model outputs (only read from them)
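The "always log pipeline runs" boundary could be enforced by wrapping each command in a helper that records start time, finish time, and outcome. The sketch below is hypothetical throughout: the `PipelineRun` fields are illustrative rather than the actual `_pipeline_runs` columns, and `record` stands in for whatever writes the row to DuckDB.

```typescript
// Field names are hypothetical; the plan only states that runs are tracked with timestamps.
interface PipelineRun {
  command: string;
  entity: string;
  startedAt: string; // ISO 8601
  finishedAt: string; // ISO 8601
  status: "success" | "failed";
  error?: string;
}

function withRunLog<T>(
  meta: { command: string; entity: string },
  record: (run: PipelineRun) => void, // stand-in for the DuckDB insert
  work: () => T,
  now: () => Date = () => new Date(),
): T {
  const startedAt = now().toISOString();
  let result: T;
  try {
    result = work();
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    record({ ...meta, startedAt, finishedAt: now().toISOString(), status: "failed", error });
    throw err; // logging must not swallow the failure
  }
  record({ ...meta, startedAt, finishedAt: now().toISOString(), status: "success" });
  return result;
}
```

Injecting `record` and `now` keeps the wrapper testable without a database, and recording failed runs as well gives the staleness check an accurate picture of the last successful sync.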
Questions
- Should `clean-report` command be kept as-is or folded into `report`? → Keep as-is (different purpose: QA workbook vs presentation)
- Should `run` command be deprecated or kept as alias? → Deprecate with warning (backward compatible, guides users to `sync`)
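The decision to deprecate `run` with a warning can be sketched as an alias table consulted before dispatch. `DEPRECATED_ALIASES` and `resolveCommand` are hypothetical names, and the warning text is an assumption.

```typescript
// Command names come from the plan; the resolver itself is a hypothetical sketch.
const DEPRECATED_ALIASES = new Map<string, string>([["run", "sync"]]);

function resolveCommand(name: string): { command: string; warning?: string } {
  const target = DEPRECATED_ALIASES.get(name);
  if (target === undefined) {
    return { command: name };
  }
  return {
    command: target,
    warning: `"${name}" is deprecated; use "${target}" instead.`,
  };
}
```

The caller prints `warning` when present and then dispatches `command`, so existing scripts that call `run` keep working while users are pointed at `sync`.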