Quickstart Guide
This guide helps new developers and AI agents understand the Atlas codebase quickly.
First 5 Minutes
1. Understand what Atlas is
Atlas is a local-first data pipeline and reporting system that:
- Loads source data from xlsx files into DuckDB (analytical) and LibSQL (operational)
- Transforms raw data through dbt SQL models into analysis-ready mart tables
- Assembles a structured report JSON from the mart data
- Renders the report as editable PPTX slides, PDF, and a live web dashboard
- Supports multiple reporting cadences: daily, weekly, monthly, quarterly, and yearly
2. Read these files in order
| Order | File | Why |
|---|---|---|
| 1 | @plan/concept.md | What Atlas is, the problem it solves, and why it's built this way |
| 2 | @plan/workflow.md | End-to-end data flow from xlsx to outputs |
| 3 | @plan/glossary.md | Indonesian → English term mappings used throughout the codebase |
| 4 | @plan/registry.md | The actual entity, unit, and channel values used in data |
| 5 | @plan/architecture.md | Pipeline design, tech stack, and deployment phases |
| 6 | @plan/analytics.md | DuckDB layer definitions and mart SQL specs |
| 7 | @plan/model.md | LibSQL operational schema |
| 8 | NETWORK.yml | Ports and domains for all services |
| 9 | AGENTS.md | Commands, commit conventions, and workspace rules |
3. Understand the directory structure
ions/
├── @plan/ # Planning docs, architecture, glossary, active plans
├── @core/ # Shared infra: TypeScript presets, ESLint config, UI, AI sync
├── @packages/ # Business-logic packages
│ ├── pipeline/ # Unified CLI — seed, sync, publish, report (+ stage-level debug commands)
│ └── db/ # Drizzle ORM schema for LibSQL operational layer
├── @python/ # Python workspaces
│ └── analytics/ # dbt project — staging, intermediate, mart SQL models
├── @services/ # Deployable applications
│ ├── plan/ # Fumadocs docs site serving @plan/ (plan.atlas.prata.ma)
│ ├── present/ # PPTX + PDF generator — reads report.json, produces slides
│ └── dashboard/ # TanStack Start web dashboard — live mart + record views
├── @source/ # Source data directory
│ ├── raw/ # Raw xlsx source files (gitignored data)
│ ├── clean/ # Canonical CSV snapshots (extracted from xlsx)
│ └── config/ # Per-year YAML source configs (ions-2026.yaml, etc.)
├── output/ # Generated reports and validation (gitignored)
│ ├── monthly/ # report.json, .pptx, .pdf per month
│ ├── weekly/ # Weekly report outputs
│ └── validation/ # Validation JSON reports
├── atlas.db # DuckDB analytical database (gitignored)
├── atlas-ops.db # LibSQL operational database (gitignored)
├── NETWORK.yml # Port and domain configuration (source of truth)
└── AGENTS.md # AI agent instructions and workspace rules
Prerequisites
| Tool | Purpose | Check |
|---|---|---|
| pnpm | Package manager and workspace tooling | pnpm --version |
| uv | Python package manager for dbt | uv --version |
| DuckDB CLI | Inspect atlas.db directly | duckdb --version |
| LibreOffice | PDF conversion from PPTX | libreoffice --version |
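Before installing anything, it can help to see which of the tools in the table are already on PATH. The following is a minimal sketch, not part of the Atlas repo: `missing_tools` is a hypothetical helper name, and the binary names are assumed to match the "Check" column above.

```shell
# Print the name of every tool passed as an argument that is not on PATH.
# Prints nothing when all tools can be resolved.
missing_tools() {
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command can be resolved
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}
```

Calling `missing_tools pnpm uv duckdb libreoffice` lists only the tools that still need to be installed; empty output means all four were found.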
# Enable pnpm via Corepack
corepack enable
corepack prepare pnpm@10.29.3 --activate
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install workspace dependencies
pnpm install
# Set up dbt Python environment
cd @python/analytics && uv sync
Running the Full Pipeline
Monthly report (e.g. February 2026)
# 1. Seed LibSQL lookup tables (first time only)
pnpm run seed -- --entity IONS
# 2. Sync source data (extract -> load -> validate -> transform)
pnpm run sync -- --entity IONS --year 2026
# 3. Publish month to LibSQL operational tables
pnpm run publish -- --entity IONS --year 2026 --month 2
# 4. Generate monthly report artifacts (JSON + PPTX + PDF)
pnpm run report -- --entity IONS --year 2026 --month 2 --type monthly
Output: output/monthly/IONS-2026-02.json, output/monthly/IONS-2026-02.pptx, output/monthly/IONS-2026-02.pdf
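The output paths above suggest a naming pattern of `output/monthly/<ENTITY>-<YEAR>-<MM>.<ext>`. The sketch below reproduces that pattern so the expected artifacts can be listed or checked after a run. The pattern is inferred from this single example and is an assumption, and `monthly_report_paths` is a hypothetical helper, not a pipeline command.

```shell
# Hypothetical helper: print the three artifact paths for a monthly report.
# The naming pattern is inferred from the example output, not a documented contract.
monthly_report_paths() {
  entity=$1
  year=$2
  month=${3#0}   # drop one leading zero so "08" is not parsed as octal by printf
  for ext in json pptx pdf; do
    printf 'output/monthly/%s-%s-%02d.%s\n' "$entity" "$year" "$month" "$ext"
  done
}
```

For example, `monthly_report_paths IONS 2026 2 | while read -r f; do [ -f "$f" ] || echo "missing: $f"; done` would report any artifact the report step did not produce.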
Single-month incremental sync
# Sync only February data (faster than full-year sync)
pnpm run sync -- --entity IONS --year 2026 --month 2
Quarterly and yearly reports
pnpm run report -- --entity IONS --year 2026 --month 1-3 --type quarterly
pnpm run report -- --entity IONS --year 2026 --type yearly
Start the dashboard
pnpm run --filter @services/dashboard dev
# → http://localhost:15002
Start the plan docs site
pnpm run --filter @services/plan dev
# → http://localhost:15001
Service Ports (from NETWORK.yml)
| Service | Port | Local domain |
|---|---|---|
| plan | 15001 | plan.atlas.test |
| dashboard | 15002 | dashboard.atlas.test |
| slides | 15003 | slides.atlas.test |
| api | 15004 | api.atlas.test |
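For scripts that need a service URL, the table can be mirrored in a small lookup. This is only a sketch: `service_url` is a hypothetical helper, the ports are hard-coded copies of the table (NETWORK.yml remains the source of truth), and the `localhost` URLs for slides and api are assumed by analogy with the plan and dashboard URLs shown earlier.

```shell
# Print the local dev URL for a service, using the ports from the table.
# Hard-coded copy for illustration; NETWORK.yml stays the source of truth.
service_url() {
  case "$1" in
    plan)      port=15001 ;;
    dashboard) port=15002 ;;
    slides)    port=15003 ;;
    api)       port=15004 ;;
    *)
      echo "unknown service: $1" >&2
      return 1
      ;;
  esac
  printf 'http://localhost:%s\n' "$port"
}
```

Unknown service names return a non-zero status, so a typo fails loudly instead of yielding an empty URL.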
Common Tasks
Add a new year's data
- Add @source/config/ions-{year}.yaml with file paths, sheet names, column mappings, and header row offsets for the new year
- Place the xlsx files in @source/raw/
- Run: pnpm run extract -- --entity IONS --year {year}
- Run: pnpm run sync -- --entity IONS --year {year}
- Run: cd @python/analytics && uv run dbt run
The new year's data is now available in all mart queries and dashboard views.
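The three command steps above can be generated for any year with a small helper. This is a dry-run sketch under stated assumptions: `new_year_commands` is a hypothetical name, it only prints the commands rather than running them, and it assumes the config file and xlsx files from the first two steps are already in place.

```shell
# Print, in order, the commands from the steps above for a given year.
# Nothing is executed here; review the output before running it.
new_year_commands() {
  year=$1
  entity=${2:-IONS}   # entity defaults to IONS, as in the steps above
  printf 'pnpm run extract -- --entity %s --year %s\n' "$entity" "$year"
  printf 'pnpm run sync -- --entity %s --year %s\n' "$entity" "$year"
  # Subshell keeps the directory change from leaking into later commands
  printf '(cd @python/analytics && uv run dbt run)\n'
}
```

Running `new_year_commands 2027` shows the commands; piping the output to `sh -e` from the repo root would execute them and stop at the first failure.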
Inspect raw data after sync
duckdb atlas.db -c "SHOW TABLES;"
duckdb atlas.db -c "SELECT COUNT(*) FROM raw_transactions;"
duckdb atlas.db -c "SELECT * FROM mart_revenue LIMIT 10;"
Verify dbt models
cd @python/analytics
uv run dbt run
uv run dbt test
uv run dbt docs generate && uv run dbt docs serve
Regenerate migrations after schema change
cd @packages/db
pnpm exec drizzle-kit generate
pnpm exec drizzle-kit push
Understanding the @plan/ Structure
Architecture vs. Plans
- architecture.md is normative: the source of truth for how the system is designed
- Plans in @plan/plans/ are execution artifacts: they implement architecture decisions
- If a plan conflicts with architecture, review architecture first
Active Plans (@plan/plans/)
001 - 2026-02-20 - Plan Service/ # @services/plan — completed
002 - 2026-02-20 - Sync and Source Config/ # @packages/sync (now @packages/pipeline) — completed
003 - 2026-02-20 - Transform Layer/ # @python/analytics — completed
004 - 2026-02-20 - Format Layer/ # @packages/format (now @packages/pipeline) — completed
005 - 2026-02-20 - Operational Database/ # @packages/db — completed
006 - 2026-02-20 - Present Service/ # @services/present — completed
007 - 2026-02-20 - Dashboard Service/ # @services/dashboard — completed
008 - 2026-02-21 - Package Restructure/ # Merged sync+format → pipeline — completed
009 - 2026-02-21 - Data Quality Pipeline/ # Extract, validate, canonical CSV — completed
010 - 2026-02-22 - Slides Service/ # @services/slides — completed
011 - 2026-02-22 - Workers Deployment/ # @services/api + Worker auth — completed
012 - 2026-02-23 - Pipeline Workflow Unification/ # seed/sync/publish/report CLI + metadata/staleness — completed
Each plan folder contains five files:
plan/
├── Plan.md # Goals, phases, success criteria, boundaries
├── Tasks.md # Task breakdown with T-XXX IDs and acceptance criteria
├── Progress.md # Append-only execution log
├── Review.md # Append-only compliance review log
└── Summary.md # Current state for AI session handoff
Plan status meanings:
- draft: Defined, not yet started
- active: Currently being executed
- paused: Work temporarily stopped
- completed: All tasks done, success criteria met
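Since every plan folder is expected to hold the same five files, a quick completeness check is easy to script. The sketch below is illustrative only: `missing_plan_files` is a hypothetical helper, and the file names are taken from the listing above.

```shell
# Print the name of each expected plan file that is missing from a plan folder.
# File names come from the five-file listing; the function itself is hypothetical.
missing_plan_files() {
  plan_dir=$1
  for name in Plan.md Tasks.md Progress.md Review.md Summary.md; do
    [ -f "$plan_dir/$name" ] || printf '%s\n' "$name"
  done
}
```

For example, `missing_plan_files "@plan/plans/012 - 2026-02-23 - Pipeline Workflow Unification"` prints nothing when the folder is complete.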
Getting Help
- What is Atlas? → @plan/concept.md
- End-to-end workflow? → @plan/workflow.md
- Architecture decisions → @plan/architecture.md
- Term unclear? → @plan/glossary.md
- Which unit code is which? → @plan/registry.md
- How are marts defined? → @plan/analytics.md
- What's the DB schema? → @plan/model.md
- Which port does X run on? → NETWORK.yml
- Build/lint/test commands → AGENTS.md