Atlas Plan

Quickstart Guide

This guide helps new developers and AI agents understand the Atlas codebase quickly.


First 5 Minutes

1. Understand what Atlas is

Atlas is a local-first data pipeline and reporting system that:

  • Loads source data from xlsx files into DuckDB (analytical) and LibSQL (operational)
  • Transforms raw data through dbt SQL models into analysis-ready mart tables
  • Assembles a structured report JSON from the mart data
  • Renders the report as editable PPTX slides, PDF, and a live web dashboard
  • Supports any reporting cadence, for example daily, weekly, or monthly

2. Read these files in order

| Order | File | Why |
|-------|------|-----|
| 1 | @plan/concept.md | What Atlas is, the problem it solves, and why it's built this way |
| 2 | @plan/workflow.md | End-to-end data flow from xlsx to outputs |
| 3 | @plan/glossary.md | Indonesian → English term mappings used throughout the codebase |
| 4 | @plan/registry.md | The actual entity, unit, and channel values used in data |
| 5 | @plan/architecture.md | Pipeline design, tech stack, and deployment phases |
| 6 | @plan/analytics.md | DuckDB layer definitions and mart SQL specs |
| 7 | @plan/model.md | LibSQL operational schema |
| 8 | NETWORK.yml | Ports and domains for all services |
| 9 | AGENTS.md | Commands, commit conventions, and workspace rules |

3. Understand the directory structure

ions/
├── @plan/              # Planning docs, architecture, glossary, active plans
├── @core/              # Shared infra: typescript presets, ESLint config, UI, AI sync
├── @packages/          # Business-logic packages
│   ├── pipeline/       # Unified CLI — seed, sync, publish, report (+ stage-level debug commands)
│   └── db/             # Drizzle ORM schema for LibSQL operational layer
├── @python/            # Python workspaces
│   └── analytics/      # dbt project — staging, intermediate, mart SQL models
├── @services/          # Deployable applications
│   ├── plan/           # Fumadocs docs site serving @plan/ (plan.atlas.prata.ma)
│   ├── present/        # PPTX + PDF generator — reads report.json, produces slides
│   └── dashboard/      # TanStack Start web dashboard — live mart + record views
├── @source/            # Source data directory
│   ├── raw/            # Raw xlsx source files (gitignored data)
│   ├── clean/          # Canonical CSV snapshots (extracted from xlsx)
│   └── config/         # Per-year YAML source configs (ions-2026.yaml, etc.)
├── output/             # Generated reports and validation (gitignored)
│   ├── monthly/        # report.json, .pptx, .pdf per month
│   ├── weekly/         # Weekly report outputs
│   └── validation/     # Validation JSON reports
├── atlas.db            # DuckDB analytical database (gitignored)
├── atlas-ops.db        # LibSQL operational database (gitignored)
├── NETWORK.yml         # Port and domain configuration (source of truth)
└── AGENTS.md           # AI agent instructions and workspace rules

Prerequisites

| Tool | Purpose | Check |
|------|---------|-------|
| pnpm | Package manager and workspace tooling | pnpm --version |
| uv | Python package manager for dbt | uv --version |
| DuckDB CLI | Inspect atlas.db directly | duckdb --version |
| LibreOffice | PDF conversion from PPTX | libreoffice --version |

# Enable pnpm via Corepack
corepack enable
corepack prepare pnpm@10.29.3 --activate

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install workspace dependencies
pnpm install

# Set up dbt Python environment
cd @python/analytics && uv sync
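
Before running the pipeline, it can help to confirm that every prerequisite is on your PATH in one pass. The following is a minimal sketch, not part of the Atlas repository: `check_tools` is a hypothetical helper, and the binary names are taken from the Check column above.

```shell
# Hypothetical helper (not part of Atlas): report which tools are missing from PATH.
# Prints "missing: <tool>" for each absent tool and returns non-zero if any are absent.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (binary names as listed in the prerequisites table):
#   check_tools pnpm uv duckdb libreoffice
```

On some systems the LibreOffice binary may be installed under a different name, so a "missing" result for it is worth checking by hand.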

Running the Full Pipeline

Monthly report (e.g. February 2026)

# 1. Seed LibSQL lookup tables (first time only)
pnpm run seed -- --entity IONS

# 2. Sync source data (extract -> load -> validate -> transform)
pnpm run sync -- --entity IONS --year 2026

# 3. Publish month to LibSQL operational tables
pnpm run publish -- --entity IONS --year 2026 --month 2

# 4. Generate monthly report artifacts (JSON + PPTX + PDF)
pnpm run report -- --entity IONS --year 2026 --month 2 --type monthly

Output: output/monthly/IONS-2026-02.json, output/monthly/IONS-2026-02.pptx, output/monthly/IONS-2026-02.pdf
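
The artifact names above appear to follow the pattern `output/monthly/<ENTITY>-<YEAR>-<MM>.<ext>` with a zero-padded month. As a rough sketch under that assumption, a small helper can compute the expected path so a script can check that a report was produced. `monthly_report_path` is a hypothetical name, not an Atlas command, and the pattern is inferred only from the monthly example shown here.

```shell
# Hypothetical helper: build the expected monthly artifact path.
# Arguments: $1=entity  $2=year  $3=month as a plain number (2, not 02)  $4=extension
monthly_report_path() {
  printf 'output/monthly/%s-%04d-%02d.%s\n' "$1" "$2" "$3" "$4"
}

# Usage: check that the PDF for February 2026 exists.
#   [ -f "$(monthly_report_path IONS 2026 2 pdf)" ] && echo "report found"
```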

Single-month incremental sync

# Sync only February data (faster than full-year sync)
pnpm run sync -- --entity IONS --year 2026 --month 2

Quarterly and yearly reports

pnpm run report -- --entity IONS --year 2026 --month 1-3 --type quarterly
pnpm run report -- --entity IONS --year 2026 --type yearly

Start the dashboard

pnpm run --filter @services/dashboard dev
# → http://localhost:15002

Start the plan docs site

pnpm run --filter @services/plan dev
# → http://localhost:15001

Service Ports (from NETWORK.yml)

| Service | Port | Local domain |
|---------|------|--------------|
| plan | 15001 | plan.atlas.test |
| dashboard | 15002 | dashboard.atlas.test |
| slides | 15003 | slides.atlas.test |
| api | 15004 | api.atlas.test |
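
For quick scripting, the table can be mirrored in a small lookup. This is only an illustrative sketch: `service_url` is a hypothetical helper, the ports are copied from the table above, and NETWORK.yml remains the source of truth. The `http://localhost:<port>` form is confirmed above only for plan and dashboard and is assumed for the other services.

```shell
# Hypothetical helper: map a service name to its local dev URL.
# Ports are hard-coded from the table; keep NETWORK.yml authoritative.
service_url() {
  case "$1" in
    plan)      port=15001 ;;
    dashboard) port=15002 ;;
    slides)    port=15003 ;;
    api)       port=15004 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
  echo "http://localhost:$port"
}

# Usage:
#   service_url dashboard
```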

Common Tasks

Add a new year's data

  1. Add @source/config/ions-{year}.yaml with file paths, sheet names, column mappings, and header row offsets for the new year
  2. Place the xlsx files in @source/raw/
  3. Run pnpm run extract -- --entity IONS --year {year}
  4. Run pnpm run sync -- --entity IONS --year {year}
  5. Run cd @python/analytics && uv run dbt run

The new year's data is now available in all mart queries and dashboard views.
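
Steps 3 to 5 can be chained so that a failure in one step stops the rest. The sketch below assumes the config file and xlsx files from steps 1 and 2 are already in place. The `run` and `add_year` functions and the `DRY_RUN` variable are hypothetical conveniences of this example, not part of the Atlas CLI.

```shell
# Hypothetical wrapper: echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run steps 3-5 for the given year, stopping at the first failure.
add_year() {
  year="$1"
  run pnpm run extract -- --entity IONS --year "$year" &&
  run pnpm run sync -- --entity IONS --year "$year" &&
  run sh -c 'cd @python/analytics && uv run dbt run'
}

# Usage (from the repository root):
#   DRY_RUN=1 add_year 2027   # preview the commands
#   add_year 2027             # execute them
```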

Inspect raw data after sync

duckdb atlas.db -c "SHOW TABLES;"
duckdb atlas.db -c "SELECT COUNT(*) FROM raw_transactions;"
duckdb atlas.db -c "SELECT * FROM mart_revenue LIMIT 10;"

Verify dbt models

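cd @python/analytics
# Build all models
uv run dbt run
# Run the tests defined for the models
uv run dbt test
# Generate the dbt documentation site and serve it locally
uv run dbt docs generate && uv run dbt docs serve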

Regenerate migrations after schema change

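cd @packages/db
# Generate SQL migration files from the current Drizzle schema
pnpm exec drizzle-kit generate
# Push the schema changes directly to the database
pnpm exec drizzle-kit push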

Understanding the @plan/ Structure

Architecture vs. Plans

  • architecture.md is normative — the source of truth for how the system is designed
  • Plans in @plan/plans/ are execution artifacts — they implement architecture decisions
  • If a plan conflicts with architecture, review architecture first

Active Plans (@plan/plans/)

001 - 2026-02-20 - Plan Service/           # @services/plan — completed
002 - 2026-02-20 - Sync and Source Config/ # @packages/sync (now @packages/pipeline) — completed
003 - 2026-02-20 - Transform Layer/        # @python/analytics — completed
004 - 2026-02-20 - Format Layer/           # @packages/format (now @packages/pipeline) — completed
005 - 2026-02-20 - Operational Database/   # @packages/db — completed
006 - 2026-02-20 - Present Service/        # @services/present — completed
007 - 2026-02-20 - Dashboard Service/      # @services/dashboard — completed
008 - 2026-02-21 - Package Restructure/    # Merged sync+format → pipeline — completed
009 - 2026-02-21 - Data Quality Pipeline/  # Extract, validate, canonical CSV — completed
010 - 2026-02-22 - Slides Service/         # @services/slides — completed
011 - 2026-02-22 - Workers Deployment/     # @services/api + Worker auth — completed
012 - 2026-02-23 - Pipeline Workflow Unification/  # seed/sync/publish/report CLI + metadata/staleness — completed

Each plan folder contains five files:

plan/
├── Plan.md       # Goals, phases, success criteria, boundaries
├── Tasks.md      # Task breakdown with T-XXX IDs and acceptance criteria
├── Progress.md   # Append-only execution log
├── Review.md     # Append-only compliance review log
└── Summary.md    # Current state for AI session handoff
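
A quick way to confirm that a plan folder is complete is to check for the five files listed above. This is a minimal sketch, and `check_plan_folder` is a hypothetical helper rather than an existing workspace command.

```shell
# Hypothetical helper: verify that a plan folder contains all five expected files.
# Prints "missing: <path>" for each absent file and returns non-zero if any are absent.
check_plan_folder() {
  dir="$1"
  status=0
  for f in Plan.md Tasks.md Progress.md Review.md Summary.md; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_plan_folder "@plan/plans/001 - 2026-02-20 - Plan Service"
```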

Plan status meanings:

  • draft — Defined, not yet started
  • active — Currently being executed
  • paused — Work temporarily stopped
  • completed — All tasks done, success criteria met

Getting Help

  • What is Atlas? → @plan/concept.md
  • End-to-end workflow? → @plan/workflow.md
  • Architecture decisions → @plan/architecture.md
  • Term unclear? → @plan/glossary.md
  • Which unit code is which? → @plan/registry.md
  • How are marts defined? → @plan/analytics.md
  • What's the DB schema? → @plan/model.md
  • Which port does X run on? → NETWORK.yml
  • Build/lint/test commands → AGENTS.md
