Architecture
System architecture for Atlas. All terminology follows glossary.md. All data entities follow model.md and analytics.md.
Overview
Atlas is a local-first Modern Data Stack for multi-entity business reporting. It normalizes data from operational sources using an ELT pipeline, transforms raw data into analytical marts via dbt, and serves outputs through both a static presentation layer (PPTX/PDF) and a real-time web dashboard.
Principles
| Principle | Description |
|---|---|
| Modern Data Stack | Modular, best-of-breed components for each layer. Each layer has a single responsibility and a clean interface to the next. |
| ELT over ETL | Raw source data is loaded first, untransformed. All business logic lives in the transform layer (dbt). Source data is always reprocessable. |
| Two-layer database | DuckDB for analytical workloads (aggregations, marts, reporting). LibSQL/Turso for operational workloads (individual record lookup, dashboard drill-down). |
| Declarative transforms | All business logic for reporting is expressed as SQL models in dbt. No imperative calculation code in the pipeline. |
| Adapter pattern | The Sync layer uses source adapters to decouple extraction from loading. Swapping xlsx for Odoo requires only a new adapter. |
| Schema as code | The LibSQL operational schema is defined in TypeScript via Drizzle ORM. The DuckDB analytical schema is defined in dbt models. Both are version-controlled. |
| Entity scoping | All data is scoped to a core_entity record. IONs and EPN never share data in queries. |
| Report as data | The Format layer reads dbt marts and assembles a structured report.json. The Present layer is a pure renderer — it applies no business logic. |
| Python where it makes sense | dbt-core and any data tooling that benefits from Python uses Python, managed via uv. Everything else is TypeScript on Bun. |
Technology Stack
| Concern | Technology | Notes |
|---|---|---|
| Runtime | Bun | All TS scripts, packages, and services |
| Monorepo | Turborepo + Bun workspaces | Task pipeline for build, lint, test |
| Language | TypeScript (primary), Python (dbt + data tooling) | Strict TS throughout; Python managed via uv |
| Python tooling | uv | Fast Python package manager; manages dbt venv |
| Analytical DB | DuckDB | In-process OLAP; native xlsx reading; hosts raw, staging, intermediate, mart layers |
| Operational DB | LibSQL (local) → Turso (remote) | SQLite-compatible; individual record lookup; dashboard drill-down |
| Operational ORM | Drizzle ORM (LibSQL dialect) | Type-safe queries; schema-as-code; migrations via Drizzle Kit |
| Transform layer | dbt-core + dbt-duckdb | Declarative SQL models; staging, intermediate, and mart layers |
| Extract + Load | DuckDB native xlsx reading + SheetJS fallback | DuckDB reads xlsx directly for raw layer; SheetJS for edge cases |
| Odoo sync (future) | Odoo JSON-RPC via fetch | Same adapter interface as xlsx |
| Format | TypeScript + DuckDB node client | Reads marts, assembles report.json |
| Present — PPTX | pptxgenjs | Programmatic slide generation |
| Present — PDF | LibreOffice CLI | Converts PPTX to PDF via headless LibreOffice |
| Dashboard | TanStack Start | SSR + client, file-based routing |
| Dashboard UI | shadcn/ui + Tailwind CSS | Component library |
| Dashboard — aggregated | DuckDB node client via server functions | Reads marts directly |
| Dashboard — individual | Drizzle + LibSQL via server functions | Individual record lookup and drill-down |
| Code quality | ESLint + Prettier + TypeScript strict | Shared configs from @repo/lint and @repo/typescript |
Repository Structure
ions/
├── @plan/ # Planning docs (not a workspace package)
│ ├── architecture.md
│ ├── analytics.md # DuckDB, dbt layers, mart definitions
│ ├── glossary.md
│ ├── model.md
│ ├── registry.md
│ ├── domains/
│ └── discussions/
│
├── @repo/ # Shared infra packages
│ ├── typescript/ # Shared tsconfig presets
│ └── lint/ # Shared ESLint flat config
│
├── @packages/ # Business-logic packages
│ ├── db/ # Drizzle schema + migrations (operational layer)
│ │ ├── @source/
│ │ │ ├── schema/
│ │ │ │ ├── core.ts # core_group, core_entity, core_unit
│ │ │ │ ├── db.ts # db_person, db_organization, db_account, db_relation
│ │ │ │ ├── catalog.ts # catalog_brand, catalog_item, catalog_item_variant, pricing
│ │ │ │ ├── commerce.ts # commerce_order + lookups
│ │ │ │ ├── finance.ts # finance_transaction + lookups
│ │ │ │ ├── marketing.ts # marketing_channel, marketing_agenda, marketing_activity
│ │ │ │ └── target.ts # target_item, target_metric
│ │ │ ├── client.ts # LibSQL client + Drizzle instance
│ │ │ └── index.ts
│ │ ├── drizzle.config.ts
│ │ ├── migrations/
│ │ └── package.json
│ │
│ ├── sync/ # Extract + Load layer
│ │ ├── @source/
│ │ │ ├── adapters/
│ │ │ │ ├── base.ts # Abstract SyncAdapter interface
│ │ │ │ └── xlsx.ts # XlsxAdapter — loads raw rows into DuckDB
│ │ │ ├── duck.ts # DuckDB connection + raw layer setup
│ │ │ ├── seed.ts # Seeds LibSQL lookup tables from registry
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ ├── transform/ # dbt project — analytical transform layer
│ │ ├── models/
│ │ │ ├── staging/ # stg_* — clean and rename raw columns
│ │ │ │ ├── stg_transactions.sql
│ │ │ │ ├── stg_students.sql
│ │ │ │ └── stg_organizations.sql
│ │ │ ├── intermediate/ # int_* — join, classify, resolve FKs
│ │ │ │ ├── int_orders.sql
│ │ │ │ └── int_enrollments.sql
│ │ │ └── marts/ # mart_* — analysis-ready aggregations
│ │ │ ├── mart_revenue.sql
│ │ │ ├── mart_program_progress.sql
│ │ │ ├── mart_channel_marketing.sql
│ │ │ └── mart_school_progress.sql
│ │ ├── tests/
│ │ ├── dbt_project.yml
│ │ ├── profiles.yml
│ │ ├── pyproject.toml # uv project file
│ │ └── requirements.txt # dbt-core, dbt-duckdb
│ │
│ └── format/ # Format layer — marts -> report.json
│ ├── @source/
│ │ ├── sections/
│ │ │ ├── revenue.ts # Reads mart_revenue
│ │ │ ├── orders.ts # Reads mart_program_progress
│ │ │ ├── marketing.ts # Reads mart_channel_marketing
│ │ │ └── schools.ts # Reads mart_school_progress
│ │ ├── report.ts # Assembles full report.json
│ │ └── index.ts
│ └── package.json
│
├── @services/ # Deployable apps
│ ├── present/ # PPTX + PDF generator
│ │ ├── @source/
│ │ │ ├── slides/
│ │ │ │ ├── cover.ts
│ │ │ │ ├── revenue-comparison.ts
│ │ │ │ ├── key-comparison.ts
│ │ │ │ ├── program-progress.ts
│ │ │ │ ├── school-progress.ts
│ │ │ │ └── channel-marketing.ts
│ │ │ ├── generator.ts
│ │ │ └── index.ts
│ │ └── package.json
│ │
│ └── dashboard/ # TanStack Start web app
│ ├── src/
│ │ ├── routes/
│ │ │ ├── __root.tsx
│ │ │ ├── index.tsx
│ │ │ ├── _app.tsx
│ │ │ └── _app/
│ │ │ ├── revenue.tsx
│ │ │ ├── orders.tsx
│ │ │ ├── marketing.tsx
│ │ │ ├── schools.tsx
│ │ │ ├── students.$id.tsx
│ │ │ └── organizations.$id.tsx
│ │ ├── components/
│ │ └── styles.css
│ ├── app.config.ts
│ ├── vite.config.ts
│ ├── tsconfig.json
│ ├── components.json
│ └── package.json
│
├── source/ # Raw xlsx files — untouched input data
├── output/ # Generated reports — gitignored
│ ├── monthly/
│ └── weekly/
│
├── atlas.db # DuckDB file (analytical) — gitignored
├── atlas-ops.db # LibSQL file (operational) — gitignored
├── bun.lock
├── package.json
├── turbo.json
└── tsconfig.json
ELT Pipeline
Extract + Load (Sync)
Reads source data and loads it into the DuckDB raw layer with minimal processing. The raw layer preserves source column names (Indonesian) and original types. No business logic is applied at this stage.
DuckDB reads xlsx files natively — no adapter code needed for extraction:
CREATE OR REPLACE TABLE raw_transactions AS
SELECT * FROM read_xlsx(
'source/LAPORAN EVALUASI IONs 2026 TM-WLC.xlsx',
sheet := 'FEBRUARI'
);
The SyncAdapter interface is reserved for non-DuckDB sources (Odoo, APIs):
interface SyncAdapter {
readonly source: string
connect(): Promise<void>
extractToRaw(db: DuckDBConnection, targetTable: string): Promise<void>
disconnect(): Promise<void>
}
CLI usage:
bun run sync --source xlsx --path source/ --entity IONS --year 2026
bun run sync --source odoo --entity EPN   # future
Transform (dbt)
Runs dbt models against DuckDB to produce clean, analysis-ready marts. Full layer definitions and mart schemas are in analytics.md.
CLI usage:
cd @packages/transform
uv run dbt run # run all models
uv run dbt run --select marts
uv run dbt test # run all tests
uv run dbt docs generate # generate lineage docs
Format
Reads dbt marts from DuckDB, applies period filtering and cross-mart assembly, and outputs report.json. Also returns data in-memory for direct consumption by the Present layer.
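As an illustration, the assembly step is a pure mapping from already-aggregated mart rows to the report shape. The helper below is a sketch only; the UnitReport fields and function names are hypothetical, not the actual Format implementation:

```typescript
// Hypothetical stand-in for a per-unit section of report.json.
type UnitReport = { revenue: number; orders: number }

// Builds the meta block from entity, period, and the units included.
function buildReportMeta(entity: string, period: string, scope: string[]) {
  return {
    entity,
    period,
    generated_at: new Date().toISOString(),
    scope,
  }
}

// Assembles the full report object from per-unit section data.
function buildReport(
  entity: string,
  period: string,
  units: Record<string, UnitReport>,
) {
  return {
    meta: buildReportMeta(entity, period, Object.keys(units)),
    units,
  }
}
```

Because assembly is pure, the same object can be written to report.json or handed in-memory to the Present layer.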
Output type:
type Report = {
meta: {
entity: string // e.g. "IONS"
period: string // e.g. "2026-02"
generated_at: string // ISO timestamp
scope: string[] // unit codes included
}
units: Record<string, UnitReport>
}
CLI usage:
bun run format --entity IONS --period 2026-02
bun run format --entity IONS --period 2026-02 --unit TM,WLC_NON_ENGLISH
Present
Consumes report.json and produces PPTX + PDF. No business logic — pure rendering.
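The "pure rendering" contract can be sketched as a mapping from a report section to a slide spec that a pptxgenjs slide module would then draw. Everything here is illustrative; the SlideSpec type, row fields, and title format are assumptions, not the actual slide code:

```typescript
// Hypothetical intermediate shape consumed by a pptxgenjs slide module.
type SlideSpec = { title: string; rows: string[][] }

// Pure mapping: a revenue section from report.json becomes table rows.
// No calculation happens here; the numbers arrive pre-aggregated.
function revenueSlide(
  unit: string,
  rows: { month: string; revenue: number }[],
): SlideSpec {
  return {
    title: `Revenue Comparison: ${unit}`,
    rows: rows.map((r) => [r.month, String(r.revenue)]),
  }
}
```

Keeping slide modules as pure data-to-spec mappings is what lets the same report.json drive both PPTX and, via LibreOffice, PDF output.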
CLI usage:
bun run present --report output/monthly/2026-02-report.json
bun run present --entity IONS --period 2026-02   # runs Format then Present
Database Architecture
Two Layers
| | DuckDB (atlas.db) | LibSQL (atlas-ops.db) |
|---|---|---|
| Role | Analytical | Operational |
| Contents | raw, staging, intermediate, mart layers | 38 normalized tables from model.md |
| Schema defined in | dbt models (.sql files) | Drizzle ORM (TypeScript) |
| Query pattern | Aggregations, GROUP BY, window functions, PIVOT | Individual record lookup, FK joins |
| Used by | Format layer, dashboard aggregated views | Dashboard drill-down, individual record views |
| Remote option | MotherDuck (future) | Turso |
Query routing in the dashboard follows the route-to-data-source mapping in the Dashboard section below.
LibSQL environment config
# Local development
DATABASE_URL=file:atlas-ops.db
# Remote (Turso)
DATABASE_URL=libsql://<db-name>.turso.io
TURSO_AUTH_TOKEN=<token>
Migrations (LibSQL / Drizzle)
bunx drizzle-kit generate # generate migration from schema changes
bunx drizzle-kit migrate # apply to database
bunx drizzle-kit push      # push schema directly to dev DB
Dashboard
TanStack Start app with server functions that query both DuckDB (aggregated) and LibSQL (individual) depending on the request type. No separate API service.
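The routing rule reduces to: record-level drill-down routes (those carrying an id param) hit LibSQL, everything else reads DuckDB marts. A hypothetical helper encoding that rule, for illustration only:

```typescript
type DataSource = "duckdb" | "libsql"

// Illustrative sketch: routes like /students/:id or /organizations/:id
// are individual lookups (LibSQL); aggregated views read DuckDB marts.
function dataSourceFor(route: string): DataSource {
  return /\/:\w+/.test(route) ? "libsql" : "duckdb"
}
```

In practice each server function simply imports the right client directly; this helper just makes the rule explicit.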
Route to data source mapping:
| Route | Section | Data source |
|---|---|---|
| / | Overview | DuckDB — all unit totals |
| /revenue | Revenue Comparison | DuckDB — mart_revenue |
| /orders | Key Comparison | DuckDB — mart_program_progress |
| /programs | Program Progress | DuckDB — mart_program_progress |
| /schools | School Progress | DuckDB — mart_school_progress |
| /marketing | Channel Marketing | DuckDB — mart_channel_marketing |
| /students/:id | Student profile | LibSQL — db_person + commerce_order |
| /organizations/:id | Organization detail | LibSQL — db_organization + db_person |
Adapter Pattern
The SyncAdapter interface decouples source systems from the DuckDB raw layer. DuckDB's native read_xlsx handles xlsx without an adapter. Future sources (Odoo, APIs) implement the interface.
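A future OdooAdapter might look like the following skeleton. This is a sketch under stated assumptions: the DuckDBConnection stub, the JSON-RPC details, and the raw-table layout are all hypothetical, and the SyncAdapter interface is repeated here only so the sketch is self-contained:

```typescript
// Minimal structural stub standing in for the real DuckDB node client type.
interface DuckDBConnection {
  run(sql: string): Promise<void>
}

interface SyncAdapter {
  readonly source: string
  connect(): Promise<void>
  extractToRaw(db: DuckDBConnection, targetTable: string): Promise<void>
  disconnect(): Promise<void>
}

// Hypothetical skeleton — endpoint, auth, and payload shape are assumptions.
class OdooAdapter implements SyncAdapter {
  readonly source = "odoo"

  constructor(private baseUrl: string) {}

  async connect(): Promise<void> {
    // Would authenticate against `${this.baseUrl}` via Odoo JSON-RPC here.
  }

  async extractToRaw(db: DuckDBConnection, targetTable: string): Promise<void> {
    // Would fetch records via JSON-RPC, then bulk-insert into the raw layer.
    await db.run(`CREATE TABLE IF NOT EXISTS ${targetTable} (payload JSON)`)
  }

  async disconnect(): Promise<void> {}
}
```

Because the adapter only writes to the raw layer, dbt models downstream need no changes when a new source is added.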
The column map in XlsxAdapter (and future adapters) is the only place in the codebase where Indonesian source terms appear. All other layers use English terms exclusively.
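As a sketch, the column map is a plain record from source header to English column name, applied once per raw row. The Indonesian headers below are hypothetical examples for illustration, not the actual sheet headers:

```typescript
// Hypothetical source headers — the real map lives in XlsxAdapter.
const COLUMN_MAP: Record<string, string> = {
  TANGGAL: "transaction_date",
  JUMLAH: "amount",
  KETERANGAN: "description",
}

// Rename a raw row's keys using the map; unmapped keys pass through unchanged.
function renameColumns(
  row: Record<string, unknown>,
  map: Record<string, string>,
): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(row).map(([key, value]) => [map[key] ?? key, value]),
  )
}
```

Concentrating the mapping in one object keeps the Indonesian-to-English boundary auditable in a single file.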
Scope and Multi-tenancy
All operational tables in LibSQL include entity_id. All dbt models accept an entity variable for filtering. Queries never mix data across entities.
// LibSQL — entity scoping at query layer
const orders = await db.select().from(commerceOrder)
.where(eq(commerceOrder.entityId, entityId))
-- dbt — entity variable passed at runtime
SELECT * FROM stg_transactions
WHERE entity_code = '{{ var("entity") }}'
Deployment
Phase 1 — Local
# Seed LibSQL lookup tables
bun run sync:seed --entity IONS
# Load xlsx into DuckDB raw layer
bun run sync --source xlsx --path source/ --entity IONS
# Run dbt transforms
cd @packages/transform && uv run dbt run
# Generate monthly report
bun run format --entity IONS --period 2026-02
# Generate PPTX + PDF
bun run present --entity IONS --period 2026-02
# Start dashboard
bun run dev --filter @services/dashboard
Phase 2 — Remote DB
Replace DATABASE_URL with a Turso connection string for LibSQL. Replace DuckDB local file with MotherDuck for the analytical layer (optional).
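Switching layers can key off the DATABASE_URL scheme, since @libsql/client accepts both file: and libsql: URLs. The helper below is a hypothetical sketch of that check, not existing project code:

```typescript
// Hypothetical: decide whether the configured LibSQL URL points at Turso.
// file:atlas-ops.db        -> local
// libsql://<db>.turso.io   -> remote (needs TURSO_AUTH_TOKEN)
function isRemoteLibsql(url: string): boolean {
  return url.startsWith("libsql://") || url.startsWith("https://")
}
```

A client factory could use this to require TURSO_AUTH_TOKEN only in the remote case.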
Phase 3 — Hosted Dashboard
Deploy @services/dashboard to Cloudflare Pages, Fly.io, or similar. Dashboard connects to Turso and MotherDuck remotely.
Phase 4 — Odoo Sync
Implement OdooAdapter. Raw Odoo data loads into DuckDB raw layer via the same adapter interface. dbt models are unchanged — they read from the raw layer regardless of source.