Architecture

System architecture for Atlas. All terminology follows glossary.md. All data entities follow model.md and analytics.md.

Overview

Atlas is a local-first Modern Data Stack for multi-entity business reporting. It normalizes data from operational sources using an ELT pipeline, transforms raw data into analytical marts via dbt, and serves outputs through both a static presentation layer (PPTX/PDF) and a real-time web dashboard.

Principles

Principle	Description
Modern Data Stack	Modular, best-of-breed components for each layer. Each layer has a single responsibility and a clean interface to the next.
ELT over ETL	Raw source data is loaded first, untransformed. All business logic lives in the transform layer (dbt). Source data is always reprocessable.
Two-layer database	DuckDB for analytical workloads (aggregations, marts, reporting). LibSQL/Turso for operational workloads (individual record lookup, dashboard drill-down).
Declarative transforms	All business logic for reporting is expressed as SQL models in dbt. No imperative calculation code in the pipeline.
Adapter pattern	The Sync layer uses source adapters to decouple extraction from loading. Swapping xlsx for Odoo requires only a new adapter.
Schema as code	The LibSQL operational schema is defined in TypeScript via Drizzle ORM. The DuckDB analytical schema is defined in dbt models. Both are version-controlled.
Entity scoping	All data is scoped to a `core_entity` record. IONs and EPN never share data in queries.
Report as data	The Format layer reads dbt marts and assembles a structured `report.json`. The Present layer is a pure renderer — it applies no business logic.
Python where it makes sense	dbt-core and any data tooling that benefits from Python uses Python, managed via `uv`. Everything else is TypeScript on Bun.

Technology Stack

Concern	Technology	Notes
Runtime	Bun	All TS scripts, packages, and services
Monorepo	Turborepo + Bun workspaces	Task pipeline for build, lint, test
Language	TypeScript (primary), Python (dbt + data tooling)	Strict TS throughout; Python managed via `uv`
Python tooling	uv	Fast Python package manager; manages dbt venv
Analytical DB	DuckDB	In-process OLAP; native xlsx reading; hosts raw, staging, intermediate, mart layers
Operational DB	LibSQL (local) → Turso (remote)	SQLite-compatible; individual record lookup; dashboard drill-down
Operational ORM	Drizzle ORM (LibSQL dialect)	Type-safe queries; schema-as-code; migrations via Drizzle Kit
Transform layer	dbt-core + dbt-duckdb	Declarative SQL models; staging, intermediate, and mart layers
Extract + Load	DuckDB native xlsx reading + SheetJS fallback	DuckDB reads xlsx directly for raw layer; SheetJS for edge cases
Odoo sync (future)	Odoo JSON-RPC via `fetch`	Same adapter interface as xlsx
Format	TypeScript + DuckDB node client	Reads marts, assembles report.json
Present — PPTX	pptxgenjs	Programmatic slide generation
Present — PDF	LibreOffice CLI	Converts PPTX to PDF via headless LibreOffice
Dashboard	TanStack Start	SSR + client, file-based routing
Dashboard UI	shadcn/ui + Tailwind CSS	Component library
Dashboard — aggregated	DuckDB node client via server functions	Reads marts directly
Dashboard — individual	Drizzle + LibSQL via server functions	Individual record lookup and drill-down
Code quality	ESLint + Prettier + TypeScript strict	Shared configs from `@repo/lint` and `@repo/typescript`

Repository Structure

ions/
├── @plan/                              # Planning docs (not a workspace package)
│   ├── architecture.md
│   ├── analytics.md                    # DuckDB, dbt layers, mart definitions
│   ├── glossary.md
│   ├── model.md
│   ├── registry.md
│   ├── domains/
│   └── discussions/
│
├── @repo/                              # Shared infra packages
│   ├── typescript/                     # Shared tsconfig presets
│   └── lint/                           # Shared ESLint flat config
│
├── @packages/                          # Business-logic packages
│   ├── db/                             # Drizzle schema + migrations (operational layer)
│   │   ├── @source/
│   │   │   ├── schema/
│   │   │   │   ├── core.ts             # core_group, core_entity, core_unit
│   │   │   │   ├── db.ts               # db_person, db_organization, db_account, db_relation
│   │   │   │   ├── catalog.ts          # catalog_brand, catalog_item, catalog_item_variant, pricing
│   │   │   │   ├── commerce.ts         # commerce_order + lookups
│   │   │   │   ├── finance.ts          # finance_transaction + lookups
│   │   │   │   ├── marketing.ts        # marketing_channel, marketing_agenda, marketing_activity
│   │   │   │   └── target.ts           # target_item, target_metric
│   │   │   ├── client.ts               # LibSQL client + Drizzle instance
│   │   │   └── index.ts
│   │   ├── drizzle.config.ts
│   │   ├── migrations/
│   │   └── package.json
│   │
│   ├── sync/                           # Extract + Load layer
│   │   ├── @source/
│   │   │   ├── adapters/
│   │   │   │   ├── base.ts             # Abstract SyncAdapter interface
│   │   │   │   └── xlsx.ts             # XlsxAdapter — loads raw rows into DuckDB
│   │   │   ├── duck.ts                 # DuckDB connection + raw layer setup
│   │   │   ├── seed.ts                 # Seeds LibSQL lookup tables from registry
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   ├── transform/                      # dbt project — analytical transform layer
│   │   ├── models/
│   │   │   ├── staging/                # stg_* — clean and rename raw columns
│   │   │   │   ├── stg_transactions.sql
│   │   │   │   ├── stg_students.sql
│   │   │   │   └── stg_organizations.sql
│   │   │   ├── intermediate/           # int_* — join, classify, resolve FKs
│   │   │   │   ├── int_orders.sql
│   │   │   │   └── int_enrollments.sql
│   │   │   └── marts/                  # mart_* — analysis-ready aggregations
│   │   │       ├── mart_revenue.sql
│   │   │       ├── mart_program_progress.sql
│   │   │       ├── mart_channel_marketing.sql
│   │   │       └── mart_school_progress.sql
│   │   ├── tests/
│   │   ├── dbt_project.yml
│   │   ├── profiles.yml
│   │   ├── pyproject.toml              # uv project file
│   │   └── requirements.txt            # dbt-core, dbt-duckdb
│   │
│   └── format/                         # Format layer — marts -> report.json
│       ├── @source/
│       │   ├── sections/
│       │   │   ├── revenue.ts          # Reads mart_revenue
│       │   │   ├── orders.ts           # Reads mart_program_progress
│       │   │   ├── marketing.ts        # Reads mart_channel_marketing
│       │   │   └── schools.ts          # Reads mart_school_progress
│       │   ├── report.ts               # Assembles full report.json
│       │   └── index.ts
│       └── package.json
│
├── @services/                          # Deployable apps
│   ├── present/                        # PPTX + PDF generator
│   │   ├── @source/
│   │   │   ├── slides/
│   │   │   │   ├── cover.ts
│   │   │   │   ├── revenue-comparison.ts
│   │   │   │   ├── key-comparison.ts
│   │   │   │   ├── program-progress.ts
│   │   │   │   ├── school-progress.ts
│   │   │   │   └── channel-marketing.ts
│   │   │   ├── generator.ts
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   └── dashboard/                      # TanStack Start web app
│       ├── src/
│       │   ├── routes/
│       │   │   ├── __root.tsx
│       │   │   ├── index.tsx
│       │   │   ├── _app.tsx
│       │   │   └── _app/
│       │   │       ├── revenue.tsx
│       │   │       ├── orders.tsx
│       │   │       ├── marketing.tsx
│       │   │       ├── schools.tsx
│       │   │       ├── students.$id.tsx
│       │   │       └── organizations.$id.tsx
│       │   ├── components/
│       │   └── styles.css
│       ├── app.config.ts
│       ├── vite.config.ts
│       ├── tsconfig.json
│       ├── components.json
│       └── package.json
│
├── source/                             # Raw xlsx files — untouched input data
├── output/                             # Generated reports — gitignored
│   ├── monthly/
│   └── weekly/
│
├── atlas.db                            # DuckDB file (analytical) — gitignored
├── atlas-ops.db                        # LibSQL file (operational) — gitignored
├── bun.lock
├── package.json
├── turbo.json
└── tsconfig.json

ELT Pipeline

Extract + Load (Sync)

Reads source data and loads it into the DuckDB raw layer with minimal processing. The raw layer preserves source column names (Indonesian) and original types. No business logic is applied at this stage.

DuckDB reads xlsx files natively — no adapter code needed for extraction:

CREATE OR REPLACE TABLE raw_transactions AS
SELECT * FROM read_xlsx(
  'source/LAPORAN EVALUASI IONs 2026 TM-WLC.xlsx',
  sheet := 'FEBRUARI'
);

The SyncAdapter interface is reserved for non-DuckDB sources (Odoo, APIs):

interface SyncAdapter {
  readonly source: string
  connect(): Promise<void>
  extractToRaw(db: DuckDBConnection, targetTable: string): Promise<void>
  disconnect(): Promise<void>
}

CLI usage:

bun run sync --source xlsx --path source/ --entity IONS --year 2026
bun run sync --source odoo --entity EPN  # future

Transform (dbt)

Runs dbt models against DuckDB to produce clean, analysis-ready marts. Full layer definitions and mart schemas are in analytics.md.

CLI usage:

cd @packages/transform

uv run dbt run            # run all models
uv run dbt run --select marts
uv run dbt test           # run all tests
uv run dbt docs generate  # generate lineage docs

Format

Reads dbt marts from DuckDB, applies period filtering and cross-mart assembly, and outputs report.json. Also returns data in-memory for direct consumption by the Present layer.

Output type:

type Report = {
  meta: {
    entity: string        // e.g. "IONS"
    period: string        // e.g. "2026-02"
    generated_at: string  // ISO timestamp
    scope: string[]       // unit codes included
  }
  units: Record<string, UnitReport>
}

CLI usage:

bun run format --entity IONS --period 2026-02
bun run format --entity IONS --period 2026-02 --unit TM,WLC_NON_ENGLISH

Present

Consumes report.json and produces PPTX + PDF. No business logic — pure rendering.

CLI usage:

bun run present --report output/monthly/2026-02-report.json
bun run present --entity IONS --period 2026-02  # runs Format then Present

Database Architecture

Two Layers

	DuckDB (`atlas.db`)	LibSQL (`atlas-ops.db`)
Role	Analytical	Operational
Contents	raw, staging, intermediate, mart layers	38 normalized tables from `model.md`
Schema defined in	dbt models (`.sql` files)	Drizzle ORM (TypeScript)
Query pattern	Aggregations, GROUP BY, window functions, PIVOT	Individual record lookup, FK joins
Used by	Format layer, dashboard aggregated views	Dashboard drill-down, individual record views
Remote option	MotherDuck (future)	Turso

Query routing in the dashboard

LibSQL environment config

# Local development
DATABASE_URL=file:atlas-ops.db

# Remote (Turso)
DATABASE_URL=libsql://<db-name>.turso.io
TURSO_AUTH_TOKEN=<token>

Migrations (LibSQL / Drizzle)

bunx drizzle-kit generate   # generate migration from schema changes
bunx drizzle-kit migrate    # apply to database
bunx drizzle-kit push       # push schema directly to dev DB

Dashboard

TanStack Start app with server functions that query both DuckDB (aggregated) and LibSQL (individual) depending on the request type. No separate API service.

Route to data source mapping:

Route	Section	Data source
`/`	Overview	DuckDB — all unit totals
`/revenue`	Revenue Comparison	DuckDB — `mart_revenue`
`/orders`	Key Comparison	DuckDB — `mart_program_progress`
`/programs`	Program Progress	DuckDB — `mart_program_progress`
`/schools`	School Progress	DuckDB — `mart_school_progress`
`/marketing`	Channel Marketing	DuckDB — `mart_channel_marketing`
`/students/:id`	Student profile	LibSQL — `db_person` + `commerce_order`
`/organizations/:id`	Organization detail	LibSQL — `db_organization` + `db_person`

Adapter Pattern

The SyncAdapter interface decouples source systems from the DuckDB raw layer. DuckDB's native read_xlsx handles xlsx without an adapter. Future sources (Odoo, APIs) implement the interface.

The column map in XlsxAdapter (and future adapters) is the only place in the codebase where Indonesian source terms appear. All other layers use English terms exclusively.

Scope and Multi-tenancy

All operational tables in LibSQL include entity_id. All dbt models accept an entity variable for filtering. Queries never mix data across entities.

// LibSQL — entity scoping at query layer
const orders = await db.select().from(commerceOrder)
  .where(eq(commerceOrder.entityId, entityId))

-- dbt — entity variable passed at runtime
SELECT * FROM stg_transactions
WHERE entity_code = '{{ var("entity") }}'

Deployment

Phase 1 — Local

# Seed LibSQL lookup tables
bun run sync:seed --entity IONS

# Load xlsx into DuckDB raw layer
bun run sync --source xlsx --path source/ --entity IONS

# Run dbt transforms
cd @packages/transform && uv run dbt run

# Generate monthly report
bun run format --entity IONS --period 2026-02

# Generate PPTX + PDF
bun run present --entity IONS --period 2026-02

# Start dashboard
bun run dev --filter @services/dashboard

Architecture

Architecture

Overview

Principles

Technology Stack

Repository Structure

ELT Pipeline

Extract + Load (Sync)

Transform (dbt)

Format

Present

Database Architecture

Two Layers

Query routing in the dashboard

LibSQL environment config

Migrations (LibSQL / Drizzle)

Dashboard

Adapter Pattern

Scope and Multi-tenancy

Deployment

Phase 1 — Local

Phase 2 — Remote DB

Phase 3 — Hosted Dashboard

Phase 4 — Odoo Sync