Atlas Plan

Architecture

Architecture

System architecture for Atlas. All terminology follows glossary.md. All data entities follow model.md and analytics.md.


Overview

Atlas is a local-first Modern Data Stack for multi-entity business reporting. It normalizes data from operational sources using an ELT pipeline, transforms raw data into analytical marts via dbt, and serves outputs through both a static presentation layer (PPTX/PDF) and a real-time web dashboard.


Principles

PrincipleDescription
Modern Data StackModular, best-of-breed components for each layer. Each layer has a single responsibility and a clean interface to the next.
ELT over ETLRaw source data is loaded first, untransformed. All business logic lives in the transform layer (dbt). Source data is always reprocessable.
Two-layer databaseDuckDB for analytical workloads (aggregations, marts, reporting). LibSQL/Turso for operational workloads (individual record lookup, dashboard drill-down).
Declarative transformsAll business logic for reporting is expressed as SQL models in dbt. No imperative calculation code in the pipeline.
Adapter patternThe Sync layer uses source adapters to decouple extraction from loading. Swapping xlsx for Odoo requires only a new adapter.
Schema as codeThe LibSQL operational schema is defined in TypeScript via Drizzle ORM. The DuckDB analytical schema is defined in dbt models. Both are version-controlled.
Entity scopingAll data is scoped to a core_entity record. IONs and EPN never share data in queries.
Report as dataThe Format layer reads dbt marts and assembles a structured report.json. The Present layer is a pure renderer — it applies no business logic.
Python where it makes sensedbt-core and any data tooling that benefits from Python uses Python, managed via uv. Everything else is TypeScript on Bun.

Technology Stack

ConcernTechnologyNotes
RuntimeBunAll TS scripts, packages, and services
MonorepoTurborepo + Bun workspacesTask pipeline for build, lint, test
LanguageTypeScript (primary), Python (dbt + data tooling)Strict TS throughout; Python managed via uv
Python toolinguvFast Python package manager; manages dbt venv
Analytical DBDuckDBIn-process OLAP; native xlsx reading; hosts raw, staging, intermediate, mart layers
Operational DBLibSQL (local) → Turso (remote)SQLite-compatible; individual record lookup; dashboard drill-down
Operational ORMDrizzle ORM (LibSQL dialect)Type-safe queries; schema-as-code; migrations via Drizzle Kit
Transform layerdbt-core + dbt-duckdbDeclarative SQL models; staging, intermediate, and mart layers
Extract + LoadDuckDB native xlsx reading + SheetJS fallbackDuckDB reads xlsx directly for raw layer; SheetJS for edge cases
Odoo sync (future)Odoo JSON-RPC via fetchSame adapter interface as xlsx
FormatTypeScript + DuckDB node clientReads marts, assembles report.json
Present — PPTXpptxgenjsProgrammatic slide generation
Present — PDFLibreOffice CLIConverts PPTX to PDF via headless LibreOffice
DashboardTanStack StartSSR + client, file-based routing
Dashboard UIshadcn/ui + Tailwind CSSComponent library
Dashboard — aggregatedDuckDB node client via server functionsReads marts directly
Dashboard — individualDrizzle + LibSQL via server functionsIndividual record lookup and drill-down
Code qualityESLint + Prettier + TypeScript strictShared configs from @repo/lint and @repo/typescript

Repository Structure

ions/
├── @plan/                              # Planning docs (not a workspace package)
│   ├── architecture.md
│   ├── analytics.md                    # DuckDB, dbt layers, mart definitions
│   ├── glossary.md
│   ├── model.md
│   ├── registry.md
│   ├── domains/
│   └── discussions/

├── @repo/                              # Shared infra packages
│   ├── typescript/                     # Shared tsconfig presets
│   └── lint/                           # Shared ESLint flat config

├── @packages/                          # Business-logic packages
│   ├── db/                             # Drizzle schema + migrations (operational layer)
│   │   ├── @source/
│   │   │   ├── schema/
│   │   │   │   ├── core.ts             # core_group, core_entity, core_unit
│   │   │   │   ├── db.ts               # db_person, db_organization, db_account, db_relation
│   │   │   │   ├── catalog.ts          # catalog_brand, catalog_item, catalog_item_variant, pricing
│   │   │   │   ├── commerce.ts         # commerce_order + lookups
│   │   │   │   ├── finance.ts          # finance_transaction + lookups
│   │   │   │   ├── marketing.ts        # marketing_channel, marketing_agenda, marketing_activity
│   │   │   │   └── target.ts           # target_item, target_metric
│   │   │   ├── client.ts               # LibSQL client + Drizzle instance
│   │   │   └── index.ts
│   │   ├── drizzle.config.ts
│   │   ├── migrations/
│   │   └── package.json
│   │
│   ├── sync/                           # Extract + Load layer
│   │   ├── @source/
│   │   │   ├── adapters/
│   │   │   │   ├── base.ts             # Abstract SyncAdapter interface
│   │   │   │   └── xlsx.ts             # XlsxAdapter — loads raw rows into DuckDB
│   │   │   ├── duck.ts                 # DuckDB connection + raw layer setup
│   │   │   ├── seed.ts                 # Seeds LibSQL lookup tables from registry
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   ├── transform/                      # dbt project — analytical transform layer
│   │   ├── models/
│   │   │   ├── staging/                # stg_* — clean and rename raw columns
│   │   │   │   ├── stg_transactions.sql
│   │   │   │   ├── stg_students.sql
│   │   │   │   └── stg_organizations.sql
│   │   │   ├── intermediate/           # int_* — join, classify, resolve FKs
│   │   │   │   ├── int_orders.sql
│   │   │   │   └── int_enrollments.sql
│   │   │   └── marts/                  # mart_* — analysis-ready aggregations
│   │   │       ├── mart_revenue.sql
│   │   │       ├── mart_program_progress.sql
│   │   │       ├── mart_channel_marketing.sql
│   │   │       └── mart_school_progress.sql
│   │   ├── tests/
│   │   ├── dbt_project.yml
│   │   ├── profiles.yml
│   │   ├── pyproject.toml              # uv project file
│   │   └── requirements.txt            # dbt-core, dbt-duckdb
│   │
│   └── format/                         # Format layer — marts -> report.json
│       ├── @source/
│       │   ├── sections/
│       │   │   ├── revenue.ts          # Reads mart_revenue
│       │   │   ├── orders.ts           # Reads mart_program_progress
│       │   │   ├── marketing.ts        # Reads mart_channel_marketing
│       │   │   └── schools.ts          # Reads mart_school_progress
│       │   ├── report.ts               # Assembles full report.json
│       │   └── index.ts
│       └── package.json

├── @services/                          # Deployable apps
│   ├── present/                        # PPTX + PDF generator
│   │   ├── @source/
│   │   │   ├── slides/
│   │   │   │   ├── cover.ts
│   │   │   │   ├── revenue-comparison.ts
│   │   │   │   ├── key-comparison.ts
│   │   │   │   ├── program-progress.ts
│   │   │   │   ├── school-progress.ts
│   │   │   │   └── channel-marketing.ts
│   │   │   ├── generator.ts
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   └── dashboard/                      # TanStack Start web app
│       ├── src/
│       │   ├── routes/
│       │   │   ├── __root.tsx
│       │   │   ├── index.tsx
│       │   │   ├── _app.tsx
│       │   │   └── _app/
│       │   │       ├── revenue.tsx
│       │   │       ├── orders.tsx
│       │   │       ├── marketing.tsx
│       │   │       ├── schools.tsx
│       │   │       ├── students.$id.tsx
│       │   │       └── organizations.$id.tsx
│       │   ├── components/
│       │   └── styles.css
│       ├── app.config.ts
│       ├── vite.config.ts
│       ├── tsconfig.json
│       ├── components.json
│       └── package.json

├── source/                             # Raw xlsx files — untouched input data
├── output/                             # Generated reports — gitignored
│   ├── monthly/
│   └── weekly/

├── atlas.db                            # DuckDB file (analytical) — gitignored
├── atlas-ops.db                        # LibSQL file (operational) — gitignored
├── bun.lock
├── package.json
├── turbo.json
└── tsconfig.json

ELT Pipeline

Extract + Load (Sync)

Reads source data and loads it into the DuckDB raw layer with minimal processing. The raw layer preserves source column names (Indonesian) and original types. No business logic is applied at this stage.

DuckDB reads xlsx files natively — no adapter code needed for extraction:

CREATE OR REPLACE TABLE raw_transactions AS
SELECT * FROM read_xlsx(
  'source/LAPORAN EVALUASI IONs 2026 TM-WLC.xlsx',
  sheet := 'FEBRUARI'
);

The SyncAdapter interface is reserved for non-DuckDB sources (Odoo, APIs):

interface SyncAdapter {
  readonly source: string
  connect(): Promise<void>
  extractToRaw(db: DuckDBConnection, targetTable: string): Promise<void>
  disconnect(): Promise<void>
}

CLI usage:

bun run sync --source xlsx --path source/ --entity IONS --year 2026
bun run sync --source odoo --entity EPN  # future

Transform (dbt)

Runs dbt models against DuckDB to produce clean, analysis-ready marts. Full layer definitions and mart schemas are in analytics.md.

CLI usage:

cd @packages/transform

uv run dbt run            # run all models
uv run dbt run --select marts
uv run dbt test           # run all tests
uv run dbt docs generate  # generate lineage docs

Format

Reads dbt marts from DuckDB, applies period filtering and cross-mart assembly, and outputs report.json. Also returns data in-memory for direct consumption by the Present layer.

Output type:

type Report = {
  meta: {
    entity: string        // e.g. "IONS"
    period: string        // e.g. "2026-02"
    generated_at: string  // ISO timestamp
    scope: string[]       // unit codes included
  }
  units: Record<string, UnitReport>
}

CLI usage:

bun run format --entity IONS --period 2026-02
bun run format --entity IONS --period 2026-02 --unit TM,WLC_NON_ENGLISH

Present

Consumes report.json and produces PPTX + PDF. No business logic — pure rendering.

CLI usage:

bun run present --report output/monthly/2026-02-report.json
bun run present --entity IONS --period 2026-02  # runs Format then Present

Database Architecture

Two Layers

DuckDB (atlas.db)LibSQL (atlas-ops.db)
RoleAnalyticalOperational
Contentsraw, staging, intermediate, mart layers38 normalized tables from model.md
Schema defined indbt models (.sql files)Drizzle ORM (TypeScript)
Query patternAggregations, GROUP BY, window functions, PIVOTIndividual record lookup, FK joins
Used byFormat layer, dashboard aggregated viewsDashboard drill-down, individual record views
Remote optionMotherDuck (future)Turso

Query routing in the dashboard

LibSQL environment config

# Local development
DATABASE_URL=file:atlas-ops.db

# Remote (Turso)
DATABASE_URL=libsql://<db-name>.turso.io
TURSO_AUTH_TOKEN=<token>

Migrations (LibSQL / Drizzle)

bunx drizzle-kit generate   # generate migration from schema changes
bunx drizzle-kit migrate    # apply to database
bunx drizzle-kit push       # push schema directly to dev DB

Dashboard

TanStack Start app with server functions that query both DuckDB (aggregated) and LibSQL (individual) depending on the request type. No separate API service.

Route to data source mapping:

RouteSectionData source
/OverviewDuckDB — all unit totals
/revenueRevenue ComparisonDuckDB — mart_revenue
/ordersKey ComparisonDuckDB — mart_program_progress
/programsProgram ProgressDuckDB — mart_program_progress
/schoolsSchool ProgressDuckDB — mart_school_progress
/marketingChannel MarketingDuckDB — mart_channel_marketing
/students/:idStudent profileLibSQL — db_person + commerce_order
/organizations/:idOrganization detailLibSQL — db_organization + db_person

Adapter Pattern

The SyncAdapter interface decouples source systems from the DuckDB raw layer. DuckDB's native read_xlsx handles xlsx without an adapter. Future sources (Odoo, APIs) implement the interface.

The column map in XlsxAdapter (and future adapters) is the only place in the codebase where Indonesian source terms appear. All other layers use English terms exclusively.


Scope and Multi-tenancy

All operational tables in LibSQL include entity_id. All dbt models accept an entity variable for filtering. Queries never mix data across entities.

// LibSQL — entity scoping at query layer
const orders = await db.select().from(commerceOrder)
  .where(eq(commerceOrder.entityId, entityId))
-- dbt — entity variable passed at runtime
SELECT * FROM stg_transactions
WHERE entity_code = '{{ var("entity") }}'

Deployment

Phase 1 — Local

# Seed LibSQL lookup tables
bun run sync:seed --entity IONS

# Load xlsx into DuckDB raw layer
bun run sync --source xlsx --path source/ --entity IONS

# Run dbt transforms
cd @packages/transform && uv run dbt run

# Generate monthly report
bun run format --entity IONS --period 2026-02

# Generate PPTX + PDF
bun run present --entity IONS --period 2026-02

# Start dashboard
bun run dev --filter @services/dashboard

Phase 2 — Remote DB

Replace DATABASE_URL with a Turso connection string for LibSQL. Replace DuckDB local file with MotherDuck for the analytical layer (optional).

Phase 3 — Hosted Dashboard

Deploy @services/dashboard to Cloudflare Pages, Fly.io, or similar. Dashboard connects to Turso and MotherDuck remotely.

Phase 4 — Odoo Sync

Implement OdooAdapter. Raw Odoo data loads into DuckDB raw layer via the same adapter interface. dbt models are unchanged — they read from the raw layer regardless of source.