Atlas Plan

Atlas Concept

Status: Draft
Last Updated: 2026-02-20


Vision

Atlas is a local-first Modern Data Stack for multi-entity business reporting. It normalizes operational data from source files, transforms it through a declarative analytics pipeline, and delivers outputs as presentation slides, PDFs, and a live web dashboard — on any cadence: daily, weekly, or monthly.

Core belief: Business reporting should be generated from data, not assembled by hand. Every number in a slide should be traceable to a source record.


The Problem

Manual Reporting Is Fragile

Most small and mid-size businesses run their reporting from spreadsheets. Each reporting cycle — whether daily, weekly, or monthly — involves the same sequence: open the source file, copy numbers into a template, reformat the layout, check the totals, and export to PDF. When a number changes, the process starts over.

This works until it doesn't. Common failure points:

  • Formula drift. Spreadsheet formulas accumulate over time. A column gets added, a sheet gets renamed, a reference breaks silently. The wrong number appears in the report and no one catches it until the meeting.

  • No history. The current file reflects the current state. Last period's numbers exist only in the last period's export — if it was saved. Year-over-year comparison requires opening multiple files and manually aligning columns.

  • Schema drift. Source files evolve. A new column appears, a field is split, a sheet gets renamed. Downstream formulas break. Each year's data is subtly different from the last, and there is no systematic way to handle the difference.

  • No single source of truth. Revenue figures in one sheet may not match the summary in another. Enrollment counts may not reconcile with the transaction ledger. Discrepancies are discovered in the meeting, not before it.

  • Reporting cadence is a bottleneck. Every report is assembled by hand, so manual effort grows in proportion to frequency. Moving from monthly to weekly reporting means roughly four times the work, not the same work done more often.

The Reporting Pipeline Is Implicit

The knowledge of how to produce the report lives in people, not systems. When the person who knows how to run the report is unavailable, the report does not get produced. When that person leaves, the process leaves with them. There is no way to audit, reproduce, or improve a process that exists only as institutional memory.


The Solution

Atlas replaces the manual reporting pipeline with a declarative, automated one. Source data is loaded once; the report is generated on demand, at any cadence.

Load Once, Query Many Times

Source files (xlsx, or future API sources) are loaded into a structured analytical database exactly once per reporting cycle. The raw data is preserved unchanged. All transformation and aggregation happens in the database — not in the spreadsheet.

This means the same source data can power a monthly summary, a weekly progress check, or a daily operational view without any manual re-extraction.

Schema Differences Are Handled Systematically

Source files change from year to year — columns are renamed, new fields are added, sheet structures evolve. Atlas handles this through per-year configuration files that map source column names to consistent internal names. Adding a new year's data requires adding one configuration file, not changing any code.
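
As an illustration, a per-year mapping could be as small as a dictionary from source column name to internal name. The sketch below is hypothetical: the column names, the `COLUMN_MAPS` structure, and the `normalize_row` helper are assumptions made for the example, not the actual Atlas configuration format.

```python
# Hypothetical per-year mappings: source column name -> internal name.
# In practice each mapping would live in its own configuration file.
COLUMN_MAPS = {
    2024: {"Cust Name": "customer_name", "Amt": "amount"},
    2025: {"Customer": "customer_name", "Amount (USD)": "amount"},
}


def normalize_row(year: int, row: dict) -> dict:
    """Rename the keys of one source row using that year's mapping.

    Columns missing from the mapping raise an error, so schema drift
    is surfaced at load time instead of propagating silently.
    """
    mapping = COLUMN_MAPS[year]
    unknown = set(row) - set(mapping)
    if unknown:
        raise KeyError(f"unmapped columns for {year}: {sorted(unknown)}")
    return {mapping[name]: value for name, value in row.items()}


row_2024 = normalize_row(2024, {"Cust Name": "Acme", "Amt": 120.0})
row_2025 = normalize_row(2025, {"Customer": "Acme", "Amount (USD)": 150.0})
```

Both rows end up with the same keys even though the source headers differ. Supporting another year means adding one more entry to the mapping, with no change to `normalize_row`.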

Business Logic Lives in the Transform Layer

All calculations — revenue totals, period-over-period comparisons, customer classification, channel contribution rates — are expressed as SQL models in a declarative transform tool (dbt). They run in the database, are version-controlled, and produce the same result every time. There is no ambiguity about how a number was calculated.
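
To make the idea concrete, here is a minimal sketch of a period-over-period revenue model. It uses Python's built-in `sqlite3` purely as a stand-in for the analytical database and for dbt. The table `stg_orders`, its columns, and the sample rows are hypothetical, and a real dbt model would take the periods as variables rather than query parameters.

```python
import sqlite3

# Stand-in for the analytical database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (unit TEXT, period TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?, ?)",
    [
        ("Unit 1", "2026-01", 100.0),
        ("Unit 1", "2026-01", 50.0),
        ("Unit 1", "2026-02", 180.0),
    ],
)

# The "model": a named SELECT that fully defines how the number is derived.
MART_REVENUE_SQL = """
WITH revenue AS (
    SELECT unit, period, SUM(amount) AS revenue
    FROM stg_orders
    GROUP BY unit, period
)
SELECT cur.unit,
       cur.period,
       cur.revenue,
       cur.revenue - prev.revenue AS change_vs_previous
FROM revenue AS cur
LEFT JOIN revenue AS prev
       ON prev.unit = cur.unit
      AND prev.period = ?
WHERE cur.period = ?
"""

# Parameters are bound in order: previous period, then current period.
rows = conn.execute(MART_REVENUE_SQL, ("2026-01", "2026-02")).fetchall()
```

Because the whole calculation is one readable query, a reviewer can see that the change is the current total minus the previous total for the same unit.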

Reports Are Generated, Not Assembled

The output layer (slides, PDF, dashboard) reads from pre-computed analytical tables. It applies no business logic. A slide showing a revenue figure is generated from a database row, not typed by hand. Changing the period produces a new report in seconds.

Any Cadence, Same Pipeline

The same pipeline that produces a monthly board report can produce a weekly progress check or a daily enrollment summary. The cadence is a parameter, not a structural change. Once new source data is loaded, increasing the reporting frequency adds no manual assembly work.
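
One way to read "cadence is a parameter" is that the pipeline only needs a function from cadence and date to a reporting window. The sketch below assumes weeks start on Monday and months follow the calendar. The function name `period_window` and the cadence strings are hypothetical.

```python
from datetime import date, timedelta


def period_window(cadence: str, anchor: date) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of the period containing anchor."""
    if cadence == "daily":
        return anchor, anchor
    if cadence == "weekly":
        # weekday() is 0 for Monday, so this steps back to the week's Monday.
        start = anchor - timedelta(days=anchor.weekday())
        return start, start + timedelta(days=6)
    if cadence == "monthly":
        start = anchor.replace(day=1)
        # Day 28 plus 4 days always lands in the next month.
        next_month = (start.replace(day=28) + timedelta(days=4)).replace(day=1)
        return start, next_month - timedelta(days=1)
    raise ValueError(f"unknown cadence: {cadence}")


weekly = period_window("weekly", date(2026, 2, 20))
monthly = period_window("monthly", date(2026, 2, 20))
```

Everything downstream would filter on the returned window, so switching from monthly to weekly changes one argument rather than the pipeline.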


Core Principles

1. ELT over ETL. Raw source data is loaded first, untransformed. All business logic lives in the transform layer. Source data is always reprocessable — if a calculation is wrong, fix the SQL model and re-run. You never need to re-extract from the source.

2. Declarative transforms. Every aggregation, classification, and join is expressed as a named SQL model. There are no implicit calculations hiding in application code. Anyone with SQL knowledge can read the transform layer and understand exactly how every number is derived.

3. Two-layer database. Analytical queries (aggregations, comparisons, pivots) run against an in-process columnar database. Individual record lookups (person profile, organization history) run against a relational database. Each database is optimized for its workload.

4. Report as data. The Format layer produces a structured JSON document. The Present layer is a pure renderer — it applies no business logic. The same report document can be rendered as editable slides, PDF, or a web dashboard without recomputing anything.

5. Adapter pattern. The Sync layer uses source adapters to decouple extraction from loading. The current implementation reads xlsx files natively. Future sources (ERP systems, APIs) implement the same interface. The transform layer is unaffected by source changes.

6. Entity scoping. All data is scoped to a legal business entity. Different entities never share data in queries, even when they operate under the same group and use the same pipeline and source file structure.
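
The adapter pattern in principle 5 can be sketched as a small interface that the loader depends on. This is an assumed shape rather than the actual Atlas interface: `SourceAdapter`, `InMemoryAdapter`, and `load_raw` are hypothetical names, and the in-memory adapter stands in for an xlsx or API reader.

```python
from typing import Iterator, Protocol


class SourceAdapter(Protocol):
    """Hypothetical interface every source would implement."""

    def read_rows(self) -> Iterator[dict]:
        ...


class InMemoryAdapter:
    """Stand-in for an xlsx or API adapter; yields rows from a list."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def read_rows(self) -> Iterator[dict]:
        yield from self._rows


def load_raw(adapter: SourceAdapter, raw_table: list[dict]) -> int:
    """Append rows to the raw layer unchanged and return the row count."""
    count = 0
    for row in adapter.read_rows():
        raw_table.append(dict(row))  # copy, but keep source column names as-is
        count += 1
    return count


raw_orders: list[dict] = []
loaded = load_raw(InMemoryAdapter([{"Amt": 120.0}, {"Amt": 80.0}]), raw_orders)
```

`load_raw` only knows about `read_rows`, so a new source type would need a new adapter class and nothing else.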


The Data Journey

A reporting cycle follows five steps:

1. Sync. Source files are read and loaded into the raw layer of the analytical database. Column names are preserved exactly as they appear in the source — no transformation at this stage. A configuration file per source year maps file paths, sheet names, and column names so that schema differences across years are resolved before data enters the pipeline.

2. Transform. SQL models run in sequence: staging models clean and rename columns to a consistent internal vocabulary; intermediate models join and classify records; mart models aggregate into analysis-ready tables covering revenue, program enrollment, marketing channel performance, and institutional segmentation.

3. Format. The Format layer reads the mart tables, applies period filtering, and assembles a structured report document. This document contains every number that will appear in any output — actuals, targets, gaps, period-over-period comparisons, and rankings. It is written to disk as JSON for auditability.

4. Present. The Present layer reads the report document and renders it as slides. Each slide type is a template that accepts mart data and produces formatted output. Slides are editable (PPTX) for human review and exportable for distribution (PDF). The Present layer applies no business logic.

5. Dashboard. The web dashboard reads mart tables directly for aggregated views and the operational database for individual record drill-down. It reflects the same data as the slides, accessible at any time without re-running the pipeline.
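
The split between Format and Present in steps 3 and 4 can be illustrated with a small report document and two renderers that only read it. The field names and both renderers are hypothetical. The point of the sketch is that every derived value, such as the gap to target, is computed before serialization, so the renderers do no arithmetic.

```python
import json

# Hypothetical report document; field names are illustrative only.
report = {
    "entity": "Entity A",
    "period": "2026-02",
    "sections": [
        {"title": "Revenue Comparison", "actual": 180.0, "target": 200.0},
    ],
}
# Precompute the gap in the Format step so renderers never calculate.
for section in report["sections"]:
    section["gap"] = section["actual"] - section["target"]

document = json.dumps(report, indent=2)  # what would be written to disk


def render_text(doc: str) -> str:
    """Renderer 1: plain-text lines, one per section."""
    data = json.loads(doc)
    return "\n".join(
        f"{s['title']}: {s['actual']} (gap {s['gap']})" for s in data["sections"]
    )


def render_html(doc: str) -> str:
    """Renderer 2: an HTML list built from the same document."""
    data = json.loads(doc)
    items = "".join(
        f"<li>{s['title']}: {s['actual']} (gap {s['gap']})</li>"
        for s in data["sections"]
    )
    return f"<ul>{items}</ul>"


text_output = render_text(document)
html_output = render_html(document)
```

Both outputs carry the same numbers because they come from the same serialized document.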


Business Entities

Atlas supports multiple business entities under a common group structure:

Group
├── Entity A
│   ├── Unit 1
│   ├── Unit 2
│   └── Unit 3
└── Entity B
    ├── Unit 1
    └── Unit 2

Each Entity operates independently. Units within an Entity own products, fulfill orders, and report revenue separately. A standard report covers all active units for a given entity, with per-unit output for revenue, program progress, and institutional segmentation.

Entity scoping is enforced at every layer — raw tables, transform models, mart queries, and dashboard views never mix data across entities, even when running under the same pipeline.
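
One possible way to enforce scoping at the query layer is to make the entity a mandatory argument of every query helper. The sketch below uses `sqlite3` as a stand-in database. The `mart_revenue` table, its columns, and `revenue_by_unit` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mart_revenue (entity TEXT, unit TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO mart_revenue VALUES (?, ?, ?)",
    [
        ("Entity A", "Unit 1", 100.0),
        ("Entity A", "Unit 2", 40.0),
        ("Entity B", "Unit 1", 999.0),
    ],
)


def revenue_by_unit(conn: sqlite3.Connection, entity: str) -> dict:
    """Return revenue per unit for exactly one entity.

    The entity argument is mandatory, so no caller can run an
    unscoped query through this function.
    """
    if not entity:
        raise ValueError("entity is required")
    rows = conn.execute(
        "SELECT unit, revenue FROM mart_revenue WHERE entity = ? ORDER BY unit",
        (entity,),
    )
    return dict(rows.fetchall())


entity_a = revenue_by_unit(conn, "Entity A")
```

Both entities have a "Unit 1" here, which is the case the filter protects against: without it, the two units would be silently merged.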


Report Sections

The standard report contains five sections, each answering a specific business question:

Section                  Question answered
-----------------------  ------------------------------------------------------------
Revenue Comparison       How does this period's revenue compare to last period, last year, all-time best, and target?
Key Comparison           Are we on track for new customers, renewals, and returning customers this period?
Program Progress         Which programs or product lines are meeting their enrollment targets?
Institutional Progress   Which institutions are producing customers, and how has that changed year over year?
Channel Marketing        Which acquisition channels are converting, and what is each channel's contribution rate?

Each section is produced per unit, except Channel Marketing, which is presented as a combined view across units. The same sections apply regardless of reporting cadence: a weekly report has the same structure as a monthly one, with a narrower time window.


Open Questions

  • Present layer component reuse. Should slide rendering components be shared with the dashboard? Options range from data-only reuse (same JSON schema, separate rendering code) to full visual component reuse via React-to-PPTX rendering. Decision deferred — see architecture.md for the full option analysis.
