02.01. Core Data Architecture

Purpose

This document defines the target architecture for the business-critical vehicle-data acquisition pipeline.

Operational backout reference:

This document exists because the current API has already proven the Tesla-first product flow, but the data-fetch/report core is still too tightly coupled, too Tesla-shaped, and too implicit in its failure handling.

This architecture is the source of truth for how the API must evolve from a working Tesla integration into a scalable multi-provider reporting platform.

Business Requirement

The business depends on one question being answered correctly for each vehicle:

For this VIN, what provider data do we have, what failed, why did it fail, and is the result sufficient to produce a trustworthy report?

If the architecture cannot answer that clearly, the system is not production grade.

Architectural Goals

  1. Make vehicle-data acquisition a first-class subsystem.
  2. Keep the API as a modular monolith for now.
  3. Make the core provider-agnostic, with Tesla as the first provider adapter.
  4. Model partial data and failure states explicitly.
  5. Keep token handling isolated from report rendering and PDF generation.
  6. Support providers with unequal capabilities without degrading the entire domain model into provider-specific hacks.
  7. Preserve a future path to async workers and service extraction without requiring microservices now.

Current Problems

The current implementation has these structural issues:

Current Implementation Status

As of March 13, 2026, the refactor has started and the codebase is no longer fully in the original mixed shape.

Already extracted from the old mixed service area:

Still remaining in tesla_fleet.py:

This means the architecture direction is active in code, but the top-level application boundary cleanup is not finished yet.

Current Active Flow

The active backend flow is now:

router
  -> TeslaFleetService (workflow coordinator)
    -> OrderSessionService
    -> TeslaProviderAuthService
    -> TeslaProviderAcquisitionService
    -> TeslaProviderClientService
    -> TeslaProviderNormalizationService

This is important:

Core Principle

The system must be capability-driven, not provider-field-driven.

The API should not assume that every manufacturer exposes the same data. Instead, each provider declares which business capabilities it supports and the quality level of that support.

Provider-Agnostic Capability Model

Canonical capability modules:

Each provider supports each capability at one of these levels:

Example:

{
  "provider": "tesla",
  "capabilities": {
    "inventory": "full",
    "identity": "full",
    "vehicle_state": "partial",
    "battery": "full",
    "charging": "partial",
    "service": "partial",
    "warranty": "partial",
    "software": "full",
    "factory_options": "partial",
    "technical_specs": "premium"
  }
}
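
A minimal sketch of how a provider could declare the same capabilities in code. The level names full, partial, and premium come from the example above; SupportLevel, ProviderCapabilities, and the UNSUPPORTED fallback are hypothetical names used for illustration, not the project's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum


class SupportLevel(str, Enum):
    # FULL, PARTIAL and PREMIUM appear in the example above.
    # UNSUPPORTED is an assumed fallback for undeclared capabilities.
    FULL = "full"
    PARTIAL = "partial"
    PREMIUM = "premium"
    UNSUPPORTED = "unsupported"


@dataclass(frozen=True)
class ProviderCapabilities:
    provider: str
    capabilities: dict[str, SupportLevel] = field(default_factory=dict)

    def level(self, module: str) -> SupportLevel:
        # A capability the provider never declared is treated as unsupported.
        return self.capabilities.get(module, SupportLevel.UNSUPPORTED)

    def supports(self, module: str) -> bool:
        return self.level(module) is not SupportLevel.UNSUPPORTED


# Subset of the Tesla declaration shown above.
TESLA = ProviderCapabilities(
    provider="tesla",
    capabilities={
        "battery": SupportLevel.FULL,
        "charging": SupportLevel.PARTIAL,
        "technical_specs": SupportLevel.PREMIUM,
    },
)
```

The core can then ask TESLA.supports("charging") instead of checking for Tesla-specific fields.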

Canonical Data Acquisition Model

1. Acquisition Request

from dataclasses import dataclass


@dataclass
class VehicleAcquisitionRequest:
    session_id: str
    provider: str
    vehicle_id: str
    vin: str
    package: str
    lang: str
    requested_modules: list[str]

2. Module Result

@dataclass
class ModuleResult:
    module: str
    status: str
    source_auth: str | None
    http_status: int | None
    error_code: str | None
    error_detail: str | None
    raw_payload: dict | None

Allowed status values:

3. Canonical Vehicle Snapshot

@dataclass
class VehicleSnapshot:
    provider: str
    vin: str
    inventory: dict
    identity: dict
    vehicle_state: dict
    battery: dict
    charging: dict
    service: dict
    warranty: dict
    software: dict
    factory_options: dict
    technical_specs: dict
    module_status: dict[str, ModuleResult]

4. Acquisition Outcome

@dataclass
class VehicleAcquisitionOutcome:
    vin: str
    provider: str
    core_status: str
    reportability: str
    snapshot: VehicleSnapshot | None
    module_results: dict[str, ModuleResult]

Allowed core_status:

Allowed reportability:

End-To-End Data Flow

The business-critical data flow must be:

Provider API payloads
        │
        ▼
Provider acquisition layer
        │
        ▼
ModuleResult set (raw provider outcomes)
        │
        ▼
Provider normalizer
        │
        ▼
Canonical VehicleSnapshot
        │
        ▼
Reportability decision
        │
        ▼
WUI response model / PDF view model / export view model

This means:

  1. The provider acquisition layer fetches raw provider payloads.
  2. Those payloads are wrapped in structured ModuleResult objects.
  3. The provider normalizer converts successful payloads into one canonical internal VehicleSnapshot.
  4. Reportability logic decides whether the vehicle is complete, partial, or failed.
  5. The WUI and PDF layers render only from the canonical snapshot and module statuses.

Renderers must never inspect provider-specific endpoint wrappers directly.

Raw Provider Payload vs Canonical Snapshot

Raw Tesla Payload

This is provider data in Tesla's own shape.

Example:

{
  "response": {
    "vin": "LRW3E7EK1RC988948",
    "display_name": "JT3",
    "state": "online",
    "vehicle_config": {
      "car_type": "model3",
      "trim_badging": "long_range"
    },
    "charge_state": {
      "battery_level": 82,
      "usable_battery_level": 80,
      "battery_range": 248.6,
      "charge_limit_soc": 90,
      "charging_state": "Complete"
    },
    "vehicle_state": {
      "car_version": "2024.8.7",
      "sentry_mode": false,
      "valet_mode": true
    },
    "drive_state": {
      "latitude": 60.1708,
      "longitude": 24.9375
    }
  }
}

Problems with raw provider payloads:

Canonical Snapshot

This is the internal model the rest of the system should use.

Example:

{
  "provider": "tesla",
  "vin": "LRW3E7EK1RC988948",
  "inventory": {
    "vehicle_id": "149293992919",
    "state": "online"
  },
  "identity": {
    "display_name": "JT3",
    "make": "Tesla",
    "model": "Model 3",
    "trim": "Long Range"
  },
  "vehicle_state": {
    "software_version": "2024.8.7",
    "sentry_mode": false,
    "valet_mode": true,
    "location": {
      "latitude": 60.1708,
      "longitude": 24.9375,
      "status": "available"
    }
  },
  "battery": {
    "state_of_charge_pct": 82,
    "usable_state_of_charge_pct": 80,
    "rated_range_km": 400.1,
    "charge_limit_pct": 90,
    "charging_state": "complete"
  },
  "charging": {},
  "service": {},
  "warranty": {},
  "software": {},
  "factory_options": {},
  "technical_specs": {},
  "module_status": {
    "identity": {
      "module": "identity",
      "status": "success"
    },
    "vehicle_state": {
      "module": "vehicle_state",
      "status": "success"
    },
    "battery": {
      "module": "battery",
      "status": "success"
    }
  }
}

Benefits of the canonical snapshot:

Design Rule: Normalize Before Rendering

The sequence must always be:

fetch -> classify -> normalize -> assess -> render

Never:

fetch -> render directly -> patch missing fields in UI/PDF

That second pattern is what creates fragile provider-specific behavior and spreads business logic into presentation code.
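
The required sequence can be sketched as five small functions wired together in one place. This is an illustration under assumptions: the function names, the stubbed payload, and the "failed" module status are hypothetical, and fetch makes no network call.

```python
from typing import Any


def fetch(vin: str) -> dict[str, Any]:
    # Stand-in for the provider acquisition layer; None marks a failed call.
    return {
        "battery": {"response": {"charge_state": {"battery_level": 82}}},
        "vehicle_state": None,
    }


def classify(raw: dict[str, Any]) -> dict[str, str]:
    # Every requested module gets an explicit status, even when it failed.
    return {m: ("success" if p is not None else "failed") for m, p in raw.items()}


def normalize(raw: dict[str, Any], statuses: dict[str, str]) -> dict[str, Any]:
    # Only successful payloads are converted into canonical fields.
    snapshot: dict[str, Any] = {"battery": {}, "vehicle_state": {}}
    if statuses["battery"] == "success":
        level = raw["battery"]["response"]["charge_state"]["battery_level"]
        snapshot["battery"]["state_of_charge_pct"] = level
    return snapshot


def assess(statuses: dict[str, str]) -> str:
    values = set(statuses.values())
    if values == {"success"}:
        return "complete"
    return "partial" if "success" in values else "failed"


def render(snapshot: dict[str, Any], statuses: dict[str, str], core_status: str) -> dict[str, Any]:
    # The renderer receives canonical data only, never the raw payload.
    return {"core_status": core_status, "report": snapshot, "modules": statuses}


def build_report(vin: str) -> dict[str, Any]:
    raw = fetch(vin)
    statuses = classify(raw)
    snapshot = normalize(raw, statuses)
    return render(snapshot, statuses, assess(statuses))
```

Because render never receives raw, there is no place in presentation code where a missing provider field could be patched.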

Tesla Example Mapping

Example mapping from Tesla raw payload to canonical snapshot:

Tesla Raw Field                               Canonical Snapshot Field
--------------------------------------------  ----------------------------------
response.vin                                  vin
response.display_name                         identity.display_name
response.vehicle_config.car_type              identity.model
response.vehicle_config.trim_badging          identity.trim
response.charge_state.battery_level           battery.state_of_charge_pct
response.charge_state.usable_battery_level    battery.usable_state_of_charge_pct
response.charge_state.battery_range           battery.rated_range_km
response.charge_state.charge_limit_soc        battery.charge_limit_pct
response.charge_state.charging_state          battery.charging_state
response.vehicle_state.car_version            vehicle_state.software_version
response.vehicle_state.sentry_mode            vehicle_state.sentry_mode
response.vehicle_state.valet_mode             vehicle_state.valet_mode
response.drive_state.latitude                 vehicle_state.location.latitude
response.drive_state.longitude                vehicle_state.location.longitude
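
A minimal sketch of part of this mapping as a normalizer function, covering identity and battery only. It assumes battery_range is reported in miles, which is consistent with the example values (248.6 in the raw payload, 400.1 in the snapshot). normalize_tesla and the two lookup tables are hypothetical names and list only the entries needed for the example.

```python
from typing import Any

MILES_TO_KM = 1.609344

# Assumed display-name lookups; a real table would cover every model and trim.
MODEL_NAMES = {"model3": "Model 3"}
TRIM_NAMES = {"long_range": "Long Range"}


def normalize_tesla(raw: dict[str, Any]) -> dict[str, Any]:
    """Map a raw Tesla vehicle_data payload to canonical identity/battery fields."""
    r = raw.get("response") or {}
    config = r.get("vehicle_config") or {}
    charge = r.get("charge_state") or {}
    range_mi = charge.get("battery_range")
    charging_state = charge.get("charging_state")
    return {
        "provider": "tesla",
        "vin": r.get("vin"),
        "identity": {
            "display_name": r.get("display_name"),
            "make": "Tesla",
            "model": MODEL_NAMES.get(config.get("car_type")),
            "trim": TRIM_NAMES.get(config.get("trim_badging")),
        },
        "battery": {
            "state_of_charge_pct": charge.get("battery_level"),
            "usable_state_of_charge_pct": charge.get("usable_battery_level"),
            # Convert miles to kilometres; keep None when the field is absent.
            "rated_range_km": None if range_mi is None else round(range_mi * MILES_TO_KM, 1),
            "charge_limit_pct": charge.get("charge_limit_soc"),
            "charging_state": charging_state.lower() if charging_state else None,
        },
    }
```

Absent fields become None instead of raising, so the caller can record the gap in module_status.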

Missing Data Example

If Tesla returns vehicle_data but omits drive_state, the canonical snapshot must still be valid, but explicit:

{
  "vehicle_state": {
    "software_version": "2024.8.7",
    "location": {
      "latitude": null,
      "longitude": null,
      "status": "missing_in_payload"
    }
  },
  "module_status": {
    "vehicle_state": {
      "module": "vehicle_state",
      "status": "partial"
    }
  }
}

That is the difference between a professional acquisition model and an implicit "empty field" workaround.
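
A sketch of how a normalizer could produce that explicit location block. The status strings are taken from the examples in this document; normalize_location is a hypothetical helper name.

```python
from typing import Any


def normalize_location(response: dict[str, Any]) -> tuple[dict[str, Any], str]:
    """Return the canonical location block and the resulting module status."""
    drive = response.get("drive_state")
    lat = drive.get("latitude") if isinstance(drive, dict) else None
    lon = drive.get("longitude") if isinstance(drive, dict) else None
    if lat is None or lon is None:
        # The gap is stated explicitly instead of leaving the fields empty.
        location = {"latitude": None, "longitude": None, "status": "missing_in_payload"}
        return location, "partial"
    return {"latitude": lat, "longitude": lon, "status": "available"}, "success"
```

The second return value feeds module_status for vehicle_state, so the renderer can tell a missing location from a failed fetch.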

Hard vs Soft Dependencies

Hard Dependencies

Hard dependencies determine whether a vehicle is reportable at all.

Current Tesla rule:

If a hard dependency fails:

Soft Dependencies

Soft dependencies enrich the report but do not define basic reportability.

Current Tesla examples:

If a soft dependency fails:

Premium / Separate-Auth Dependencies

These require special handling, pricing, or auth.

Current Tesla example:

If a premium module fails because partner auth fails:

drive_state Rule

drive_state is not a separate business capability.

It is a submodule inside vehicle_state.

Current Tesla rule:

That avoids treating a partially-populated Tesla payload as either a fake success or a total vehicle failure.
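
The hard, soft, and premium dependency classes described above can be sketched as a small policy object plus one decision function. This is an illustration only: DependencyPolicy, assess_core_status, and the module placement in EXAMPLE_POLICY are hypothetical and do not state the actual Tesla rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DependencyPolicy:
    hard: frozenset[str]
    soft: frozenset[str]
    premium: frozenset[str]


def assess_core_status(policy: DependencyPolicy, statuses: dict[str, str]) -> str:
    """Return "complete", "partial", or "failed" for one vehicle."""
    hard_states = [statuses.get(m) for m in policy.hard]
    # A hard module that is missing or failed makes the vehicle unreportable.
    # A "partial" hard module (for example, no drive_state) is tolerated.
    if any(s not in ("success", "partial") for s in hard_states):
        return "failed"
    optional_states = [statuses.get(m) for m in policy.soft | policy.premium]
    all_states = hard_states + optional_states
    return "complete" if all(s == "success" for s in all_states) else "partial"


# Hypothetical placement of modules, for illustration only.
EXAMPLE_POLICY = DependencyPolicy(
    hard=frozenset({"identity", "vehicle_state"}),
    soft=frozenset({"charging"}),
    premium=frozenset({"technical_specs"}),
)
```

A failed premium module downgrades the vehicle to partial here; it never turns a reportable vehicle into a failed one.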

Security Boundaries

Security must be built into the architecture.

Trust Zones

  1. provider_auth
     - owns user/partner token exchange and refresh
     - no rendering/PDF code may access token material

  2. provider_acquisition
     - fetches raw provider payloads
     - may access tokens only via auth service
     - logs only masked identifiers

  3. normalization
     - converts raw provider payloads to canonical snapshot
     - strips/avoids non-essential sensitive fields

  4. presentation
     - shapes API responses and PDFs
     - consumes canonical snapshot only
     - no direct token or raw auth context access
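
A minimal sketch of the token boundary between these zones. The class names, the injected http_get callable, and the URL path are assumptions made for illustration; they are not the existing service classes or a real provider endpoint.

```python
from typing import Any, Callable


class ProviderAuthService:
    """provider_auth zone: the only place that stores token material."""

    def __init__(self) -> None:
        self._tokens: dict[str, str] = {}

    def store(self, session_id: str, access_token: str) -> None:
        self._tokens[session_id] = access_token

    def access_token_for(self, session_id: str) -> str:
        try:
            return self._tokens[session_id]
        except KeyError:
            # Fail closed when the session has no token.
            raise PermissionError("no token for session") from None


class ProviderAcquisitionService:
    """provider_acquisition zone: obtains tokens through the auth service only."""

    def __init__(self, auth: ProviderAuthService,
                 http_get: Callable[[str, str], dict[str, Any]]) -> None:
        self._auth = auth
        self._http_get = http_get  # injected (url, token) -> payload callable

    def fetch_vehicle_data(self, session_id: str, vehicle_id: str) -> dict[str, Any]:
        token = self._auth.access_token_for(session_id)
        # Hypothetical path, used only to show where the token is consumed.
        return self._http_get(f"/vehicles/{vehicle_id}/vehicle_data", token)


def render_summary(snapshot: dict[str, Any]) -> str:
    # presentation zone: takes the canonical snapshot; no token parameter exists.
    return f"{snapshot['identity']['model']} at {snapshot['battery']['state_of_charge_pct']}%"
```

Since render_summary has no way to receive a token, the rule is enforced by the function signature and not by convention.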

Sensitive Asset Rules

Session Isolation Rules

Cross-user data flow is a business-critical failure and must be treated as a severity-one security defect.

Required rules:

Fraud and Cross-User Risk Model

The architecture must explicitly defend against:

  1. Session guessing or leakage
     - if a user can guess or obtain another user's session_id, they may access reports, vehicles, or artifacts unless all retrieval flows treat the session id as a protected bearer secret and enforce TTL/rotation rules

  2. Cross-session cache contamination
     - if report data is cached by VIN, payment id, or task id without session scoping, one user's Tesla data can appear in another user's report flow

  3. Async artifact mix-ups
     - if PDF jobs or export jobs are not bound to the originating session, a completed artifact can be downloaded by the wrong user

  4. Payment/report mismatches
     - if payment verification and report generation are joined only by loose ids, a paid session could accidentally unlock the wrong acquisition outcome

  5. Debug/log leakage
     - if raw payloads, full VINs, or signed URLs are logged or surfaced in support tooling, business-sensitive vehicle data may leak outside the customer boundary

Session Lifetime Policy

Session length must be defined as part of the API contract, not left implicit.

Recommended target policy:

Session lifecycle rules:

Persistence and Retention Rules

Logging and Audit Rules

Target Package Structure

Inside one deployable API application:

app/
├── domain/
│   ├── acquisition.py
│   ├── capabilities.py
│   ├── reportability.py
│   └── snapshot.py
├── application/
│   ├── sessions/
│   ├── acquisition/
│   └── reporting/
├── providers/
│   ├── base/
│   │   ├── auth.py
│   │   ├── capabilities.py
│   │   ├── inventory.py
│   │   ├── acquisition.py
│   │   └── normalizer.py
│   └── tesla/
│       ├── auth.py
│       ├── inventory.py
│       ├── modules.py
│       ├── acquisition.py
│       └── normalizer.py
├── presentation/
│   ├── api/
│   └── pdf/
└── core/
    ├── security/
    ├── observability/
    └── resilience/

Responsibility Boundaries

Session Layer

Responsible for:

Not responsible for:

Provider Auth Layer

Responsible for:

Not responsible for:

Acquisition Layer

Responsible for:

Not responsible for:

Normalization Layer

Responsible for:

Not responsible for:

Reportability Layer

Responsible for:

Presentation Layer

Responsible for:

Must consume:

Must not consume:

API Response Contract Direction

The WUI should move from “a map of report blobs” to a per-vehicle acquisition result contract.

Target shape:

{
  "status": "complete",
  "vehicles": [
    {
      "vin": "LRW3E7EK1RC98****",
      "provider": "tesla",
      "core_status": "partial",
      "reportability": "billable_partial",
      "report": {},
      "modules": {
        "vehicle_state": {"status": "success"},
        "charging": {"status": "success"},
        "technical_specs": {"status": "auth_failed"}
      }
    }
  ]
}

This lets the WUI render:

without guessing from malformed report payloads.
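
With explicit module statuses, a consumer can decide what to show by reading the contract alone. A small sketch against the target shape above; modules_needing_attention is a hypothetical helper name.

```python
from typing import Any


def modules_needing_attention(response: dict[str, Any]) -> dict[str, list[str]]:
    """Map each (masked) VIN to its modules whose status is not "success"."""
    result: dict[str, list[str]] = {}
    for vehicle in response.get("vehicles", []):
        modules = vehicle.get("modules", {})
        bad = sorted(name for name, info in modules.items()
                     if info.get("status") != "success")
        if bad:
            result[vehicle["vin"]] = bad
    return result
```

For the target shape shown above, this returns a single entry listing technical_specs for the masked VIN.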

Performance Rules

  1. Shared datasets are fetched once per request when the provider contract allows it.
  2. Per-vehicle modules are fetched concurrently with bounded parallelism.
  3. Partner tokens are cached separately from user tokens.
  4. Normalization happens once and feeds:
     - WUI response
     - PDF rendering
     - invoice/export generation
  5. Renderers must never re-fetch provider data.
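
Bounded parallelism for per-vehicle modules can be sketched with asyncio.Semaphore. The function name, the fetch_one callable, and the default limit of 4 are assumptions for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def fetch_modules(
    modules: list[str],
    fetch_one: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 4,
) -> dict[str, Any]:
    """Fetch all modules concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(module: str) -> Any:
        async with semaphore:
            try:
                return await fetch_one(module)
            except Exception as exc:
                # One failing module must not cancel the others; the caller
                # turns the exception into an explicit module status.
                return exc

    results = await asyncio.gather(*(guarded(m) for m in modules))
    return dict(zip(modules, results))
```

Returning the exception per module keeps partial results available, which matches the requirement that failures are modelled explicitly.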

Observability Rules

For each vehicle acquisition, structured logs must include:

Raw payloads are excluded by default.
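
A sketch of a structured log record that masks the VIN and has no field for raw payloads. The record fields shown are illustrative assumptions, not the required field list; the masking matches the masked VIN format used in the API response example (first 13 characters visible).

```python
import json
import logging

logger = logging.getLogger("acquisition")


def mask_vin(vin: str, visible: int = 13) -> str:
    """Keep the first `visible` characters and mask the rest."""
    if len(vin) <= visible:
        return "*" * len(vin)
    return vin[:visible] + "*" * (len(vin) - visible)


def acquisition_log_record(vin: str, provider: str, module_statuses: dict[str, str]) -> str:
    # The record has no slot for raw payloads, so they cannot be logged by accident.
    record = {"vin": mask_vin(vin), "provider": provider, "modules": module_statuses}
    return json.dumps(record, sort_keys=True)


def log_acquisition(vin: str, provider: str, module_statuses: dict[str, str]) -> None:
    logger.info(acquisition_log_record(vin, provider, module_statuses))
```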

Definition of Working

The reporting core is considered working only when all of the following are true for one full end-to-end session:

  1. A user can start OAuth and return with a valid session.
  2. The session is bound to the correct vehicles and package only, and may trigger downstream fetches, photos, exports, PDFs, and artifacts only for that same vehicle/package scope.
  3. Payment verification cannot unlock another user's session.
  4. Report generation fetches provider data only for the selected vehicles in the owning session.
  5. Each vehicle ends in an explicit complete, partial, or failed state.
  6. The WUI can render the result without guessing whether a report object is actually an error.
  7. PDF generation uses only normalized report data and returns artifacts only to the owning session.
  8. Session expiry, reconnect, and artifact expiry all fail closed.

If any of these are false, the core is not yet production-grade.

Execution Path to Green

The shortest professional path from the current implementation to a stable system is:

Step 1 - Lock Session Safety First

Step 2 - Stabilize Provider Auth

Step 3 - Stabilize Acquisition

Step 4 - Normalize Before Rendering

Step 5 - Make Outcome Semantics Explicit

Step 6 - Prove It in Dev

Acceptance Criteria

The architecture refactor is not complete until these criteria pass:

Session and Security

Provider Data

Rendering

Operations

Migration Plan

Phase 1 - Stabilize Current Tesla Flow

Phase 2 - Extract Tesla Acquisition Internals

Phase 3 - Change API Response Shape

Phase 4 - Provider Abstraction

Phase 5 - Future Service Extraction (Only If Needed)

Potential future extraction candidates:

These are optional and should happen only after internal boundaries are stable.

Decision

The project should proceed as a secure modular monolith with a provider-agnostic core and Tesla as the first provider implementation.

Do not jump to microservices yet. Do not continue adding Tesla/report fixes inside one oversized service.

The next architectural work must be organized around: