03.10. Infrastructure Style Guide

| Step | Function | File path | Why this order / dependencies | Privileges | Notes |
|------|----------|-----------|-------------------------------|------------|-------|
| 1 | (manual or one-time) | | Someone with rights to create projects and link billing must exist | your personal account or BUSINESS_OWNER | Usually done once per org; skip if already done |
| 2 | do_gcp_create_project | gcp-create-project.func.sh | Creates project + links billing; all later steps require the project to exist | roles/resourcemanager.projectCreator + billing user | First script to run for a new environment |
| 3 | do_gcp_create_project_service_account | gcp-create-project-service-account.func.sh | Creates project-named SA + downloads key (temporarily disables the key creation constraint) | personal account (temporarily needs org policy rights) | Critical: most automation uses this SA |
| 4 | do_gcp_configure_proj_sa_permissions | gcp-configure-proj-sa-permissions.func.sh | Grants owner + many powerful roles to the project SA | personal account or powerful SA | Probably what you want to run right now if project + SA already exist |
| 5 | do_gcp_project_apis_enable | gcp-project-apis-enable.func.sh | Enables APIs required by the rest of the stack (Storage, Secret Manager, Cloud Functions…) | usually the new project SA (after step 4) | Run after permissions are granted |

CREATE THE PROJECTS

ORG=bnc APP=cpt ENV=dev ORG_ID=1080340024101 GCP_ACCOUNT=yordan.georgiev@csitea.net GCP_BILLING_ACCOUNT_ID=016958-C0B218-5A0B90 ./run -a do_gcp_000_create_project


ORG=bnc APP=cpt ENV=tst ORG_ID=1080340024101 GCP_ACCOUNT=yordan.georgiev@csitea.net GCP_BILLING_ACCOUNT_ID=016958-C0B218-5A0B90 ./run -a do_gcp_000_create_project

ORG=bnc APP=cpt ENV=prd ORG_ID=1080340024101 GCP_ACCOUNT=yordan.georgiev@csitea.net GCP_BILLING_ACCOUNT_ID=016958-C0B218-5A0B90 ./run -a do_gcp_000_create_project

ORG=bnc APP=cpt ENV=all ORG_ID=1080340024101 GCP_ACCOUNT=yordan.georgiev@csitea.net GCP_BILLING_ACCOUNT_ID=016958-C0B218-5A0B90 ./run -a do_gcp_000_create_project

ORG=bnc APP=cpt ENV=inf ORG_ID=1080340024101 GCP_ACCOUNT=yordan.georgiev@csitea.net GCP_BILLING_ACCOUNT_ID=016958-C0B218-5A0B90 ./run -a do_gcp_000_create_project

CLOUD RUN DOMAIN MAPPING

Cloud Run requires domain verification before custom domains can be mapped to services. This is a one-time manual setup per domain.

Why Domain Verification is Needed

Domain Verification Process

Step 1: Verify Parent Domain via Google Search Console

Verify the parent domain carpulsetracker.com to cover all subdomains (api., dev.api., tst.api., inf.api.):

  1. Run: gcloud domains verify carpulsetracker.com --project=bnc-cpt-prd
  2. This opens Google Search Console
  3. Select "Domain" property type
  4. Enter: carpulsetracker.com
  5. Copy the provided TXT record value (e.g., google-site-verification=xxxxx)

Step 2: Add DNS TXT Record

Add the verification TXT record to Cloud DNS. If there's an existing TXT record (e.g., SPF), update it to include both:

# Authenticate
gcloud auth activate-service-account --key-file=$HOME/.gcp/.bnc/key-bnc-cpt-prd.json

# Check existing TXT records
gcloud dns record-sets list --zone=subzone-bnc-cpt-prd --project=bnc-cpt-prd --filter="type=TXT"

# Update TXT record (preserving SPF, adding verification)
gcloud dns record-sets update carpulsetracker.com. \
  --type=TXT \
  --ttl=300 \
  --rrdatas='"v=spf1 include:_spf.google.com ~all"','"google-site-verification=YOUR_TOKEN_HERE"' \
  --zone=subzone-bnc-cpt-prd \
  --project=bnc-cpt-prd

Step 3: Complete Verification in Search Console

Click "Verify" in Google Search Console after DNS propagation (may take up to 24 hours, usually minutes).

Step 4: Create Domain Mappings via GCP Console

The domain verification is tied to your personal Google account, not the service account. Create mappings via GCP Console:

  1. Go to: https://console.cloud.google.com/run?project=bnc-cpt-prd
  2. Click on the service (e.g., bnc-cpt-api-prd)
  3. Go to "Manage Custom Domains" tab
  4. Click "Add Mapping"
  5. Enter the domain (e.g., api.carpulsetracker.com)

Repeat for all environments:

| Environment | GCP Project | Service Name | Custom Domain |
|-------------|-------------|--------------|---------------|
| prd | bnc-cpt-prd | bnc-cpt-api-prd | api.carpulsetracker.com |
| prd (alias) | bnc-cpt-prd | bnc-cpt-api-prd | prd.api.carpulsetracker.com |
| dev | bnc-cpt-dev | bnc-cpt-api-dev | dev.api.carpulsetracker.com |
| tst | bnc-cpt-tst | bnc-cpt-api-tst | tst.api.carpulsetracker.com |
| inf | bnc-cpt-inf | bnc-cpt-api-inf | inf.api.carpulsetracker.com |

The prd.api.carpulsetracker.com alias allows a uniform URL pattern, {env}.api.carpulsetracker.com, across all environments.

DNS Configuration (Already in Terraform 007-dns)

The CNAME records pointing to ghs.googlehosted.com are managed by Terraform step 007-dns:

api.carpulsetracker.com.      CNAME  ghs.googlehosted.com.
prd.api.carpulsetracker.com.  CNAME  ghs.googlehosted.com.  # alias for uniform URL pattern
dev.api.carpulsetracker.com.  CNAME  ghs.googlehosted.com.
tst.api.carpulsetracker.com.  CNAME  ghs.googlehosted.com.
inf.api.carpulsetracker.com.  CNAME  ghs.googlehosted.com.

This enables the uniform URL pattern: https://{env}.api.carpulsetracker.com for all environments.

CD Pipeline Behavior

The CD pipeline (.github/workflows/cd.yaml) handles domain mapping automatically:

Troubleshooting

"Domain does not appear to be verified"
  - Verify the parent domain carpulsetracker.com in Search Console
  - Create the mapping via GCP Console using the account that verified the domain

CNAME conflict with TXT record
  - DNS doesn't allow CNAME + TXT on the same hostname
  - Verify the parent domain instead of the subdomain

DNS propagation delay
  - TXT record changes may take up to 24 hours to propagate
  - Custom domain health checks may fail until DNS propagates

DNS SYNCHRONIZATION FOR API SUBDOMAINS

All API subdomain DNS records are centralized in the prd Terraform state (step 007-dns). This means a single make do-provision ENV=prd STEP=007-dns creates/updates CNAME records for all environments (dev, tst, inf, prd).

Architecture: Why Centralized in prd?

The TLD zone carpulsetracker.com is hosted in the bnc-cpt-prd GCP project (subzone-bnc-cpt-prd Cloud DNS zone). All *.api.carpulsetracker.com records live in that zone, so they must be managed from the prd state — regardless of which environment they point traffic to.

Terraform source: bnc-cpt-inf/src/terraform/007-dns/05.03.api-dns-records.tf

All 5 records use count = var.env == "prd" ? 1 : 0 to ensure they are only created when provisioning the prd environment:

| Record | Target | Purpose |
|--------|--------|---------|
| api.carpulsetracker.com | ghs.googlehosted.com. | Production API (primary) |
| prd.api.carpulsetracker.com | ghs.googlehosted.com. | Production API (uniform pattern alias) |
| dev.api.carpulsetracker.com | ghs.googlehosted.com. | Development API |
| tst.api.carpulsetracker.com | ghs.googlehosted.com. | Testing API |
| inf.api.carpulsetracker.com | ghs.googlehosted.com. | Infrastructure API |

Prerequisites

  1. Service account key: ~/.gcp/.bnc/key-bnc-cpt-prd.json
  2. Docker container: con-bnc-cpt-tf-runner running
  3. Config repo: bnc-cpt-cnf up to date
  4. Terraform state bucket: gs://bnc-cpt-prd-tf-state accessible

Workflow (3 Steps)

Run all commands from bnc-cpt-utl/:

Step 1: Generate Config

make do-generate-config-for-step ENV=prd STEP=007-dns

This generates bnc-cpt-cnf/bnc-cpt/prd/tf/007-dns.vars.tfvars from the YAML config source of truth.

Step 2: Plan (Dry Run)

make do-tf-plan ENV=prd STEP=007-dns

Review the plan output. Expected resources for a fresh run:
  - google_dns_record_set.api_prd[0]
  - google_dns_record_set.api_prd_alias[0]
  - google_dns_record_set.api_dev[0]
  - google_dns_record_set.api_tst[0]
  - google_dns_record_set.api_inf[0]

Step 3: Provision (Apply)

make do-provision ENV=prd STEP=007-dns

Verification

After provisioning, verify all records resolve correctly:

for sub in api prd.api dev.api tst.api inf.api; do
  echo "=== ${sub}.carpulsetracker.com ===" && dig +short "${sub}.carpulsetracker.com" CNAME
done

Expected output: each record returns ghs.googlehosted.com.

Safety Notes


GCP SECRETS SYNC FROM GOOGLE SHEET

This mechanism syncs secret values from a Google Sheet to GCP Secret Manager. It reads secrets from the sheet and updates them in GCP Secret Manager for each environment.

Prerequisites

  1. Service Account Keys at ~/.gcp/.bnc/:
    • key-bnc-cpt-all.json - For reading Google Sheet (cross-project access)
    • key-bnc-cpt-{env}.json - For writing to GCP Secret Manager (env = inf, dev, tst, prd)
  2. Google Sheet shared with the service account email bnc-cpt-all@bnc-cpt-all.iam.gserviceaccount.com
  3. Secret containers must exist in GCP (created via Terraform step 029-create-gcp-secrets)
  4. Docker container con-bnc-cpt-tf-runner must be running

Google Sheet Structure

Worksheet Logic

  1. Reads the all worksheet first (base values)
  2. Reads the environment-specific worksheet (e.g., dev) if it exists
  3. Environment values override base values from all
  4. Empty VAR_VALUE cells are automatically set to "n/a" (Cloud Run requires all mounted secrets to have values)
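
The worksheet logic above can be sketched as a small merge function. This is a minimal illustration, not the actual sync script: the name merge_worksheets and the dictionary-based input are assumptions, and the sketch assumes that an empty override cell also ends up as "n/a".

```python
from typing import Dict, Optional

PLACEHOLDER = "n/a"  # value used for empty VAR_VALUE cells, per the rules above


def merge_worksheets(
    base: Dict[str, str], env: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Merge 'all' (base) values with optional environment overrides."""
    merged = dict(base)
    if env:
        merged.update(env)  # environment values win over base values
    # Replace empty or whitespace-only values with the placeholder.
    return {
        name: (value if value and value.strip() else PLACEHOLDER)
        for name, value in merged.items()
    }


# Hypothetical rows, keyed by VAR_NAME.
base_rows = {"STRIPE_SECRET_KEY": "sk_base", "APPLE_PAY_DOMAIN_NAME": ""}
dev_rows = {"STRIPE_SECRET_KEY": "sk_dev"}
print(merge_worksheets(base_rows, dev_rows))
```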

Naming Convention

VAR_NAME in sheet is converted to GCP secret ID:

VAR_NAME (Sheet)          →  GCP Secret ID
STRIPE_SECRET_KEY         →  bnc-cpt-stripe-secret-key
JWT_ACCESS_TOKEN_EXPIRE   →  bnc-cpt-jwt-access-token-expire
PAYPAL_CLIENT_SECRET      →  bnc-cpt-paypal-client-secret

Formula: {org}-{app}-{var_name.lower().replace('_', '-')}
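
As a quick check of the formula, the conversion can be written as a one-line helper. The function name to_secret_id is hypothetical; only the formula itself comes from this page.

```python
def to_secret_id(org: str, app: str, var_name: str) -> str:
    """Convert a sheet VAR_NAME into a GCP secret ID using the formula above."""
    return f"{org}-{app}-{var_name.lower().replace('_', '-')}"


for name in ("STRIPE_SECRET_KEY", "JWT_ACCESS_TOKEN_EXPIRE", "PAYPAL_CLIENT_SECRET"):
    print(name, "->", to_secret_id("bnc", "cpt", name))
```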

Usage (via Make)

Run from bnc-cpt-utl/:

cd /opt/bnc/bnc-cpt/bnc-cpt-utl

# Sync secrets for a single environment
make do-gcp-sync-secrets ENV=dev

# Dry run (show what would be updated, no changes)
make do-gcp-sync-secrets ENV=dev DRY_RUN=1

# Sync all environments
for env in inf dev tst prd; do
  make do-gcp-sync-secrets ENV=$env
done

# Dry run all environments
for env in inf dev tst prd; do
  make do-gcp-sync-secrets ENV=$env DRY_RUN=1
done

Usage (Display Only - Read Sheet)

To only display secrets from the sheet without syncing:

make do-gcp-update-secrets ENV=dev

Execution Flow

  1. Make target runs docker exec into con-bnc-cpt-tf-runner
  2. Container executes ./run -a do_gcp_sync_secrets
  3. Shell function activates Python via Poetry
  4. Python script:
    • Authenticates to Google Sheets via service account
    • Reads all worksheet + environment worksheet
    • Merges values (env overrides all)
    • Authenticates to GCP via service account
    • Updates each secret in GCP Secret Manager

Output Example

===============================================
Syncing secrets for environment: dev
===============================================
2026-02-04 10:30:00 UTC ::: INFO: Connecting to Google Sheet: 1bYK6...
2026-02-04 10:30:01 UTC ::: INFO: Available worksheets: all, dev, tst, prd
2026-02-04 10:30:01 UTC ::: INFO: Reading 'all' worksheet (base values)...
2026-02-04 10:30:02 UTC ::: INFO:   Found 30 secrets in 'all' worksheet
2026-02-04 10:30:02 UTC ::: INFO: Reading 'dev' worksheet (environment overrides)...
2026-02-04 10:30:03 UTC ::: INFO:   Found 5 secrets (3 overrides, 2 new)
2026-02-04 10:30:03 UTC ::: INFO: Total secrets to sync: 32
2026-02-04 10:30:03 UTC ::: INFO: Syncing secrets to project: bnc-cpt-dev
2026-02-04 10:30:04 UTC ::: OK:   Updated: bnc-cpt-stripe-secret-key
2026-02-04 10:30:05 UTC ::: WARN:   Using 'n/a' for bnc-cpt-apple-pay-domain-name (VAR_NAME=APPLE_PAY_DOMAIN_NAME has no VAR_VALUE)
2026-02-04 10:30:06 UTC ::: OK:   Updated: bnc-cpt-apple-pay-domain-name
...
===============================================
2026-02-04 10:30:30 UTC ::: INFO: Summary:
2026-02-04 10:30:30 UTC ::: INFO:   Success: 22
2026-02-04 10:30:30 UTC ::: INFO:   Skipped: 10
2026-02-04 10:30:30 UTC ::: INFO:   Errors:  0
===============================================

Troubleshooting

"Secret does not exist in project"
  - Run Terraform step 029-create-gcp-secrets first:

    make do-generate-config-for-step ENV=dev STEP=029-create-gcp-secrets
    make do-provision ENV=dev STEP=029-create-gcp-secrets

"Spreadsheet not found"
  - Verify the sheet is shared with bnc-cpt-all@bnc-cpt-all.iam.gserviceaccount.com
  - Check sheet URL in all.env.yaml

"Failed to activate GCP service account"
  - Verify key file exists at ~/.gcp/.bnc/key-bnc-cpt-{env}.json
  - Check file permissions

"Missing required package: gspread"
  - Run Poetry install inside the container:

    docker exec -it con-bnc-cpt-tf-runner bash
    cd /opt/bnc/bnc-cpt/bnc-cpt-inf/src/python/gsheet-secrets-to-gcp
    poetry install

| File | Purpose |
|------|---------|
| bnc-cpt-cnf/bnc-cpt/all.env.yaml | Sheet URL configuration |
| bnc-cpt-inf/src/python/gsheet-secrets-to-gcp/ | Python module |
| bnc-cpt-utl/src/bash/run/gcp-sync-secrets.func.sh | Shell action |
| bnc-cpt-utl/src/make/tf-tasks.func.mk | Make targets |
| bnc-cpt-inf/src/terraform/029-create-gcp-secrets/ | Terraform for secret containers |

SYSTEM ARCHITECTURE & DATA FLOW

The following diagram describes the control and data flow for the Car Pulse Tracker infrastructure across environments.

                                     [ USER BROWSER ]
                                            |
                                            | (1) HTTPS Request
                                            v
                                    [ GOOGLE FRONTEND ]
                                            |
                                            | (2) SSL / Routing
                                            v
                    +-------------------------------------------------------+
                    |                  CLOUD DNS (PRD PROJECT)              |
                    |  - *.api.carpulsetracker.com -> ghs.googlehosted.com  |
                    |  - carpulsetracker.com -> Cloud Run (WUI)             |
                    +-------------------------------------------------------+
                                            |
                                            | (3) Environment Routing
                                            v
  +===========================================================================================+
  |                              GCP PROJECT (ENV)                                            |
  |                                                                                           |
  |   +-----------------------+           +---------------------------+                       |
  |   |   CLOUD RUN (API)     |           |   MEMORYSTORE (REDIS)     |                       |
  |   |                       |           |                           |                       |
  |   |  - Auth / Logic       | <-------> |  - Session Cache          |                       |
  |   |  - Tesla Proxy        |    (4)    |  - Rate Limiting          |                       |
  |   |  - Payment Flows      |           |  - Task Status Store      |                       |
  |   |  - Worker Endpoint    |           |  - (Step 027 -> 030)      |                       |
  |   +-----------+-----------+           +---------------------------+                       |
  |         |     |     ^                                                                     |
  |    (5a) |     |     | (7c) HTTP callback                                                  |
  |  Secrets|     |     |      POST /api/v1/worker/generate-pdf                               |
  |         v     |     |                                                                     |
  |   +----------+|  +--+------------------------+          +-----------------------------+   |
  |   | SECRET   ||  |   CLOUD TASKS (Step 031)  |          |  GCS REPORTS BUCKET         |   |
  |   | MANAGER  ||  |                           |          |  (Step 016)                 |   |
  |   |          ||  |  - Queue: pdf-generation  |          |                             |   |
  |   | - API    ||  |  - Rate: 10/s, 5 conc.    |          |  - bnc-cpt-{env}-reports    |   |
  |   |   Keys   ||  |  - Retry: 3x, 10-120s     |          |  - Lifecycle: 7-day delete  |   |
  |   | - Creds  ||  |  - Max duration: 5 min    |          |  - Uniform bucket-level IAM |   |
  |   +----------+|  +---------------------------+          +-----------------------------+   |
  |               |     ^                    |                    ^              |             |
  |               |     | (7a) Enqueue       | (7b) Dispatch      | (8) Upload   | (9) Signed |
  |               |     |      task          |      task          |    PDF       |    URL     |
  |               |     |                    v                    |              v             |
  |               |   +-+--------------------+--------------------+--------------+--+         |
  |               |   |              PDF GENERATION FLOW (within Cloud Run)         |         |
  |               |   |                                                            |         |
  |               |   |  1. API receives report request                            |         |
  |               |   |  2. Enqueues task to Cloud Tasks (7a)                      |         |
  |               |   |  3. Returns task_id immediately to client                  |         |
  |               |   |  4. Cloud Tasks dispatches HTTP callback (7b → 7c)         |         |
  |               |   |  5. Worker renders PDF via WeasyPrint + Jinja2             |         |
  |               |   |  6. Uploads PDF to GCS bucket (8)                          |         |
  |               |   |  7. Generates signed URL (9) (1-hour TTL)                  |         |
  |               |   |  8. Stores {status, download_url} in Redis                 |         |
  |               |   |  9. Client polls task status → receives signed URL         |         |
  |               |   +------------------------------------------------------------+         |
  |               |                                                                           |
  |   +-----------------------+           +---------------------------+                       |
  |   |    VPC ACCESS         |           |  (FUTURE) CLOUD SQL       |                       |
  |   |    CONNECTOR          |           |                           |                       |
  |   |                       |           |  - Report records         |                       |
  |   |  - Internal Routing   |           |  - Audit trail            |                       |
  |   |  - Private IP Access  |           |  - (Step 032)             |                       |
  |   +-----------------------+           +---------------------------+                       |
  |                                                                                           |
  +===========================================================================================+
                                            |
                                            | (6) External APIs
                                            v
                                    [ USER BROWSER (Tesla OAuth) ]
                                    [ TESLA FLEET API ]
                                    [ STRIPE / PAYPAL ]

PDF Report Generation — Detailed Control Flow

  CLIENT (Browser)                 API (Cloud Run)              Cloud Tasks           GCS Bucket
  ================                 ===============              ===========           ==========
        |                                |                          |                     |
        | POST /tesla/vehicle-           |                          |                     |
        |   report/pdf/async             |                          |                     |
        |------------------------------->|                          |                     |
        |                                |                          |                     |
        |                                | enqueue_pdf_task()       |                     |
        |                                |  payload: task_id,       |                     |
        |                                |  report, lang, photos    |                     |
        |                                |------------------------->|                     |
        |                                |                          |                     |
        |  {"task_id":"...",             |                          |                     |
        |   "status":"queued"}           |                          |                     |
        |<-------------------------------|                          |                     |
        |                                |                          |                     |
        |                                |     HTTP POST callback   |                     |
        |                                |  /api/v1/worker/         |                     |
        |                                |    generate-pdf          |                     |
        |                                |<-------------------------|                     |
        |                                |                          |                     |
        |                                | Redis: task → processing |                     |
        |                                | WeasyPrint: render PDF   |                     |
        |                                |                          |                     |
        |                                | upload_pdf_to_gcs()      |                     |
        |                                |--------------------------------------->|       |
        |                                |                          |             |       |
        |                                | generate_signed_url()    |             |       |
        |                                |  (1-hour TTL, ADC auth)  |             |       |
        |                                |<---------------------------------------|       |
        |                                |                          |                     |
        |                                | Redis: task → complete   |                     |
        |                                |  + download_url          |                     |
        |                                |                          |                     |
        | GET /tesla/reports/            |                          |                     |
        |   status/{task_id}             |                          |                     |
        |------------------------------->|                          |                     |
        |                                |                          |                     |
        |  {"status":"complete",         |                          |                     |
        |   "download_url":"https://     |                          |                     |
        |    storage.googleapis.com/     |                          |                     |
        |    bnc-cpt-{env}-reports/..."} |                          |                     |
        |<-------------------------------|                          |                     |
        |                                                           |                     |
        | GET signed URL (direct)                                   |                     |
        |------------------------------------------------------------>------------------->|
        |                                                           |                     |
        | <-- PDF binary -----------------------------------------<-|---------------------|
        |                                                           |                     |

Synchronous Fallback

When CLOUD_TASKS_QUEUE is empty (local dev), the API falls back to synchronous PDF generation:
  - POST /tesla/vehicle-report/pdf renders the PDF inline and returns the binary as application/pdf
  - When REPORTS_BUCKET is also empty, PDFs are saved to the local storage/reports/ directory
  - This allows full local development without GCS or Cloud Tasks dependencies
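
The fallback rules above boil down to two independent checks on the environment. The following is a rough sketch of that decision only; select_pdf_modes and the returned labels are hypothetical names, and the real API may structure this differently.

```python
import os
from typing import Dict, Mapping


def select_pdf_modes(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Pick generation and storage modes from the two settings described above."""
    queue = env.get("CLOUD_TASKS_QUEUE", "").strip()
    bucket = env.get("REPORTS_BUCKET", "").strip()
    return {
        # Empty queue name: render inline instead of enqueueing a Cloud Task.
        "generation": "async" if queue else "sync",
        # Empty bucket name: write to local storage/reports/ instead of GCS.
        "storage": "gcs" if bucket else "local",
    }


# Local development: neither variable is set.
print(select_pdf_modes({}))
```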

Infrastructure Provisioning Steps (Report Storage)

| Step | Resource | Purpose |
|------|----------|---------|
| 016-gcp-reports-bucket | google_storage_bucket.reports | GCS bucket bnc-cpt-{env}-reports with 7-day lifecycle auto-delete |
| 016-gcp-reports-bucket | google_storage_bucket_iam_member.api_storage_admin | Grants roles/storage.objectAdmin to Cloud Run SA |
| 029-create-gcp-secrets | Secret containers | REPORTS_BUCKET, GCP_PROJECT, CLOUD_TASKS_QUEUE, API_URL |
| 030-gcp-cloud-run | Cloud Run service | Mounts secrets as env vars, connects to VPC for Redis |
| 031-gcp-cloud-tasks | google_cloud_tasks_queue | Queue pdf-generation with rate/retry config |
| 031-gcp-cloud-tasks | google_project_iam_member.cloud_tasks_enqueuer | Grants roles/cloudtasks.enqueuer to Cloud Run SA |

Control Flow (Infrastructure-as-Code)

  1. Config Source: bnc-cpt-cnf/*.env.yaml (Single Source of Truth)
  2. Template Generation: make do-generate-config-for-step (YAML -> JSON -> tfvars)
  3. Provisioning: make do-provision (wraps terraform apply)
    • Remote State: States stored in gs://bnc-cpt-{env}-tf-state
    • Cross-Step Dependencies: Step 030 (Cloud Run) fetches redis_url from Step 027 (Redis) via terraform_remote_state.
    • Report Storage Dependencies: Step 016 (GCS Bucket) must be provisioned before Step 030 (Cloud Run). Step 031 (Cloud Tasks) can be provisioned independently.

Uniform URL Pattern

All environments follow the pattern: https://{env}.api.carpulsetracker.com
  - prd: api.carpulsetracker.com (Primary) + prd.api.carpulsetracker.com (Alias)
  - dev: dev.api.carpulsetracker.com
  - tst: tst.api.carpulsetracker.com
  - inf: inf.api.carpulsetracker.com
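
For scripts that need these hostnames, the pattern can be expressed as a small helper. This is an illustrative sketch; the function name api_hosts is an assumption and is not part of the existing tooling.

```python
from typing import List

DOMAIN = "carpulsetracker.com"


def api_hosts(env: str) -> List[str]:
    """Return the API hostnames for an environment, primary first."""
    uniform = f"{env}.api.{DOMAIN}"
    if env == "prd":
        # Production keeps the short primary name plus the uniform alias.
        return [f"api.{DOMAIN}", uniform]
    return [uniform]


for env in ("prd", "dev", "tst", "inf"):
    print(env, api_hosts(env))
```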


HORIZONTAL SCALABILITY

1. Shared Storage: Google Cloud Storage (GCS)

To solve the "Instance A vs. Instance B" problem and ensure high availability of generated PDFs, the local filesystem is replaced with GCS.

Infrastructure (Terraform, Step 016):
  - Bucket: bnc-cpt-{env}-reports with 7-day lifecycle auto-delete
  - IAM: roles/storage.objectAdmin granted to Cloud Run SA
  - Uniform bucket-level access (no object ACLs)

Backend (Python, app/core/gcs.py):
  - upload_pdf_to_gcs(): memory-to-cloud upload via google-cloud-storage
  - generate_signed_url(): 1-hour TTL, ADC-based auth (no secret keys)
  - Graceful fallback to local disk when REPORTS_BUCKET is empty

2. Global Rate Limiting (Redis Backend)

Rate limits use Redis as the storage backend (slowapi + RedisStorage), making limits global across all Cloud Run instances. Without this, scaling to N instances would effectively multiply the rate limit by N.
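
The multiplication effect can be demonstrated with a toy fixed-window counter. This is a simplified model, not slowapi: FixedWindowLimiter is hypothetical, and a plain dictionary stands in for the counter backend (a shared dictionary plays the role of Redis).

```python
from typing import Dict, List


class FixedWindowLimiter:
    """Toy limiter; `store` stands in for the counter backend."""

    def __init__(self, limit: int, store: Dict[str, int]):
        self.limit = limit
        self.store = store

    def allow(self, client_id: str) -> bool:
        count = self.store.get(client_id, 0)
        if count >= self.limit:
            return False
        self.store[client_id] = count + 1
        return True


def accepted(limiters: List[FixedWindowLimiter], client_id: str, attempts: int) -> int:
    """Count accepted requests when every instance receives `attempts` requests."""
    return sum(
        limiter.allow(client_id) for limiter in limiters for _ in range(attempts)
    )


LIMIT, INSTANCES = 10, 3

# Per-instance counters: each instance enforces its own copy of the limit.
local = [FixedWindowLimiter(LIMIT, {}) for _ in range(INSTANCES)]
# Shared counter: all instances read and write the same store.
shared_store: Dict[str, int] = {}
shared = [FixedWindowLimiter(LIMIT, shared_store) for _ in range(INSTANCES)]

local_accepted = accepted(local, "client-a", 20)
shared_accepted = accepted(shared, "client-a", 20)
print(local_accepted, shared_accepted)  # 30 10
```

With three instances and a limit of 10, the per-instance setup accepts 30 requests from one client, while the shared store holds the total at 10.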

3. Persistent Data: Cloud SQL (PostgreSQL)

Redis handles transient data (sessions, task status). Cloud SQL is planned for long-term storage:
  - Payment records and report metadata (ReportRecord model)
  - Audit trail for GDPR compliance
  - Infrastructure: Step 026/032 (Terraform), db-f1-micro tier, private VPC networking

4. Asynchronous PDF Generation (Cloud Tasks)

PDF rendering is CPU-heavy. The async architecture decouples the request from generation:

  1. Client calls POST /tesla/vehicle-report/pdf/async
  2. API enqueues a Cloud Task and returns {"task_id": "...", "status": "queued"}
  3. Cloud Tasks dispatches HTTP callback to POST /api/v1/worker/generate-pdf
  4. Worker renders PDF (WeasyPrint + Jinja2), uploads to GCS, generates signed URL
  5. Client polls GET /tesla/reports/status/{task_id} until status: "complete"

Queue config (Step 031): 10 dispatches/sec, 5 concurrent, 3 retries (10–120s backoff), 5-min max duration.
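
The client side of this flow (polling until the task completes) could look roughly like the sketch below. It is an assumption-laden illustration: wait_for_report, the polling interval, and the timeout are hypothetical, and fetch_status stands in for an HTTP call to GET /tesla/reports/status/{task_id}. Failure statuses are not described on this page, so the sketch only handles completion and timeout.

```python
import time
from typing import Callable, Dict


def wait_for_report(
    fetch_status: Callable[[str], Dict[str, str]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the status endpoint until the task is complete; return the signed URL."""
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status.get("status") == "complete":
            return status["download_url"]
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} not complete after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


# Simulated responses in place of real HTTP calls.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "complete", "download_url": "https://example.invalid/report.pdf"},
])
url = wait_for_report(lambda _task_id: next(responses), "task-123", sleep=lambda _s: None)
print(url)
```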

5. Security & Identity (Zero-Key Auth)

6. Deployment Strategy

| Phase | Action | Status |
|-------|--------|--------|
| 1 | Provision GCS buckets via Terraform (Step 016) | Done |
| 2 | Update code to use GCS for storage + Redis for rate limiting | Done |
| 3 | Deploy to dev for E2E validation | Done |
| 4 | Introduce Cloud SQL + Alembic migrations (Step 026/032) | Planned |
| 5 | Cutover to async PDF generation (Cloud Tasks, Step 031) | Done |

PERFORMANCE & LATENCY

Current Status

Gap Analysis

Mitigations (Implemented)

| Mitigation | Status | Details |
|------------|--------|---------|
| Skeleton loaders | Done | Vue skeleton states in ReportDashboard.vue for perceived performance |
| Multi-stage Docker | Done | Dockerfile split into runtime-base / build-base / production stages; build tools excluded from final image |
| Async PDF generation | Done | Cloud Tasks decouples request from rendering; UI shows progress bar instead of static spinner |
| Signed URL delivery | Done | PDF binary served directly from GCS edge, not proxied through Cloud Run |

Remaining (Planned)

| Mitigation | Details |
|------------|---------|
| Cloud CDN | Enable edge caching on the Load Balancer for static assets (Vue.js, images, CSS) |
| Multi-region deployment | gcp_regions variable added to Terraform; deploy Cloud Run to EU + NA for latency reduction |
| Min instances | Set Cloud Run min_instances = 1 in production to eliminate cold starts for the first request |

Detailed Specifications (Planned Items)

Skeleton Loaders (Done)

Cloud CDN

Cold Start Optimization (Done)

Multi-Region API


SECURITY & DATA PRIVACY

Security Status: 7/10

Token Protection: Tesla Refresh Tokens

Tesla tokens are encrypted at rest in Redis using Fernet (AES-128-CBC + HMAC):
  - Encryption key: TOKEN_ENCRYPTION_KEY loaded from Secret Manager via settings
  - Implementation: app/core/encryption.py, encrypt_token() / decrypt_token(), with graceful no-op fallback when the key is not configured
  - Backward compat: InvalidToken exception returns the raw value (handles pre-encryption tokens during migration)
  - Tokens are encrypted before _serialize_session() and decrypted after _deserialize_session()

OWASP Compliance

| Category | Status | Details |
|----------|--------|---------|
| SQL Injection | N/A (Redis) | When Cloud SQL is added: SQLAlchemy ORM only, no f-string SQL |
| XSS | Mitigated | Vue 3 auto-escaping + CSP header |
| CSRF | Mitigated | JWT Bearer tokens (not cookies) |
| Security Headers | Done | HSTS, CSP, Referrer-Policy, Permissions-Policy in app/main.py middleware |

GDPR Compliance

| Requirement | Status | Implementation |
|-------------|--------|----------------|
| Transient data auto-delete | Done | Redis TTL 30 min for sessions |
| PDF report auto-delete | Done | GCS lifecycle policy: 7-day delete (Step 016) |
| localStorage auto-clear | Done | 60s interval timer in App.vue, clears on session expiry |
| sessionStorage fallback | Done | sessionId prefers sessionStorage (dies with browser close) |
| "Delete My Data" button | Done | Session-gated in Header dropdown, 9 locales |
| VIN masking in logs | Done | app/core/privacy.py: mask_vin() shows first 5 chars + **** |
| localStorage encryption | Planned | Per-session derived key (Linear issue created) |
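
The VIN masking rule in the table (first 5 characters followed by ****) can be illustrated as follows. This is a guess at the shape of mask_vin() based only on that description; the real signature in app/core/privacy.py may differ, and the VIN below is a made-up placeholder.

```python
def mask_vin(vin: str, visible: int = 5) -> str:
    """Keep the first `visible` characters of a VIN and mask the rest for logs."""
    return vin[:visible] + "****"


print(mask_vin("5YJ3E1EA7KF000000"))  # 5YJ3E****
```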

Secret Management

Implementation Roadmap

Phase 1 - Immediate Enhancements (all done):
  - [x] Fernet encryption for Tesla tokens (app/core/encryption.py)
  - [x] Security headers middleware (app/main.py)
  - [x] sessionStorage fallback for sessionId (stores/app.ts)
  - [x] VIN masking utility + retrofit (app/core/privacy.py)
  - [x] Auto-clear timer for expired sessions (App.vue)

Phase 2 - GDPR & Privacy:
  - [x] GCS bucket lifecycle policies (7-day delete)
  - [x] "Delete My Data" button (Header dropdown, session-gated, 9 locales)
  - [ ] localStorage encryption (per-session derived key); Linear issue created

Phase 3 - Monitoring & Audit:
  - [ ] Cloud Audit Logs for Secret Manager and Cloud Storage; Linear issue created
  - [ ] Alerts for failed login attempts and rate limit triggers


RELIABILITY & FAULT TOLERANCE

Current Status

Retry Strategy (Exponential Backoff)

All outbound Tesla API calls use the @retry_transient decorator (app/core/resilience.py):
  - Retryable errors: HTTP 429, 502, 503, 504 + httpx.ConnectError / httpx.TimeoutException
  - Strategy: Exponential backoff with random jitter (prevents thundering herd)
  - Schedule: 500ms → 2s → 8s (3 attempts, max 10s delay)
  - Applied to _fleet_api_get_raw(), which wraps all Tesla Fleet API GET requests
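
A decorator along these lines might look like the sketch below. It is not the code in app/core/resilience.py: TransientError is a hypothetical stand-in for the retryable HTTP and httpx errors, the schedule is read as three waits (one before each retry), and the jitter formula is an assumption.

```python
import random
import time
from functools import wraps


class TransientError(Exception):
    """Hypothetical stand-in for HTTP 429/502/503/504 and httpx connect/timeout errors."""


def backoff_delays(base=0.5, factor=4.0, cap=10.0, count=3):
    """Return the un-jittered waits (0.5s, 2s, 8s by default), each capped at `cap`."""
    return [min(base * factor ** i, cap) for i in range(count)]


def retry_transient(retryable=(TransientError,), retries=3,
                    sleep=time.sleep, rng=random.random):
    """Retry `retryable` errors with exponential backoff and random jitter."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delays = backoff_delays(count=retries)
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except retryable:
                    if attempt == retries:
                        raise  # retries exhausted, surface the error
                    # Jitter (assumed form): wait 50-100% of the scheduled delay.
                    sleep(delays[attempt] * (0.5 + rng() / 2))
        return wrapper
    return decorator


# Demo with a fake sleep and fixed jitter so the run is instant and repeatable.
calls = {"n": 0}
slept = []


@retry_transient(sleep=slept.append, rng=lambda: 1.0)
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("503")
    return "ok"


result = flaky()
print(result, calls["n"], slept)
```

Injecting sleep and rng keeps the decorator testable without real delays; production code would leave both at their defaults.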

Circuit Breaker

Singleton tesla_circuit in app/core/resilience.py protects against cascading failures:

| Parameter | Value | Description |
|-----------|-------|-------------|
| failure_threshold | 50% | Open circuit when 50% of requests fail |
| window_seconds | 300 | 5-minute sliding window |
| min_requests | 5 | Minimum requests before evaluating threshold |
| recovery_timeout | 300 | 5 minutes before attempting half-open probe |

States: CLOSED (normal) → OPEN (fail fast with 503) → HALF_OPEN (probe one request) → CLOSED

Health endpoint: GET /api/v1/tesla/health returns circuit breaker state for frontend degradation checks.
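
The state machine and parameters above can be sketched as a small class. This is an illustrative, single-threaded model, not the tesla_circuit implementation; the method names allow_request and record and the injectable clock are assumptions.

```python
import time
from collections import deque


class CircuitBreaker:
    """Sliding-window circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED."""

    def __init__(self, failure_threshold=0.5, window_seconds=300.0,
                 min_requests=5, recovery_timeout=300.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_seconds = window_seconds
        self.min_requests = min_requests
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.state = "CLOSED"
        self._events = deque()  # (timestamp, failed) pairs inside the window
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "CLOSED":
            return True
        if (self.state == "OPEN"
                and self.clock() - self._opened_at >= self.recovery_timeout):
            self.state = "HALF_OPEN"
            return True  # let a single probe request through
        return False  # OPEN (fail fast) or a probe is already in flight

    def record(self, success: bool) -> None:
        now = self.clock()
        if self.state == "HALF_OPEN":
            if success:
                self.state = "CLOSED"
                self._events.clear()
            else:
                self._trip(now)
            return
        self._events.append((now, not success))
        # Drop events that fell out of the sliding window.
        while self._events and now - self._events[0][0] > self.window_seconds:
            self._events.popleft()
        failures = sum(failed for _, failed in self._events)
        if (len(self._events) >= self.min_requests
                and failures / len(self._events) >= self.failure_threshold):
            self._trip(now)

    def _trip(self, now: float) -> None:
        self.state = "OPEN"
        self._opened_at = now


# Demo with a fake clock: 5 requests, 3 failures (60%) opens the circuit.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for ok in (True, True, False, False, False):
    if breaker.allow_request():
        breaker.record(ok)
print(breaker.state)
```

A caller would check allow_request() before each Tesla API call, answer 503 immediately when it returns False, and report the outcome with record().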

Graceful UI Degradation (Planned)

| Feature | Status | Details |
|---------|--------|---------|
| System maintenance alert | Planned | Show user-friendly message on Tesla API 503 |
| Cache persistence | Existing | vehicleReport in Pinia store remains readable offline |
| Disable generation | Planned | Disable "Generate New Report" while maintaining "Download Existing PDF" |

Multi-Region Disaster Recovery (Planned)

| Feature | Status | Details |
|---------|--------|---------|
| Global Load Balancer | Planned | Backend services in europe-north1 + us-central1 |
| Health-check failover | Planned | Automatic routing to healthy region on EU outage |
| Multi-region Redis | Planned | VPC Peering for cross-region access or region-local caching |
| Terraform gcp_regions | Done | Variable added, supports europe-north1, us-central1, asia-east1 |

Implementation Progress