07.05. Redis Scalability Security

1. Executive Summary

The current Redis implementation provides a solid foundation for session management and transient state. However, it is exposed to critical risks under high load and in production for three reasons: it runs on a single instance, traffic is not encrypted in transit, and connection handling is rudimentary.


2. Infrastructure Analysis (GCP Memorystore)

2.1 Weakness: BASIC Tier (No High Availability)

2.2 Weakness: Lack of Transit Encryption (TLS)

2.3 Weakness: Authentication (Auth String)


3. End-to-End (E2E) Application Analysis

3.1 Robustness: Connection Pooling & Retries

3.2 Security: Application-Level Encryption

3.3 Scalability: Memory Management (TTL)


4. Concrete Specifications for Fixes

4.1 Infrastructure (Terraform) - Step 027

Spec: Upgrade to Standard HA and Enable Security
1. Upgrade Tier: Change tier from BASIC to STANDARD_HA.
2. Enable Auth: Set auth_enabled = true and retrieve the auth_string via a data source or output to store in Secret Manager.
3. Enable TLS: Set transit_encryption_mode = "SERVER_AUTHENTICATION".
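The three settings above lend themselves to an automated check in CI. The following is a minimal sketch, assuming the instance attributes have already been extracted into a plain dictionary (for example from a JSON rendering of the Terraform plan; the extraction step is not shown). The names REQUIRED and find_violations are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Required instance attributes, taken directly from the spec above.
REQUIRED = {
    "tier": "STANDARD_HA",
    "auth_enabled": True,
    "transit_encryption_mode": "SERVER_AUTHENTICATION",
}


def find_violations(instance: Dict[str, object]) -> List[str]:
    """Return one message per attribute that deviates from the spec.

    `instance` is assumed to hold the planned attributes of the Redis
    instance; a missing attribute is reported as a violation.
    """
    violations = []
    for name, expected in REQUIRED.items():
        actual = instance.get(name)
        if actual != expected:
            violations.append(f"{name}: expected {expected!r}, got {actual!r}")
    return violations


if __name__ == "__main__":
    # An instance that is still on the BASIC tier with no auth or TLS set.
    for message in find_violations({"tier": "BASIC"}):
        print(message)
```

A pipeline step could fail the build whenever the returned list is non-empty, which keeps the BASIC tier from being reintroduced by a later change.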

4.2 Application (Python) - app/core/redis.py

Spec: Robust Connection Management
1. Tuned Connection Pool:

self._client = redis.from_url(
    settings.REDIS_URL,
    decode_responses=True,
    max_connections=20, # Adjust based on Cloud Run concurrency
    socket_timeout=5.0,
    socket_keepalive=True,
    retry_on_timeout=True
)
2. TLS Support: Update REDIS_URL to use the rediss:// scheme (note the double 's') and configure the SSL context to verify the server certificate against the CA certificate issued for the Memorystore instance.
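One way to assemble the TLS settings is sketched below. It assumes the AUTH string and the path to the downloaded server CA file are supplied by the caller; build_redis_url and tls_options are hypothetical helpers, and the host, port, and file path in the usage note are placeholders. The keyword names ssl_cert_reqs and ssl_ca_certs are the ones redis-py accepts for TLS connections.

```python
from typing import Dict
from urllib.parse import quote


def build_redis_url(host: str, port: int, auth_string: str, db: int = 0) -> str:
    """Build a rediss:// URL with an empty username and the AUTH string
    as password.

    The AUTH string is percent-encoded so that characters such as '@'
    or '/' cannot break URL parsing.
    """
    return f"rediss://:{quote(auth_string, safe='')}@{host}:{port}/{db}"


def tls_options(ca_cert_path: str) -> Dict[str, str]:
    """Keyword arguments for redis.from_url() that require the server
    certificate to validate against the given CA file."""
    return {
        "ssl_cert_reqs": "required",
        "ssl_ca_certs": ca_cert_path,
    }


if __name__ == "__main__":
    # Placeholder values; the real port and AUTH string come from the instance.
    print(build_redis_url("10.0.0.3", 6378, "p@ss/word"))
    print(tls_options("/etc/redis/server-ca.pem"))
```

The results would then be passed alongside the existing pool settings, for example redis.from_url(url, **tls_options(path), decode_responses=True, max_connections=20). Keeping certificate verification set to "required" matters here: a rediss:// URL with verification disabled encrypts traffic but does not authenticate the server.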

4.3 Security - Key Rotation

Spec: Envelope Encryption
1. Move from a static TOKEN_ENCRYPTION_KEY to GCP KMS (Key Management Service).
2. Use the Service Account identity to "Wrap/Unwrap" session keys, ensuring the actual master key never leaves GCP's Hardware Security Modules (HSM).
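The wrap/unwrap flow can be sketched independently of the KMS client. In the sketch below, KeyWrapper is a hypothetical interface standing in for the Cloud KMS encrypt and decrypt calls, and FakeWrapper is a test double only: it uses XOR, which is not encryption and must never be used outside tests. All names are assumptions for illustration.

```python
import secrets
from dataclasses import dataclass
from typing import Protocol, Tuple


class KeyWrapper(Protocol):
    """Hypothetical interface; in production this would delegate to KMS."""

    def wrap(self, plaintext_key: bytes) -> bytes: ...

    def unwrap(self, wrapped_key: bytes) -> bytes: ...


@dataclass(frozen=True)
class WrappedSessionKey:
    key_version: str  # which master key version wrapped it (aids rotation)
    wrapped: bytes    # safe to persist; unusable without access to the wrapper


def new_session_key(wrapper: KeyWrapper, key_version: str) -> Tuple[bytes, WrappedSessionKey]:
    """Generate a random 256-bit session key and return it with its wrapped form.

    Only the wrapped form should be stored; the plaintext key stays in memory.
    """
    session_key = secrets.token_bytes(32)
    return session_key, WrappedSessionKey(key_version, wrapper.wrap(session_key))


def load_session_key(wrapper: KeyWrapper, record: WrappedSessionKey) -> bytes:
    """Recover the plaintext session key from its stored, wrapped form."""
    return wrapper.unwrap(record.wrapped)


class FakeWrapper:
    """TEST DOUBLE ONLY: XOR with a fixed byte is NOT encryption."""

    def __init__(self, mask: int = 0x5A) -> None:
        self._mask = mask

    def wrap(self, plaintext_key: bytes) -> bytes:
        return bytes(b ^ self._mask for b in plaintext_key)

    def unwrap(self, wrapped_key: bytes) -> bytes:
        return bytes(b ^ self._mask for b in wrapped_key)


if __name__ == "__main__":
    key, record = new_session_key(FakeWrapper(), "v1")
    assert load_session_key(FakeWrapper(), record) == key
```

Storing the key version next to the wrapped key is a design choice made here so that records wrapped under an older master key version can be identified and re-wrapped during rotation.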

4.4 Monitoring

Spec: Redis Health Dashboard
1. Add alerting for redis.googleapis.com/stats/memory/usage_ratio > 0.8.
2. Monitor redis.googleapis.com/network/instantaneous_ops_per_sec to detect thundering-herd issues during traffic surges.
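The memory alert condition can be illustrated with a small evaluation function. This is a sketch only: the 0.8 threshold comes from the spec above, while the requirement that the breach persist for several consecutive samples is an added assumption intended to avoid paging on a single spike. The name sustained_breach and the default of three samples are hypothetical.

```python
from typing import Iterable


def sustained_breach(
    samples: Iterable[float],
    threshold: float = 0.8,
    min_consecutive: int = 3,  # assumption: not part of the original spec
) -> bool:
    """Return True if `min_consecutive` samples in a row exceed `threshold`.

    `samples` is assumed to be a time-ordered series of memory usage
    ratios in the range 0.0 to 1.0.
    """
    run = 0
    for value in samples:
        # Extend the current run on a breach, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    print(sustained_breach([0.70, 0.85, 0.90, 0.95]))  # three breaches in a row
    print(sustained_breach([0.85, 0.70, 0.90, 0.85]))  # run interrupted by 0.70
```

In Cloud Monitoring the same idea is normally expressed through the alert policy's duration setting rather than in application code; the function above only makes the intended condition explicit.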