# MadBase Caching Strategy

## Overview

MadBase implements a **two-tier caching architecture** that maintains the simplicity of the 4-pillar system while providing enterprise-grade caching capabilities.

## Architecture

### Tier 1: L1 Cache (In-Memory)

- **Technology**: moka (Rust)
- **Location**: Proxy / Worker nodes
- **Purpose**: Ultra-low latency for frequently accessed data
- **Typical Use Cases**:
  - Project configurations
  - JWT validation cache
  - Hot database query results
  - API response caching

### Tier 2: L2 Cache (Redis)

- **Technology**: Redis 7
- **Location**: State Pillar (Pillar 3)
- **Purpose**: Shared state across the entire cluster
- **Typical Use Cases**:
  - Distributed session storage
  - Realtime presence tracking
  - Rate limiting counters
  - Distributed locking
  - Pub/Sub messaging

## State Pillar Integration

The **State Pillar** (formerly "Database Pillar") now hosts both PostgreSQL and Redis:

```
┌─────────────────────────────────────┐
│          State Pillar Node          │
├─────────────────────────────────────┤
│  ┌──────────┐      ┌─────────────┐  │
│  │PostgreSQL│      │    Redis    │  │
│  │  :5432   │      │    :6379    │  │
│  └──────────┘      └─────────────┘  │
│        │                  │         │
│        └─────────┬────────┘         │
│                  ▼                  │
│           ┌─────────────┐           │
│           │   HAProxy   │           │
│           │ :5433/:6379 │           │
│           └─────────────┘           │
└─────────────────────────────────────┘
```

### Why This Approach?

1. **Resource Symmetry**: Both PostgreSQL and Redis are memory-intensive and share the same VPS requirements
2. **HA Piggybacking**: Pillar 3 already manages HA via Patroni and etcd. Redis benefits from the same infrastructure
3. **Centralized State**: Maintains clean separation of Compute (Worker/Proxy) vs. State (DB/Redis)
4. **Zero Complexity**: No new pillar needed, just enhanced the existing one

## Features

### 1. Shared Auth Sessions

Users can now stay logged in even if the Proxy node handling their request changes:

```rust
use auth::SessionManager;

// Create a session
let session_token = session_manager
    .create_session(user_id, email, "authenticated".to_string())
    .await?;

// Validate on any proxy node
let session = session_manager
    .validate_session(&session_token)
    .await?;
```

### 2. Realtime Presence

Track "who is online" across multiple Worker nodes:

```rust
use realtime::PresenceManager;

// User joins a channel
presence_manager
    .join_channel(user_id, "public-chat".to_string(), None)
    .await?;

// Get online count
let count = presence_manager
    .get_channel_online_count("public-chat".to_string())
    .await?;
```

### 3. Distributed Locking

Prevent race conditions during background operations:

```rust
use common::DistributedLock;

let lock = DistributedLock::new(
    redis_client,
    "migration:lock".to_string(),
    30, // 30 seconds TTL
);

if lock.acquire().await? {
    // Perform critical section
    lock.release().await?;
}
```

### 4. Rate Limiting

Distributed rate limiting across all instances:

```rust
use gateway::rate_limit::RateLimitMiddleware;

// Check IP-based rate limit
if !middleware.check_ip(&user_ip).await? {
    return Err("Rate limit exceeded");
}
```
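The four features above all sit on top of the two-tier cache described in the Architecture section. Below is a minimal sketch of how a read-through lookup might combine the tiers, assuming moka 0.12's async cache and the redis crate (0.25, `tokio-comp` feature); the key format and the `load_from_db` stub are illustrative, not MadBase internals:

```rust
// Two-tier read-through lookup: L1 (moka, in-process) -> L2 (Redis) -> database.
// A minimal sketch, assuming moka 0.12 (async cache) and redis 0.25 with the
// "tokio-comp" feature; key naming and `load_from_db` are illustrative only.
use std::time::Duration;

use moka::future::Cache;
use redis::AsyncCommands;

async fn get_project_config(
    l1: &Cache<String, String>,
    l2: &mut redis::aio::MultiplexedConnection,
    project_id: &str,
) -> redis::RedisResult<String> {
    let key = format!("project:config:{project_id}");

    // 1. L1: in-process cache, sub-microsecond hit path.
    if let Some(value) = l1.get(&key).await {
        return Ok(value);
    }

    // 2. L2: shared Redis cache on the State Pillar.
    let cached: Option<String> = l2.get(&key).await?;
    if let Some(value) = cached {
        l1.insert(key, value.clone()).await; // backfill L1 for the next request
        return Ok(value);
    }

    // 3. Miss: fall through to the database (stubbed here), then populate both tiers.
    let value = load_from_db(project_id).await;
    let _: () = l2.set_ex(&key, &value, 3600).await?; // matches CACHE_TTL_SECONDS
    l1.insert(key, value.clone()).await;
    Ok(value)
}

// Stand-in for the PostgreSQL query that would hit the State Pillar.
async fn load_from_db(project_id: &str) -> String {
    format!("{{\"project\":\"{project_id}\"}}")
}

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    // L1: bounded, short-lived, local to this Proxy/Worker process.
    let l1: Cache<String, String> = Cache::builder()
        .max_capacity(10_000)
        .time_to_live(Duration::from_secs(60))
        .build();

    // L2: shared Redis on the State Pillar (REDIS_URL in production).
    let client = redis::Client::open("redis://db:6379/0")?;
    let mut l2 = client.get_multiplexed_async_connection().await?;

    let config = get_project_config(&l1, &mut l2, "demo").await?;
    println!("config: {config}");
    Ok(())
}
```

Backfilling L1 on an L2 hit is what keeps hot keys local to each Proxy/Worker node and drives the 95%+ L1 hit ratio targeted in the Performance section below.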
## Configuration

### Environment Variables

```bash
# PostgreSQL
DATABASE_URL="postgres://user:pass@db:5432/madbase"

# Redis (optional - will fall back to L1-only caching if unset)
REDIS_URL="redis://db:6379/0"

# Cache TTL
CACHE_TTL_SECONDS=3600
```

### Cache Keyspaces

| Pattern | Purpose | TTL |
|---------|---------|-----|
| `session:{token}` | User sessions | 3600s |
| `presence:channel:{name}:user:{id}` | User presence | 60s |
| `ratelimit:ip:{addr}` | IP rate limiting | 60s |
| `ratelimit:user:{id}` | User rate limiting | 60s |
| `lock:{name}` | Distributed locks | Configurable |

## HAProxy Configuration

The State Pillar's HAProxy routes both PostgreSQL and Redis traffic:

```haproxy
listen primary
    bind *:5433
    mode tcp
    server patroni1 patroni:5432 check

listen redis
    bind *:6379
    mode tcp
    server redis1 redis:6379 check
```

## Scaling Strategy

### Horizontal Scaling

- **Proxy Nodes**: Add more proxies; all share the same Redis cache
- **Worker Nodes**: Add more workers; presence tracking works seamlessly
- **State Nodes**: Scale to 3 or 5 nodes for HA; Redis is replicated via Sentinel/Cluster

### Vertical Scaling

- Upgrade the State Node plan for more RAM (benefits both PostgreSQL and Redis)
- Typical path: CX21 (8GB) → CX31 (16GB) → CX41 (32GB)

## Monitoring

Redis is monitored alongside PostgreSQL:

- **HAProxy Stats**: http://db-node:7000
- **Grafana Dashboard**: "State Pillar Performance"
- **Metrics**:
  - Redis memory usage
  - Cache hit/miss ratios
  - Connection pool utilization
  - Rate limit enforcement

## Best Practices

1. **Session Management**: Use appropriate TTLs (shorter for sensitive data)
2. **Presence Tracking**: Implement heartbeats to keep users "online"
3. **Rate Limiting**: Use different limits for different user tiers
4. **Distributed Locks**: Always set reasonable TTLs to prevent deadlocks
5. **Cache Invalidation**: Use versioned keys or explicit deletion (see the sketch at the end of this document)

## Migration Guide

### From Single-Node to Cluster

1. Update the State Pillar image to include Redis
2. Set `REDIS_URL` in all Proxy/Worker configurations
3. Deploy `SessionManager` in Auth handlers
4. Enable presence tracking in the Realtime module
5. Update rate limiting to use distributed counters

### Testing

```bash
# Test Redis connection
redis-cli -h db-node ping

# Test session creation
curl -X POST http://localhost:8000/auth/v1/token \
  -d '{"email":"test@example.com","password":"password"}'

# Check presence
redis-cli -h db-node SMEMBERS "presence:channel:public:users"
```

## Performance

### Expected Latency

| Operation | L1 Cache (moka) | L2 Cache (Redis) | Database |
|-----------|-----------------|------------------|----------|
| Get | <1μs | 1-2ms | 10-50ms |
| Set | <1μs | 1-2ms | 10-50ms |
| Delete | <1μs | 1-2ms | 10-50ms |

### Cache Hit Ratios

- **L1 Hit**: 95%+ for frequently accessed data
- **L2 Hit**: 80%+ for shared state
- **Miss**: Falls through to the database

## Future Enhancements

- [ ] Redis Cluster for horizontal scaling
- [ ] Pub/Sub for real-time events
- [ ] Bloom filters for existence checks
- [ ] HyperLogLog for cardinality estimation
- [ ] Geospatial indexing for location features
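As a closing sketch for Best Practice 5 above (versioned keys): bumping a per-resource version counter makes every previously cached key unreachable, so stale entries simply age out via their TTL. The `cache_version:project:{id}` key and the helper names below are hypothetical, not existing MadBase APIs, and the redis crate's async API is assumed:

```rust
// Sketch of versioned-key cache invalidation (Best Practice 5).
// Keys carry a per-resource version number; bumping the version invalidates
// every stale entry without scanning or deleting keys. Names such as
// `project_cache_key` and the key layout are illustrative, not MadBase APIs.
use redis::AsyncCommands;

/// Build the current cache key for a project, e.g. "project:42:v7:config".
async fn project_cache_key(
    con: &mut redis::aio::MultiplexedConnection,
    project_id: &str,
) -> redis::RedisResult<String> {
    // The version counter defaults to 0 when the key does not exist yet.
    let version: Option<u64> = con
        .get(format!("cache_version:project:{project_id}"))
        .await?;
    Ok(format!("project:{project_id}:v{}:config", version.unwrap_or(0)))
}

/// Invalidate all cached entries for a project by bumping its version.
/// Old keys are never touched; they expire via their own TTLs.
async fn invalidate_project(
    con: &mut redis::aio::MultiplexedConnection,
    project_id: &str,
) -> redis::RedisResult<u64> {
    con.incr(format!("cache_version:project:{project_id}"), 1).await
}

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://db:6379/0")?;
    let mut con = client.get_multiplexed_async_connection().await?;

    // Write under the current versioned key.
    let key = project_cache_key(&mut con, "42").await?;
    let _: () = con.set_ex(&key, "{\"name\":\"demo\"}", 3600).await?;

    // Invalidate: readers now derive a new key and miss the old entry.
    let new_version = invalidate_project(&mut con, "42").await?;
    println!("project 42 cache version bumped to {new_version}");
    Ok(())
}
```

Compared with explicit deletion, version bumping never needs to enumerate keys (no SCAN/DEL round trips), at the cost of stale entries lingering in Redis until their TTLs expire.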