madbase/_milestones/M0_security_hardening.md

# Milestone 0: Security Hardening (CRITICAL)

**Goal:** Eliminate every exploitable vulnerability before any deployment or beta.

**Depends on:** Nothing — this is the first milestone.

**Blocks:** Everything else. Do not proceed to M1+ until every task here is complete.

---

## 0.1 — Secrets & Credential Hygiene

### 0.1.1 Remove all secret logging

**Files to audit:**

| File | Line | What's logged | Severity |
|------|------|---------------|----------|
| `auth/src/middleware.rs` | 46 | `tracing::info!("Using project-specific JWT secret: '{}'", ctx.jwt_secret)` | CRITICAL |
| `auth/src/middleware.rs` | 49 | `tracing::warn!("...Using global JWT secret: '{}'", state.config.jwt_secret)` | CRITICAL |
| `gateway/src/middleware.rs` | 139 | `tracing::info!("Injecting tenant pool for project={} db_url={}", ...)` — logs full DB URL with password | CRITICAL |
| `auth/src/handlers.rs` | 81-84 | `tracing::info!("Sending confirmation email to {}: token={}", ...)` — logs confirmation token | HIGH |
| `auth/src/handlers.rs` | 297-300 | `tracing::info!("Sending recovery email to {}: token={}", ...)` — logs recovery token | HIGH |

**How to fix:** Replace each with a sanitized log that omits the secret value:

```rust
// BEFORE (auth/src/middleware.rs:46)
tracing::info!("Using project-specific JWT secret: '{}'", ctx.jwt_secret);

// AFTER
tracing::debug!("Using project-specific JWT secret for project");
```

For DB URLs, log only the host/database, never the password:

```rust
// BEFORE (gateway/src/middleware.rs:139)
tracing::info!("Injecting tenant pool for project={} db_url={}", project_ctx.project_ref, db_url);

// AFTER
tracing::info!(project = %project_ctx.project_ref, "Injecting tenant pool");
```

**Audit procedure:** Run `rg 'jwt_secret|db_url|password|token=' --type rust -n` and review every hit. Replace INFO/WARN-level secret logs with DEBUG-level sanitized versions or remove entirely.

### 0.1.2 Make JWT_SECRET required

**File:** `common/src/config.rs` line 31

```rust
// BEFORE
let jwt_secret = env::var("JWT_SECRET")
    .unwrap_or_else(|_| "super-secret-key-please-change".to_string());

// AFTER
let jwt_secret = env::var("JWT_SECRET")
    .expect("JWT_SECRET must be set. Generate one with: openssl rand -hex 32");
```

Also enforce minimum length:

```rust
if jwt_secret.len() < 32 {
    panic!("JWT_SECRET must be at least 32 characters");
}
```

### 0.1.3 Make ADMIN_PASSWORD required and enforce strength

**File:** `control_plane/src/lib.rs` line 335

```rust
// BEFORE
let admin_password = std::env::var("ADMIN_PASSWORD").unwrap_or_else(|_| "admin".to_string());

// AFTER
let admin_password = std::env::var("ADMIN_PASSWORD")
    .expect("ADMIN_PASSWORD must be set");
```

### 0.1.4 Remove hardcoded fallback S3 credentials

**File:** `storage/src/backend.rs` lines 29-34

```rust
// BEFORE
let access_key = env::var("S3_ACCESS_KEY")
    .or_else(|_| env::var("MINIO_ROOT_USER"))
    .unwrap_or_else(|_| "minioadmin".to_string());

// AFTER
let access_key = env::var("S3_ACCESS_KEY")
    .or_else(|_| env::var("MINIO_ROOT_USER"))
    .expect("S3_ACCESS_KEY or MINIO_ROOT_USER must be set");
```

Apply the same to `S3_SECRET_KEY` / `MINIO_ROOT_PASSWORD`.

### 0.1.5 Remove Serialize derive from Config

**File:** `common/src/config.rs` line 4

```rust
// BEFORE
#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Config {

// AFTER
#[derive(Clone, Debug, Deserialize)]
pub struct Config {
```

Remove `use serde::Serialize` if no longer needed elsewhere in the module.

---

## 0.2 — Authentication & Authorization Fixes

### 0.2.1 Fix admin auth middleware

**File:** `gateway/src/admin_auth.rs` lines 27-33

**Current broken logic:**
```rust
let has_session = req.headers()
    .get(axum::http::header::COOKIE)
    .and_then(|h| h.to_str().ok())
    .map(|s| s.contains("madbase_admin_session"))  // Only checks name exists!
    .unwrap_or(false)
    || req.headers().contains_key("x-admin-token");  // Any value!
```

**Replacement approach:** Use HMAC-signed session tokens. The admin login endpoint should:
1. Verify the password (hashed with Argon2)
2. Generate a random session ID
3. Store it in Redis with a TTL (e.g., 24h)
4. Set an `HttpOnly`, `SameSite=Strict`, `Secure` cookie with the session ID
5. The middleware reads the cookie, looks up the session in Redis, rejects if missing/expired

**Implementation sketch:**

```rust
pub async fn admin_auth_middleware(
    State(state): State<AdminAuthState>,
    req: Request,
    next: Next,
) -> Result<Response, StatusCode> {
    let path = req.uri().path();
    if path == "/dashboard" || path == "/platform/v1/login" {
        return Ok(next.run(req).await);
    }
    if !path.starts_with("/platform/v1") {
        return Ok(next.run(req).await);
    }

    // Extract session token from cookie
    let session_token = req.headers()
        .get(axum::http::header::COOKIE)
        .and_then(|h| h.to_str().ok())
        .and_then(|cookies| {
            cookies.split(';')
                .find_map(|c| {
                    let c = c.trim();
                    c.strip_prefix("madbase_admin_session=")
                })
        });

    // Also check X-Admin-Token header
    let token = session_token
        .or_else(|| req.headers()
            .get("x-admin-token")
            .and_then(|v| v.to_str().ok()));

    let token = token.ok_or(StatusCode::UNAUTHORIZED)?;

    // Validate against session store
    let valid = state.session_store.validate(token).await
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    if !valid {
        return Err(StatusCode::UNAUTHORIZED);
    }

    Ok(next.run(req).await)
}
```

**New struct needed:** `AdminAuthState` with a Redis-backed session store, or a shared `CacheLayer`.

### 0.2.2 Hash admin password

**File:** `control_plane/src/lib.rs` — `login` function (line 330+)

Use Argon2 to hash on first startup and verify on login:

```rust
pub async fn login(
    State(state): State<ControlPlaneState>,
    Json(payload): Json<LoginRequest>,
) -> Result<(CookieJar, StatusCode), (StatusCode, String)> {
    let admin_password_hash = std::env::var("ADMIN_PASSWORD_HASH")
        .expect("ADMIN_PASSWORD_HASH must be set");

    let valid = auth::utils::verify_password(&payload.password, &admin_password_hash)
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    if !valid {
        return Err((StatusCode::UNAUTHORIZED, "Invalid password".to_string()));
    }

    // Generate session...
}
```

Provide a CLI helper or startup script to generate the hash: `cargo run --bin hash_password -- "my-password"`.

### 0.2.3 Add auth to control-plane-api

**File:** `control-plane-api/src/lib.rs`

Add an API key middleware that reads `X-Api-Key` header and validates against `CONTROL_PLANE_API_KEY` env var. Apply to all routes except `/health`.

### 0.2.4 Add auth to function deploy/invoke

**File:** `functions/src/handlers.rs`

The function routes are nested under the auth middleware in `gateway/src/worker.rs`, so JWT auth should already apply. Verify this is actually enforced — currently the `/functions/v1/:name` POST route may be accessible with just an anon key. Deploy should require `service_role` or admin, invoke should require at least `authenticated`.

### 0.2.5 Add authorization to WebSocket subscriptions

**File:** `realtime/src/ws.rs` (or wherever WS join is handled)

On channel join, validate that the user's JWT role has SELECT permission on the requested table. Query `information_schema.role_table_grants` or attempt a `SELECT 1 FROM <table> LIMIT 0` within an RLS-scoped transaction.

---

## 0.3 — Injection & Input Sanitization

### 0.3.1 Fix SQL injection in SET LOCAL role

**Files:** `data_api/src/handlers.rs` and `storage/src/handlers.rs` (appears ~15 times)

```rust
// BEFORE (appears in every handler)
let role_query = format!("SET LOCAL role = '{}'", auth_ctx.role);
sqlx::query(&role_query).execute(&mut *tx).await?;

// AFTER — validate against allowlist
const ALLOWED_ROLES: &[&str] = &["anon", "authenticated", "service_role"];
if !ALLOWED_ROLES.contains(&auth_ctx.role.as_str()) {
    return Err((StatusCode::FORBIDDEN, "Invalid role".to_string()));
}
let role_query = format!("SET LOCAL role = '{}'", auth_ctx.role);
sqlx::query(&role_query).execute(&mut *tx).await?;
```

> **Note:** PostgreSQL doesn't support `$1` parameter binding for `SET LOCAL role`. The allowlist approach is the correct fix. This will be further cleaned up in M1 when we extract the RLS middleware.

### 0.3.2 Fix SQL injection in table browser

**File:** `control_plane/src/lib.rs` — `get_table_data` function (line 278)

```rust
// BEFORE
let query = format!("SELECT * FROM \"{}\".\"{}\" LIMIT 100", schema, table);

// AFTER — validate against information_schema
let exists: Option<(String,)> = sqlx::query_as(
    "SELECT table_name FROM information_schema.tables WHERE table_schema = $1 AND table_name = $2"
)
.bind(&schema)
.bind(&table)
.fetch_optional(&state.tenant_db)
.await
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

if exists.is_none() {
    return Err((StatusCode::NOT_FOUND, "Table not found".to_string()));
}

let query = format!("SELECT * FROM \"{}\".\"{}\" LIMIT 100", schema, table);
```

### 0.3.3 Fix JavaScript injection in Deno runtime

**File:** `functions/src/deno_runtime.rs` lines 122-156

The current code does:
```rust
let module_code = format!(r#"
    const req = new Request("http://localhost", {{
        method: "POST",
        body: {payload_json},    // <-- RAW interpolation!
        headers: {headers_json}   // <-- RAW interpolation!
    }});
"#);
```

If `payload_json` contains backticks, template literal syntax, or escape sequences, it breaks out of the string context.

**Fix:** Pass data through a global variable set via the V8 API, not string interpolation:

```rust
// Before user code execution, set globalThis.__PAYLOAD__ and __HEADERS__
let setup_code = format!(
    "globalThis.__PAYLOAD__ = JSON.parse({});globalThis.__HEADERS__ = JSON.parse({});",
    serde_json::to_string(&payload_json)?,
    serde_json::to_string(&headers_json)?
);
runtime.execute_script("<setup>", setup_code)?;
```

This double-serializes: the inner JSON becomes a string literal in JS, then `JSON.parse` deserializes it safely.

### 0.3.4 Fix path traversal in TUS uploads

**File:** `storage/src/tus.rs` — `get_upload_path` function (line 30)

```rust
// BEFORE
fn get_upload_path(id: &str) -> PathBuf {
    let mut path = std::env::temp_dir();
    path.push("madbase_tus");
    path.push(id);  // id could be "../../etc/passwd"
    path
}

// AFTER
fn get_upload_path(id: &str) -> Result<PathBuf, &'static str> {
    // Validate UUID format
    Uuid::parse_str(id).map_err(|_| "Invalid upload ID")?;
    let mut path = std::env::temp_dir();
    path.push("madbase_tus");
    path.push(id);
    Ok(path)
}
```

Apply the same to `get_info_path`. Update all callers to propagate the error.

---

## 0.4 — Token & Session Security

### 0.4.1 Gate token issuance on email confirmation

**File:** `auth/src/handlers.rs` — `signup` function (lines 88-103)

```rust
// AFTER email confirmation check
// If auto-confirm is disabled (the default), return user without tokens
let auto_confirm = std::env::var("AUTH_AUTO_CONFIRM")
    .map(|v| v == "true")
    .unwrap_or(false);

if auto_confirm {
    // Set confirmed_at immediately
    sqlx::query("UPDATE users SET confirmed_at = now(), email_confirmed_at = now() WHERE id = $1")
        .bind(user.id)
        .execute(&db)
        .await
        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;

    // Issue tokens (existing logic)
    let (token, expires_in, _) = generate_token(...)?;
    let refresh_token = issue_refresh_token(...).await?;
    return Ok(Json(AuthResponse { access_token: token, ... }));
}

// Not auto-confirmed: return user without tokens
Ok(Json(serde_json::json!({
    "id": user.id,
    "email": user.email,
    "confirmation_sent_at": chrono::Utc::now(),
})))
```

### 0.4.2 Check confirmation status on login

**File:** `auth/src/handlers.rs` — `login` function, after password verification (line ~130)

```rust
// After verify_password succeeds:
if user.confirmed_at.is_none() && user.email_confirmed_at.is_none() {
    return Err((StatusCode::FORBIDDEN, "Email not confirmed".to_string()));
}
```

### 0.4.3 Validate OAuth CSRF state

**File:** `auth/src/oauth.rs` — `authorize` and `callback`

In `authorize`, store the CSRF token in Redis:

```rust
let (auth_url, csrf_token) = auth_request.url();

// Store CSRF token with 10-minute TTL
if let Some(redis) = &cache.redis {
    let key = format!("oauth_csrf:{}", csrf_token.secret());
    cache.set(&key, &"valid").await.ok();
}
```

In `callback`, validate:

```rust
let csrf_key = format!("oauth_csrf:{}", query.state);
let valid = cache.exists(&csrf_key).await.unwrap_or(false);
if !valid {
    return Err((StatusCode::BAD_REQUEST, "Invalid or expired state parameter".to_string()));
}
cache.delete(&csrf_key).await.ok();
```

### 0.4.4 Fix OAuth account takeover

**File:** `auth/src/oauth.rs` — callback, around line 234

Currently: if an existing user has the same email, the OAuth login returns that user's tokens.

**Fix:** Instead of implicit linking, create a new `identities` table. When an OAuth login matches an existing email:
- If the identity (provider + provider_id) is already linked, allow login
- If not linked, return an error: "An account with this email already exists. Log in with your password and link this provider from settings."

This is a larger change that can be partially deferred to M3 (identity linking), but the immediate fix is to **reject** the implicit match:

```rust
if existing_user.is_some() && !identity_linked {
    return Err((StatusCode::CONFLICT,
        "An account with this email already exists. Link this provider from your account settings.".to_string()));
}
```

---

## 0.5 — CORS & Transport Security

### 0.5.1 Restrict CORS origins

**Files:** `gateway/src/control.rs` line 104, `gateway/src/worker.rs` line 127

```rust
// BEFORE
CorsLayer::new()
    .allow_origin(Any)
    .allow_methods(Any)
    .allow_headers(Any)

// AFTER
use tower_http::cors::AllowOrigin;

let allowed_origins = std::env::var("ALLOWED_ORIGINS")
    .unwrap_or_else(|_| "http://localhost:3000,http://localhost:8000".to_string());

let origins: Vec<HeaderValue> = allowed_origins
    .split(',')
    .filter_map(|s| s.trim().parse().ok())
    .collect();

CorsLayer::new()
    .allow_origin(origins)
    .allow_methods([Method::GET, Method::POST, Method::PUT, Method::DELETE, Method::PATCH, Method::OPTIONS])
    .allow_headers([header::AUTHORIZATION, header::CONTENT_TYPE, HeaderName::from_static("apikey"), HeaderName::from_static("x-project-ref")])
    .allow_credentials(true)
```

### 0.5.2 Stop exposing secrets in API responses

**File:** `control_plane/src/lib.rs`

In `list_projects` (line 61), create a `ProjectSummary` struct that omits `db_url`, `jwt_secret`, `anon_key`, `service_role_key`:

```rust
#[derive(Serialize, sqlx::FromRow)]
pub struct ProjectSummary {
    pub id: Uuid,
    pub name: String,
    pub status: String,
    pub created_at: Option<chrono::DateTime<chrono::Utc>>,
}

pub async fn list_projects(...) -> Result<Json<Vec<ProjectSummary>>, ...> {
    let projects = sqlx::query_as::<_, ProjectSummary>(
        "SELECT id, name, status, created_at FROM projects"
    )...
}
```

Create a separate `GET /projects/:id/keys` endpoint that returns secrets only for the specifically requested project, requiring admin auth.

---

## Completion Requirements

This milestone is **not complete** until every item below is satisfied.

### 1. Full Test Suite — All Green

- [ ] `cargo test --workspace` passes with **zero failures**
- [ ] All **pre-existing tests** still pass (no regressions)
- [ ] **New unit tests** are written for every fix in this milestone:

| Test | Location | What it validates |
|------|----------|-------------------|
| `test_jwt_secret_required` | `common/src/config.rs` | `Config::new()` panics when `JWT_SECRET` is unset |
| `test_jwt_secret_min_length` | `common/src/config.rs` | `Config::new()` panics when `JWT_SECRET` < 32 chars |
| `test_admin_password_required` | `control_plane/src/lib.rs` | Login panics when `ADMIN_PASSWORD` is unset |
| `test_s3_credentials_required` | `storage/src/backend.rs` | `AwsS3Backend::new()` panics when `S3_ACCESS_KEY` is unset |
| `test_admin_auth_rejects_forged_cookie` | `gateway/src/admin_auth.rs` | Middleware rejects `madbase_admin_session=anything` |
| `test_admin_auth_rejects_empty_token` | `gateway/src/admin_auth.rs` | Middleware rejects empty `X-Admin-Token` |
| `test_admin_auth_requires_valid_session` | `gateway/src/admin_auth.rs` | Middleware requires session ID in Redis |
| `test_role_allowlist` | `common/src/rls.rs` or `data_api/src/handlers.rs` | `SET LOCAL role` rejects roles not in `[anon, authenticated, service_role]` |
| `test_signup_no_tokens_without_confirm` | `auth/src/handlers.rs` | Signup with `AUTH_AUTO_CONFIRM=false` returns user without tokens |
| `test_login_rejects_unconfirmed` | `auth/src/handlers.rs` | Login returns 403 for `confirmed_at = NULL` |
| `test_oauth_csrf_validated` | `auth/src/oauth.rs` | Callback rejects mismatched/missing CSRF state |
| `test_tus_path_traversal_blocked` | `storage/src/tus.rs` | `get_upload_path("../../etc/passwd")` returns an error |
| `test_config_not_serializable` | `common/src/config.rs` | `Config` does not implement `Serialize` (compile-time; remove `Serialize` derive) |
| `test_cors_rejects_unknown_origin` | `gateway/src/worker.rs` or integration | Request from unlisted origin gets no `Access-Control-Allow-Origin` |
| `test_list_projects_hides_secrets` | `control_plane/src/lib.rs` | `list_projects` response does not contain `jwt_secret` or `db_url` |

### 2. Manual / Integration Verification

- [ ] `rg 'jwt_secret|db_url|password' --type rust -n` — audit every hit; no INFO/WARN-level secret logging remains
- [ ] Starting without `JWT_SECRET` env var panics with a clear message
- [ ] Starting without `ADMIN_PASSWORD` env var panics with a clear message
- [ ] Starting without `S3_ACCESS_KEY` panics with a clear message
- [ ] `curl -H "Cookie: madbase_admin_session=anything" http://localhost:8001/platform/v1/projects` returns 401
- [ ] `curl -H "X-Admin-Token: anything" http://localhost:8001/platform/v1/projects` returns 401
- [ ] `curl http://localhost:8001/platform/v1/projects` returns 401 (no credentials at all)
- [ ] SQL injection in `SET LOCAL role` is blocked by allowlist
- [ ] OAuth flow stores and validates CSRF state
- [ ] Signup without `AUTH_AUTO_CONFIRM=true` does not return access tokens
- [ ] Login with unconfirmed email returns 403

### 3. CI Gate

- [ ] All of the above tests are included in `cargo test --workspace`
- [ ] CI pipeline runs these tests (once M7 CI is in place, retroactively verify M0 tests are green in CI)