Files
madbase/_milestones/M0_security_hardening.md
Vlad Durnea cffdf8af86
Some checks failed
CI/CD Pipeline / unit-tests (push) Failing after 1m16s
CI/CD Pipeline / integration-tests (push) Failing after 2m32s
CI/CD Pipeline / lint (push) Successful in 5m22s
CI/CD Pipeline / e2e-tests (push) Has been skipped
CI/CD Pipeline / build (push) Has been skipped
wip:milestone 0 fixes
2026-03-15 12:35:42 +02:00

541 lines
19 KiB
Markdown

# Milestone 0: Security Hardening (CRITICAL)
**Goal:** Eliminate every exploitable vulnerability before any deployment or beta.
**Depends on:** Nothing — this is the first milestone.
**Blocks:** Everything else. Do not proceed to M1+ until every task here is complete.
---
## 0.1 — Secrets & Credential Hygiene
### 0.1.1 Remove all secret logging
**Files to audit:**
| File | Line | What's logged | Severity |
|------|------|---------------|----------|
| `auth/src/middleware.rs` | 46 | `tracing::info!("Using project-specific JWT secret: '{}'", ctx.jwt_secret)` | CRITICAL |
| `auth/src/middleware.rs` | 49 | `tracing::warn!("...Using global JWT secret: '{}'", state.config.jwt_secret)` | CRITICAL |
| `gateway/src/middleware.rs` | 139 | `tracing::info!("Injecting tenant pool for project={} db_url={}", ...)` — logs full DB URL with password | CRITICAL |
| `auth/src/handlers.rs` | 81-84 | `tracing::info!("Sending confirmation email to {}: token={}", ...)` — logs confirmation token | HIGH |
| `auth/src/handlers.rs` | 297-300 | `tracing::info!("Sending recovery email to {}: token={}", ...)` — logs recovery token | HIGH |
**How to fix:** Replace each with a sanitized log that omits the secret value:
```rust
// BEFORE (auth/src/middleware.rs:46)
tracing::info!("Using project-specific JWT secret: '{}'", ctx.jwt_secret);
// AFTER
tracing::debug!("Using project-specific JWT secret for project");
```
For DB URLs, log only the host/database, never the password:
```rust
// BEFORE (gateway/src/middleware.rs:139)
tracing::info!("Injecting tenant pool for project={} db_url={}", project_ctx.project_ref, db_url);
// AFTER
tracing::info!(project = %project_ctx.project_ref, "Injecting tenant pool");
```
**Audit procedure:** Run `rg 'jwt_secret|db_url|password|token=' --type rust -n` and review every hit. Replace INFO/WARN-level secret logs with DEBUG-level sanitized versions or remove entirely.
### 0.1.2 Make JWT_SECRET required
**File:** `common/src/config.rs` line 31
```rust
// BEFORE
let jwt_secret = env::var("JWT_SECRET")
.unwrap_or_else(|_| "super-secret-key-please-change".to_string());
// AFTER
let jwt_secret = env::var("JWT_SECRET")
.expect("JWT_SECRET must be set. Generate one with: openssl rand -hex 32");
```
Also enforce minimum length:
```rust
if jwt_secret.len() < 32 {
panic!("JWT_SECRET must be at least 32 characters");
}
```
### 0.1.3 Make ADMIN_PASSWORD required and enforce strength
**File:** `control_plane/src/lib.rs` line 335
```rust
// BEFORE
let admin_password = std::env::var("ADMIN_PASSWORD").unwrap_or_else(|_| "admin".to_string());
// AFTER
let admin_password = std::env::var("ADMIN_PASSWORD")
.expect("ADMIN_PASSWORD must be set");
```
### 0.1.4 Remove hardcoded fallback S3 credentials
**File:** `storage/src/backend.rs` lines 29-34
```rust
// BEFORE
let access_key = env::var("S3_ACCESS_KEY")
.or_else(|_| env::var("MINIO_ROOT_USER"))
.unwrap_or_else(|_| "minioadmin".to_string());
// AFTER
let access_key = env::var("S3_ACCESS_KEY")
.or_else(|_| env::var("MINIO_ROOT_USER"))
.expect("S3_ACCESS_KEY or MINIO_ROOT_USER must be set");
```
Apply the same to `S3_SECRET_KEY` / `MINIO_ROOT_PASSWORD`.
### 0.1.5 Remove Serialize derive from Config
**File:** `common/src/config.rs` line 4
```rust
// BEFORE
#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct Config {
// AFTER
#[derive(Clone, Debug, Deserialize)]
pub struct Config {
```
Remove `use serde::Serialize` if no longer needed elsewhere in the module.
---
## 0.2 — Authentication & Authorization Fixes
### 0.2.1 Fix admin auth middleware
**File:** `gateway/src/admin_auth.rs` lines 27-33
**Current broken logic:**
```rust
let has_session = req.headers()
.get(axum::http::header::COOKIE)
.and_then(|h| h.to_str().ok())
.map(|s| s.contains("madbase_admin_session")) // Only checks name exists!
.unwrap_or(false)
|| req.headers().contains_key("x-admin-token"); // Any value!
```
**Replacement approach:** Use HMAC-signed session tokens. The admin login endpoint should:
1. Verify the password (hashed with Argon2)
2. Generate a random session ID
3. Store it in Redis with a TTL (e.g., 24h)
4. Set an `HttpOnly`, `SameSite=Strict`, `Secure` cookie with the session ID
5. The middleware reads the cookie, looks up the session in Redis, rejects if missing/expired
**Implementation sketch:**
```rust
pub async fn admin_auth_middleware(
State(state): State<AdminAuthState>,
req: Request,
next: Next,
) -> Result<Response, StatusCode> {
let path = req.uri().path();
if path == "/dashboard" || path == "/platform/v1/login" {
return Ok(next.run(req).await);
}
if !path.starts_with("/platform/v1") {
return Ok(next.run(req).await);
}
// Extract session token from cookie
let session_token = req.headers()
.get(axum::http::header::COOKIE)
.and_then(|h| h.to_str().ok())
.and_then(|cookies| {
cookies.split(';')
.find_map(|c| {
let c = c.trim();
c.strip_prefix("madbase_admin_session=")
})
});
// Also check X-Admin-Token header
let token = session_token
.or_else(|| req.headers()
.get("x-admin-token")
.and_then(|v| v.to_str().ok()));
let token = token.ok_or(StatusCode::UNAUTHORIZED)?;
// Validate against session store
let valid = state.session_store.validate(token).await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
if !valid {
return Err(StatusCode::UNAUTHORIZED);
}
Ok(next.run(req).await)
}
```
**New struct needed:** `AdminAuthState` with a Redis-backed session store, or a shared `CacheLayer`.
### 0.2.2 Hash admin password
**File:** `control_plane/src/lib.rs``login` function (line 330+)
Use Argon2 to hash on first startup and verify on login:
```rust
pub async fn login(
State(state): State<ControlPlaneState>,
Json(payload): Json<LoginRequest>,
) -> Result<(CookieJar, StatusCode), (StatusCode, String)> {
let admin_password_hash = std::env::var("ADMIN_PASSWORD_HASH")
.expect("ADMIN_PASSWORD_HASH must be set");
let valid = auth::utils::verify_password(&payload.password, &admin_password_hash)
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
if !valid {
return Err((StatusCode::UNAUTHORIZED, "Invalid password".to_string()));
}
// Generate session...
}
```
Provide a CLI helper or startup script to generate the hash: `cargo run --bin hash_password -- "my-password"`.
### 0.2.3 Add auth to control-plane-api
**File:** `control-plane-api/src/lib.rs`
Add an API key middleware that reads `X-Api-Key` header and validates against `CONTROL_PLANE_API_KEY` env var. Apply to all routes except `/health`.
### 0.2.4 Add auth to function deploy/invoke
**File:** `functions/src/handlers.rs`
The function routes are nested under the auth middleware in `gateway/src/worker.rs`, so JWT auth should already apply. Verify this is actually enforced — currently the `/functions/v1/:name` POST route may be accessible with just an anon key. Deploy should require `service_role` or admin, invoke should require at least `authenticated`.
### 0.2.5 Add authorization to WebSocket subscriptions
**File:** `realtime/src/ws.rs` (or wherever WS join is handled)
On channel join, validate that the user's JWT role has SELECT permission on the requested table. Query `information_schema.role_table_grants` or attempt a `SELECT 1 FROM <table> LIMIT 0` within an RLS-scoped transaction.
---
## 0.3 — Injection & Input Sanitization
### 0.3.1 Fix SQL injection in SET LOCAL role
**Files:** `data_api/src/handlers.rs` and `storage/src/handlers.rs` (appears ~15 times)
```rust
// BEFORE (appears in every handler)
let role_query = format!("SET LOCAL role = '{}'", auth_ctx.role);
sqlx::query(&role_query).execute(&mut *tx).await?;
// AFTER — validate against allowlist
const ALLOWED_ROLES: &[&str] = &["anon", "authenticated", "service_role"];
if !ALLOWED_ROLES.contains(&auth_ctx.role.as_str()) {
return Err((StatusCode::FORBIDDEN, "Invalid role".to_string()));
}
let role_query = format!("SET LOCAL role = '{}'", auth_ctx.role);
sqlx::query(&role_query).execute(&mut *tx).await?;
```
> **Note:** PostgreSQL doesn't support `$1` parameter binding for `SET LOCAL role`. The allowlist approach is the correct fix. This will be further cleaned up in M1 when we extract the RLS middleware.
### 0.3.2 Fix SQL injection in table browser
**File:** `control_plane/src/lib.rs``get_table_data` function (line 278)
```rust
// BEFORE
let query = format!("SELECT * FROM \"{}\".\"{}\" LIMIT 100", schema, table);
// AFTER — validate against information_schema
let exists: Option<(String,)> = sqlx::query_as(
"SELECT table_name FROM information_schema.tables WHERE table_schema = $1 AND table_name = $2"
)
.bind(&schema)
.bind(&table)
.fetch_optional(&state.tenant_db)
.await
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
if exists.is_none() {
return Err((StatusCode::NOT_FOUND, "Table not found".to_string()));
}
let query = format!("SELECT * FROM \"{}\".\"{}\" LIMIT 100", schema, table);
```
### 0.3.3 Fix JavaScript injection in Deno runtime
**File:** `functions/src/deno_runtime.rs` lines 122-156
The current code does:
```rust
let module_code = format!(r#"
const req = new Request("http://localhost", {{
method: "POST",
body: {payload_json}, // <-- RAW interpolation!
headers: {headers_json} // <-- RAW interpolation!
}});
"#);
```
If `payload_json` contains backticks, template literal syntax, or escape sequences, it breaks out of the string context.
**Fix:** Pass data through a global variable set via the V8 API, not string interpolation:
```rust
// Before user code execution, set globalThis.__PAYLOAD__ and __HEADERS__
let setup_code = format!(
"globalThis.__PAYLOAD__ = JSON.parse({});globalThis.__HEADERS__ = JSON.parse({});",
serde_json::to_string(&payload_json)?,
serde_json::to_string(&headers_json)?
);
runtime.execute_script("<setup>", setup_code)?;
```
This double-serializes: the inner JSON becomes a string literal in JS, then `JSON.parse` deserializes it safely.
### 0.3.4 Fix path traversal in TUS uploads
**File:** `storage/src/tus.rs``get_upload_path` function (line 30)
```rust
// BEFORE
fn get_upload_path(id: &str) -> PathBuf {
let mut path = std::env::temp_dir();
path.push("madbase_tus");
path.push(id); // id could be "../../etc/passwd"
path
}
// AFTER
fn get_upload_path(id: &str) -> Result<PathBuf, &'static str> {
// Validate UUID format
Uuid::parse_str(id).map_err(|_| "Invalid upload ID")?;
let mut path = std::env::temp_dir();
path.push("madbase_tus");
path.push(id);
Ok(path)
}
```
Apply the same to `get_info_path`. Update all callers to propagate the error.
---
## 0.4 — Token & Session Security
### 0.4.1 Gate token issuance on email confirmation
**File:** `auth/src/handlers.rs``signup` function (lines 88-103)
```rust
// AFTER email confirmation check
// If auto-confirm is disabled (the default), return user without tokens
let auto_confirm = std::env::var("AUTH_AUTO_CONFIRM")
.map(|v| v == "true")
.unwrap_or(false);
if auto_confirm {
// Set confirmed_at immediately
sqlx::query("UPDATE users SET confirmed_at = now(), email_confirmed_at = now() WHERE id = $1")
.bind(user.id)
.execute(&db)
.await
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?;
// Issue tokens (existing logic)
let (token, expires_in, _) = generate_token(...)?;
let refresh_token = issue_refresh_token(...).await?;
return Ok(Json(AuthResponse { access_token: token, ... }));
}
// Not auto-confirmed: return user without tokens
Ok(Json(serde_json::json!({
"id": user.id,
"email": user.email,
"confirmation_sent_at": chrono::Utc::now(),
})))
```
### 0.4.2 Check confirmation status on login
**File:** `auth/src/handlers.rs``login` function, after password verification (line ~130)
```rust
// After verify_password succeeds:
if user.confirmed_at.is_none() && user.email_confirmed_at.is_none() {
return Err((StatusCode::FORBIDDEN, "Email not confirmed".to_string()));
}
```
### 0.4.3 Validate OAuth CSRF state
**File:** `auth/src/oauth.rs``authorize` and `callback`
In `authorize`, store the CSRF token in Redis:
```rust
let (auth_url, csrf_token) = auth_request.url();
// Store CSRF token with 10-minute TTL
if let Some(redis) = &cache.redis {
let key = format!("oauth_csrf:{}", csrf_token.secret());
cache.set(&key, &"valid").await.ok();
}
```
In `callback`, validate:
```rust
let csrf_key = format!("oauth_csrf:{}", query.state);
let valid = cache.exists(&csrf_key).await.unwrap_or(false);
if !valid {
return Err((StatusCode::BAD_REQUEST, "Invalid or expired state parameter".to_string()));
}
cache.delete(&csrf_key).await.ok();
```
### 0.4.4 Fix OAuth account takeover
**File:** `auth/src/oauth.rs` — callback, around line 234
Currently: if an existing user has the same email, the OAuth login returns that user's tokens.
**Fix:** Instead of implicit linking, create a new `identities` table. When an OAuth login matches an existing email:
- If the identity (provider + provider_id) is already linked, allow login
- If not linked, return an error: "An account with this email already exists. Log in with your password and link this provider from settings."
This is a larger change that can be partially deferred to M3 (identity linking), but the immediate fix is to **reject** the implicit match:
```rust
if existing_user.is_some() && !identity_linked {
return Err((StatusCode::CONFLICT,
"An account with this email already exists. Link this provider from your account settings.".to_string()));
}
```
---
## 0.5 — CORS & Transport Security
### 0.5.1 Restrict CORS origins
**Files:** `gateway/src/control.rs` line 104, `gateway/src/worker.rs` line 127
```rust
// BEFORE
CorsLayer::new()
.allow_origin(Any)
.allow_methods(Any)
.allow_headers(Any)
// AFTER
use tower_http::cors::AllowOrigin;
let allowed_origins = std::env::var("ALLOWED_ORIGINS")
.unwrap_or_else(|_| "http://localhost:3000,http://localhost:8000".to_string());
let origins: Vec<HeaderValue> = allowed_origins
.split(',')
.filter_map(|s| s.trim().parse().ok())
.collect();
CorsLayer::new()
.allow_origin(origins)
.allow_methods([Method::GET, Method::POST, Method::PUT, Method::DELETE, Method::PATCH, Method::OPTIONS])
.allow_headers([header::AUTHORIZATION, header::CONTENT_TYPE, HeaderName::from_static("apikey"), HeaderName::from_static("x-project-ref")])
.allow_credentials(true)
```
### 0.5.2 Stop exposing secrets in API responses
**File:** `control_plane/src/lib.rs`
In `list_projects` (line 61), create a `ProjectSummary` struct that omits `db_url`, `jwt_secret`, `anon_key`, `service_role_key`:
```rust
#[derive(Serialize, sqlx::FromRow)]
pub struct ProjectSummary {
pub id: Uuid,
pub name: String,
pub status: String,
pub created_at: Option<chrono::DateTime<chrono::Utc>>,
}
pub async fn list_projects(...) -> Result<Json<Vec<ProjectSummary>>, ...> {
let projects = sqlx::query_as::<_, ProjectSummary>(
"SELECT id, name, status, created_at FROM projects"
)...
}
```
Create a separate `GET /projects/:id/keys` endpoint that returns secrets only for the specifically requested project, requiring admin auth.
---
## Completion Requirements
This milestone is **not complete** until every item below is satisfied.
### 1. Full Test Suite — All Green
- [ ] `cargo test --workspace` passes with **zero failures**
- [ ] All **pre-existing tests** still pass (no regressions)
- [ ] **New unit tests** are written for every fix in this milestone:
| Test | Location | What it validates |
|------|----------|-------------------|
| `test_jwt_secret_required` | `common/src/config.rs` | `Config::new()` panics when `JWT_SECRET` is unset |
| `test_jwt_secret_min_length` | `common/src/config.rs` | `Config::new()` panics when `JWT_SECRET` < 32 chars |
| `test_admin_password_required` | `control_plane/src/lib.rs` | Login panics when `ADMIN_PASSWORD` is unset |
| `test_s3_credentials_required` | `storage/src/backend.rs` | `AwsS3Backend::new()` panics when `S3_ACCESS_KEY` is unset |
| `test_admin_auth_rejects_forged_cookie` | `gateway/src/admin_auth.rs` | Middleware rejects `madbase_admin_session=anything` |
| `test_admin_auth_rejects_empty_token` | `gateway/src/admin_auth.rs` | Middleware rejects empty `X-Admin-Token` |
| `test_admin_auth_requires_valid_session` | `gateway/src/admin_auth.rs` | Middleware requires session ID in Redis |
| `test_role_allowlist` | `common/src/rls.rs` or `data_api/src/handlers.rs` | `SET LOCAL role` rejects roles not in `[anon, authenticated, service_role]` |
| `test_signup_no_tokens_without_confirm` | `auth/src/handlers.rs` | Signup with `AUTH_AUTO_CONFIRM=false` returns user without tokens |
| `test_login_rejects_unconfirmed` | `auth/src/handlers.rs` | Login returns 403 for `confirmed_at = NULL` |
| `test_oauth_csrf_validated` | `auth/src/oauth.rs` | Callback rejects mismatched/missing CSRF state |
| `test_tus_path_traversal_blocked` | `storage/src/tus.rs` | `get_upload_path("../../etc/passwd")` returns an error |
| `test_config_not_serializable` | `common/src/config.rs` | `Config` does not implement `Serialize` (compile-time; remove `Serialize` derive) |
| `test_cors_rejects_unknown_origin` | `gateway/src/worker.rs` or integration | Request from unlisted origin gets no `Access-Control-Allow-Origin` |
| `test_list_projects_hides_secrets` | `control_plane/src/lib.rs` | `list_projects` response does not contain `jwt_secret` or `db_url` |
### 2. Manual / Integration Verification
- [ ] `rg 'jwt_secret|db_url|password' --type rust -n` — audit every hit; no INFO/WARN-level secret logging remains
- [ ] Starting without `JWT_SECRET` env var panics with a clear message
- [ ] Starting without `ADMIN_PASSWORD` env var panics with a clear message
- [ ] Starting without `S3_ACCESS_KEY` panics with a clear message
- [ ] `curl -H "Cookie: madbase_admin_session=anything" http://localhost:8001/platform/v1/projects` returns 401
- [ ] `curl -H "X-Admin-Token: anything" http://localhost:8001/platform/v1/projects` returns 401
- [ ] `curl http://localhost:8001/platform/v1/projects` returns 401 (no credentials at all)
- [ ] SQL injection in `SET LOCAL role` is blocked by allowlist
- [ ] OAuth flow stores and validates CSRF state
- [ ] Signup without `AUTH_AUTO_CONFIRM=true` does not return access tokens
- [ ] Login with unconfirmed email returns 403
### 3. CI Gate
- [ ] All of the above tests are included in `cargo test --workspace`
- [ ] CI pipeline runs these tests (once M7 CI is in place, retroactively verify M0 tests are green in CI)