cloudlysis/gateway/LOAD_TESTING.md
Vlad Durnea 1298d9a3df
Monorepo consolidation: workspace, shared types, transport plans, docker/swam assets
2026-03-30 11:40:42 +03:00

# Load and Failure Testing Strategy
## Goals
- Verify the Gateway stays responsive under sustained traffic.
- Verify auth flows behave correctly under concurrency.
- Verify routing reloads are atomic and safe under load.
- Verify upstream failures are bounded (timeouts) and observable (metrics/logs).
## Scenarios
### AuthN
- Sign up once, then:
  - Burst sign-in attempts to verify rate limits and correct 401/429 behavior.
  - Parallel refresh calls to verify refresh rotation correctness.
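
The sign-in burst can be sketched as a small harness that fires attempts in parallel and tallies the status codes. This is only a sketch: `run_burst`, `unexpected_statuses`, and `fake_send` are hypothetical names, and `fake_send` stands in for a real HTTP call to the Gateway's sign-in endpoint, whose path is not given here. The accepted set (200, 401, 429) is an assumption based on the 401/429 behavior named above.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Statuses this sketch accepts during a sign-in burst (assumption:
# 200 = success, 401 = bad credentials, 429 = rate limited).
EXPECTED = {200, 401, 429}


def run_burst(send: Callable[[int], int], attempts: int, workers: int = 32) -> Counter:
    """Call send(i) for each attempt in parallel and tally the returned status codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(send, range(attempts)))


def unexpected_statuses(tally: Counter) -> dict:
    """Return the part of the tally that falls outside EXPECTED."""
    return {code: n for code, n in tally.items() if code not in EXPECTED}


def fake_send(i: int) -> int:
    # Stand-in for a real HTTP call: the first 5 attempts get 401, the rest 429.
    return 401 if i < 5 else 429


if __name__ == "__main__":
    tally = run_burst(fake_send, attempts=50)
    print(dict(tally), unexpected_statuses(tally))
```

In a real run, `send` would post credentials to the Gateway and return the response status; any non-empty result from `unexpected_statuses` (for example a 500) would fail the scenario.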
### Routing Reload
- Run steady traffic to:
  - `POST /v1/query/{view_type}`
  - `POST /v1/commands/{aggregate_type}/{aggregate_id}`
- Trigger `POST /admin/routing/reload` repeatedly and verify:
  - No 500s from partial routing table reads.
  - Routing decisions switch only at revision boundaries.
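
One way to check the two reload conditions is to record, for each response, its status, the routing revision that served it, and the upstream it was routed to, then scan the sequence. The sketch below assumes the Gateway exposes the revision and chosen upstream per response (for example in a response header), which this document does not confirm; it also assumes a single sequential client against one Gateway instance, so revisions should never go backwards. `reload_violations` and the `Observation` tuple are hypothetical.

```python
from typing import Iterable, List, Tuple

Observation = Tuple[int, int, str]  # (http_status, routing_revision, upstream)


def reload_violations(observations: Iterable[Observation]) -> List[str]:
    """Check one client's responses, in the order they were received."""
    problems = []
    upstream_for = {}     # revision -> upstream first seen for that revision
    last_revision = None
    for status, revision, upstream in observations:
        if status == 500:
            problems.append(f"500 at revision {revision}")
            continue
        # Within one revision, the same route must always pick the same upstream.
        seen = upstream_for.setdefault(revision, upstream)
        if seen != upstream:
            problems.append(f"revision {revision} routed to both {seen} and {upstream}")
        # Assumption: one sequential client, one instance, so revisions only grow.
        if last_revision is not None and revision < last_revision:
            problems.append(f"revision went backwards: {last_revision} -> {revision}")
        last_revision = revision
    return problems
```

The observations should be collected per route, since different routes may legitimately map to different upstreams within the same revision.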
### Upstream Failure Modes
- Configure routing to a shard endpoint that:
  - Refuses connections (ECONNREFUSED)
  - Hangs (no response)
  - Returns 5xx
- Verify:
  - Gateway timeouts are enforced.
  - Errors are surfaced as 5xx to callers.
  - `gateway_http_requests_total` and duration histograms capture the failures.
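
The first two checks can be expressed as a small assertion over recorded results: every failure mode should yield a 5xx, and no request should outlast the Gateway timeout by more than a small margin. This is a sketch under assumptions: `Result`, `failure_violations`, and the `slack_s` margin are hypothetical, and the configured timeout value has to be supplied by the test, since it is not stated here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    mode: str         # "refused", "hang", or "5xx"
    status: int       # status the Gateway returned to the caller
    elapsed_s: float  # wall-clock duration of the request


def failure_violations(results: List[Result], timeout_s: float, slack_s: float = 0.5) -> List[str]:
    """Flag responses that are not 5xx or that exceed the timeout budget."""
    problems = []
    budget = timeout_s + slack_s
    for r in results:
        if not 500 <= r.status <= 599:
            problems.append(f"{r.mode}: expected 5xx, got {r.status}")
        if r.elapsed_s > budget:
            problems.append(f"{r.mode}: took {r.elapsed_s:.1f}s, budget {budget:.1f}s")
    return problems
```

For example, with a 5 second timeout a hanging upstream that is answered in 5.2 seconds passes, while one that returns 200 after 30 seconds is reported twice, once for the status and once for the duration.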
### HA Behavior (Swarm)
- Run `gateway` with 2 replicas and no sticky sessions.
- Verify:
  - Refresh works across replicas.
  - IAM updates become effective immediately on both replicas.
  - Rolling update keeps at least 1 replica ready.