cloudlysis/gateway/LOAD_TESTING.md
Vlad Durnea 1298d9a3df
Monorepo consolidation: workspace, shared types, transport plans, docker/swarm assets
2026-03-30 11:40:42 +03:00

Load and Failure Testing Strategy

Goals

  • Verify the Gateway stays responsive under sustained traffic.
  • Verify auth flows behave correctly under concurrency.
  • Verify routing reloads are atomic and safe under load.
  • Verify upstream failures are bounded (timeouts) and observable (metrics/logs).

Scenarios

AuthN

  • Sign up once, then:
    • Send a burst of sign-in attempts to verify that rate limiting engages and that responses are 401 or 429 as appropriate.
    • Parallel refresh calls to verify refresh rotation correctness.
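
The two AuthN checks above could be driven by a small harness like the following sketch. It is illustrative only: `send` is a hypothetical caller-supplied function that performs one request and returns its HTTP status code, the set of expected statuses (200, 401, 429) is an assumption, and `single_use=True` assumes a strict rotation policy in which only one parallel refresh with the same token may succeed. Adjust both to the Gateway's actual policy.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Assumption: a sign-in burst should only ever produce these status codes.
EXPECTED_SIGN_IN_STATUSES = {200, 401, 429}


def run_burst(send, n, workers=16):
    """Call send(i) n times concurrently; return status codes in request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, range(n)))


def summarize_burst(statuses):
    """Count status codes from a burst and list any that were not expected."""
    counts = Counter(statuses)
    unexpected = sorted(c for c in counts if c not in EXPECTED_SIGN_IN_STATUSES)
    return {
        "counts": dict(counts),
        "rate_limited": counts.get(429, 0) > 0,
        "unexpected": unexpected,
    }


def check_parallel_refresh(statuses, single_use=True):
    """Return a list of problems found in the results of parallel refresh calls."""
    problems = []
    successes = sum(1 for s in statuses if s == 200)
    server_errors = sorted({s for s in statuses if s >= 500})
    if server_errors:
        problems.append(f"server errors during refresh: {server_errors}")
    if successes == 0:
        problems.append("no refresh call succeeded")
    # Assumption: under strict rotation, one token can be redeemed only once.
    if single_use and successes > 1:
        problems.append(f"{successes} calls succeeded with a single-use token")
    return problems
```

In a real run, `send` would wrap an HTTP client call against the sign-in or refresh endpoint. Keeping the checks as pure functions makes them easy to verify without a live Gateway.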

Routing Reload

  • Run steady traffic to:
    • POST /v1/query/{view_type}
    • POST /v1/commands/{aggregate_type}/{aggregate_id}
  • Trigger POST /admin/routing/reload repeatedly and verify:
    • No 500s from partial routing table reads.
    • Routing decisions switch only at revision boundaries.
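
One way to evaluate the reload scenario is to record, for each response, the routing revision and target that served it, then check the samples offline. The sketch below assumes the Gateway exposes that information in some form (for example a debug header). The sample tuple layout, `route_key`, and `target` are hypothetical names, not part of the documented API. Because requests are in flight concurrently, revisions may interleave in the sample order, so the check only requires that a given revision always maps a route key to the same target.

```python
def check_reload_samples(samples):
    """Check samples of (status, revision, route_key, target) taken during reloads.

    Returns a list of human-readable problems; an empty list means the run
    looked consistent.
    """
    problems = []
    seen = {}  # (revision, route_key) -> first target observed for that pair
    for index, (status, revision, route_key, target) in enumerate(samples):
        if status == 500:
            problems.append(f"sample {index}: got 500 during reload")
            continue
        previous = seen.setdefault((revision, route_key), target)
        if previous != target:
            # Same revision, same key, different target: the switch did not
            # happen at a revision boundary.
            problems.append(
                f"sample {index}: revision {revision} routed {route_key!r} "
                f"to both {previous!r} and {target!r}"
            )
    return problems
```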

Upstream Failure Modes

  • Configure routing to a shard endpoint that:
    • Refuses connections (ECONNREFUSED)
    • Hangs (no response)
    • Returns 5xx
  • Verify:
    • Gateway timeouts are enforced.
    • Errors are surfaced as 5xx to callers.
    • gateway_http_requests_total and duration histograms capture the failures.
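
The verification steps above can be sketched as two offline checks. The first takes `(status, elapsed_seconds)` samples recorded by the load generator and confirms that every failure is a 5xx returned within the configured timeout plus some slack. The second sums a counter from a metrics scrape so the value can be compared before and after the run. This assumes the metrics are exposed in the Prometheus text format, which the source does not state. The `timeout_s` and `slack_s` parameters are placeholders for the Gateway's real configuration.

```python
def check_bounded_failures(samples, timeout_s, slack_s=0.5):
    """Check (status, elapsed_seconds) samples taken against a failing upstream."""
    problems = []
    for index, (status, elapsed_s) in enumerate(samples):
        if not 500 <= status <= 599:
            problems.append(f"sample {index}: expected 5xx, got {status}")
        if elapsed_s > timeout_s + slack_s:
            problems.append(
                f"sample {index}: took {elapsed_s:.2f}s, budget {timeout_s + slack_s:.2f}s"
            )
    return problems


def sum_counter(metrics_text, name):
    """Sum all samples of one counter in Prometheus text-format output.

    Assumption: labels are ignored, so this returns the total across all
    label combinations of the metric.
    """
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if line.startswith(name + "{"):
            rest = line.rpartition("}")[2]  # text after the label set
        elif line.startswith(name + " "):
            rest = line[len(name):]
        else:
            continue  # a different metric, including ones sharing the prefix
        total += float(rest.split()[0])
    return total
```

Scraping the metrics endpoint before and after a failure run and comparing `sum_counter(..., "gateway_http_requests_total")` gives a quick signal that the failed requests were counted.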

HA Behavior (Swarm)

  • Run gateway with 2 replicas and no sticky sessions.
  • Verify:
    • Refresh works across replicas.
    • IAM updates become effective immediately on both replicas.
    • Rolling update keeps at least 1 replica ready.