cloudlysis/S3_PLAN.md
Vlad Durnea 5992044b7e
docs: restructure S3 plan into dep-ordered milestones; make S3 mandatory for document storage
2026-03-30 14:48:24 +03:00

S3-Compatible Object Storage Plan (Hetzner in Prod, MinIO Locally)

Principles

  • S3-compatible object storage is mandatory for platform document storage in every environment:
    • Local development uses MinIO.
    • Production uses Hetzner Object Storage (S3 API compatible).
  • Each milestone is stop-the-line gated; work on the next milestone starts only when:
    • All tasks completed
    • All milestone tests pass
    • Workspace verification commands pass
  • Secrets are never committed and never logged:
    • Access keys via Swarm secrets in production
    • .env or compose env in local dev

Goals

  • Introduce a single, shared S3-compatible configuration surface for the platform.
  • Make document storage always backed by S3 (no filesystem fallback for documents).
  • Keep the implementation incremental and test-gated per milestone.
  • Optionally expand to observability object storage after document storage is stable.

Definitions

Document Storage

“Documents” are versioned blobs the platform needs to store and retrieve reliably:

  • Deployment bundles and artifacts
  • Definitions/manifests (projection programs, saga/effects definitions, schema bundles)
  • Exported audit/log bundles, diagnostics, or snapshots that are not part of the primary KV/MDBX state

Document storage must support:

  • Tenant-scoped namespaces (prefixes)
  • Content-addressed or versioned keys (immutability preferred)
  • Listing by prefix for admin workflows
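
The three requirements above can be illustrated with a small key builder. This is a minimal sketch, not the project's chosen key scheme: the `DocType` variants, the segment names, and the layout `<prefix><tenant>/<doc-type>/<sha256>` are all assumptions made for illustration, and the SHA-256 digest is expected to be computed by the caller.

```rust
/// Hypothetical document categories; the plan does not fix this list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DocType {
    Bundle,
    Manifest,
    Export,
}

impl DocType {
    fn as_str(self) -> &'static str {
        match self {
            DocType::Bundle => "bundles",
            DocType::Manifest => "manifests",
            DocType::Export => "exports",
        }
    }
}

/// Accepts only short lowercase identifiers, so a tenant id can never
/// contain '/' or '.' and reach outside its own prefix.
fn is_safe_segment(s: &str) -> bool {
    !s.is_empty()
        && s.len() <= 64
        && s.bytes()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'-')
}

/// Builds the listing prefix for one tenant and document type.
pub fn tenant_prefix(root: &str, tenant: &str, doc_type: DocType) -> Result<String, String> {
    if !is_safe_segment(tenant) {
        return Err(format!("invalid tenant id: {:?}", tenant));
    }
    let root = root.trim_matches('/');
    let base = if root.is_empty() {
        String::new()
    } else {
        format!("{}/", root)
    };
    Ok(format!("{}{}/{}/", base, tenant, doc_type.as_str()))
}

/// Builds an immutable, content-addressed key from a SHA-256 hex digest
/// that the caller has already computed over the document bytes.
pub fn document_key(
    root: &str,
    tenant: &str,
    doc_type: DocType,
    sha256_hex: &str,
) -> Result<String, String> {
    let is_hex = sha256_hex.len() == 64
        && sha256_hex
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !is_hex {
        return Err("content hash must be 64 lowercase hex characters".to_string());
    }
    let prefix = tenant_prefix(root, tenant, doc_type)?;
    Ok(format!("{}{}", prefix, sha256_hex))
}
```

With this layout, an admin listing for one tenant and document type is a plain prefix listing on the value returned by `tenant_prefix`, and identical content always maps to the same key.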

Configuration Contract (Platform-Wide)

Common Settings

  • S3_ENDPOINT (Hetzner: HTTPS endpoint; MinIO: http://minio:9000)
  • S3_REGION (required; some S3-compatible providers also expect a region value)
  • S3_ACCESS_KEY_ID (secret)
  • S3_SECRET_ACCESS_KEY (secret)
  • S3_FORCE_PATH_STYLE (true/false)
  • S3_INSECURE (true/false, only allowed for local MinIO)

Buckets and Prefixes

  • S3_BUCKET_DOCS (required everywhere)
  • S3_PREFIX_DOCS (default docs/)

Optional (later milestones):

  • S3_BUCKET_LOKI, S3_PREFIX_LOKI
  • S3_BUCKET_TEMPO, S3_PREFIX_TEMPO
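
As a rough illustration of how this contract could be validated at startup, the sketch below parses the settings from a map of variables. The struct, the function names, and the error strings are hypothetical. It also assumes that `S3_INSECURE=true` means "plain HTTP is acceptable"; the plan does not define the flag more precisely than "local MinIO only".

```rust
use std::collections::HashMap;

/// Parsed settings. Debug is deliberately not derived so the secret
/// fields cannot end up in logs through `{:?}` formatting.
pub struct S3Config {
    pub endpoint: String,
    pub region: String,
    pub access_key_id: String,
    pub secret_access_key: String,
    pub force_path_style: bool,
    pub insecure: bool,
    pub bucket_docs: String,
    pub prefix_docs: String,
}

/// Returns a trimmed, non-empty value or a "required" error.
fn required(vars: &HashMap<String, String>, name: &str) -> Result<String, String> {
    match vars.get(name).map(|v| v.trim()) {
        Some(v) if !v.is_empty() => Ok(v.to_string()),
        _ => Err(format!("{} is required", name)),
    }
}

/// Parses a true/false flag; a missing or empty value means false.
fn flag(vars: &HashMap<String, String>, name: &str) -> Result<bool, String> {
    match vars.get(name).map(|v| v.trim()) {
        None | Some("") | Some("false") => Ok(false),
        Some("true") => Ok(true),
        Some(other) => Err(format!("{} must be true or false, got {:?}", name, other)),
    }
}

pub fn parse_s3_config(vars: &HashMap<String, String>) -> Result<S3Config, String> {
    let endpoint = required(vars, "S3_ENDPOINT")?;
    let insecure = flag(vars, "S3_INSECURE")?;
    let is_http = endpoint.starts_with("http://");
    if !is_http && !endpoint.starts_with("https://") {
        return Err("S3_ENDPOINT must start with http:// or https://".to_string());
    }
    // Assumption: plain HTTP is accepted only with an explicit opt-in.
    if is_http && !insecure {
        return Err("plain-HTTP S3_ENDPOINT requires S3_INSECURE=true".to_string());
    }
    // Default prefix is docs/; a trailing slash is always enforced.
    let prefix_docs = match vars.get("S3_PREFIX_DOCS").map(|v| v.trim()) {
        Some(p) if !p.is_empty() => format!("{}/", p.trim_end_matches('/')),
        _ => "docs/".to_string(),
    };
    Ok(S3Config {
        endpoint,
        region: required(vars, "S3_REGION")?,
        access_key_id: required(vars, "S3_ACCESS_KEY_ID")?,
        secret_access_key: required(vars, "S3_SECRET_ACCESS_KEY")?,
        force_path_style: flag(vars, "S3_FORCE_PATH_STYLE")?,
        insecure,
        bucket_docs: required(vars, "S3_BUCKET_DOCS")?,
        prefix_docs,
    })
}
```

A service could feed it the process environment with `parse_s3_config(&std::env::vars().collect())`. Taking a map rather than reading the environment directly keeps the validation easy to unit-test.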

Target Architecture

Local Development

  • MinIO is part of the local stack for parity.
  • Control API is the document gateway:
    • Upload/download via signed URLs or streamed proxy endpoints
    • Metadata stored in existing storage/KV (document index) or derived from key scheme

Production

  • Hetzner Object Storage provides S3-compatible bucket(s).
  • Credentials and bucket details injected via Swarm secrets and stack env.

Development Plan (Milestones by Dependency)

Milestone 0: S3 Contract + Local MinIO Baseline

Dependencies

  • None

Goal

Provide a consistent local S3-compatible endpoint and stable bucket naming to unblock higher milestones.

Tasks

  • Add MinIO to local development stack:
    • Add minio service to compose (API + console)
    • Add minio-init job to create required buckets
  • Define standard bucket/prefix defaults for local dev:
    • S3_BUCKET_DOCS=cloudlysis-docs
    • S3_PREFIX_DOCS=docs/
  • Document local workflow to enable MinIO-backed document storage.

Required Tests (Gate)

  • Workspace verification commands
  • Local manual verification checklist:
    • cloudlysis-docs bucket exists
    • credentials work from a container in the compose network

Milestone 1: Document Storage API (Control API)

Dependencies

  • Milestone 0

Goal

Make document storage a first-class platform API and require it in all environments.

Tasks

  • Add an S3 client module to Control API:
    • parse config from env with strict validation (endpoint, bucket, keys)
    • support path-style and TLS/insecure options
  • Implement document primitives:
    • Put (upload) and Get (download)
    • List by prefix (tenant + doc-type)
    • Delete (admin-only) if needed
  • Decide and document a key scheme:
    • tenant-scoped prefix
    • immutable keys preferred (content hash + metadata)
  • Add authz rules for document operations (deny-by-default, tenant-scoped).
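
To make the primitives and the deny-by-default rule concrete, here is a minimal in-memory sketch. It stands in for the real S3 client so the behaviour can be read without a running bucket. `Caller`, `StoreError`, and `MemoryStore` are hypothetical names, and rejecting overwrites is one possible way to express "immutable keys preferred", not a decision taken by this plan.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    Forbidden,
    AlreadyExists,
    NotFound,
}

/// Hypothetical caller identity, resolved by the API layer.
pub struct Caller {
    pub tenant: String,
    pub is_admin: bool,
}

/// Deny by default: a caller may only touch keys under "<root><tenant>/".
/// The trailing '/' matters: tenant "acme" must not match "acme-2/...".
fn authorize(caller: &Caller, root: &str, key: &str) -> Result<(), StoreError> {
    let allowed = format!("{}{}/", root, caller.tenant);
    if !caller.tenant.is_empty() && key.starts_with(&allowed) {
        Ok(())
    } else {
        Err(StoreError::Forbidden)
    }
}

/// In-memory stand-in for the bucket; keys are kept sorted.
pub struct MemoryStore {
    root: String,
    objects: BTreeMap<String, Vec<u8>>,
}

impl MemoryStore {
    pub fn new(root: &str) -> Self {
        MemoryStore {
            root: root.to_string(),
            objects: BTreeMap::new(),
        }
    }

    /// Stores a document; existing keys are never overwritten.
    pub fn put(&mut self, caller: &Caller, key: &str, body: &[u8]) -> Result<(), StoreError> {
        authorize(caller, &self.root, key)?;
        if self.objects.contains_key(key) {
            return Err(StoreError::AlreadyExists);
        }
        self.objects.insert(key.to_string(), body.to_vec());
        Ok(())
    }

    pub fn get(&self, caller: &Caller, key: &str) -> Result<&[u8], StoreError> {
        authorize(caller, &self.root, key)?;
        self.objects
            .get(key)
            .map(|v| v.as_slice())
            .ok_or(StoreError::NotFound)
    }

    /// Lists keys under a prefix that must itself lie inside the tenant's space.
    pub fn list(&self, caller: &Caller, prefix: &str) -> Result<Vec<String>, StoreError> {
        authorize(caller, &self.root, prefix)?;
        Ok(self
            .objects
            .keys()
            .filter(|k| k.starts_with(prefix))
            .cloned()
            .collect())
    }

    /// Admin-only, and still scoped to the admin's own tenant.
    pub fn delete(&mut self, caller: &Caller, key: &str) -> Result<(), StoreError> {
        if !caller.is_admin {
            return Err(StoreError::Forbidden);
        }
        authorize(caller, &self.root, key)?;
        self.objects.remove(key).map(|_| ()).ok_or(StoreError::NotFound)
    }
}
```

The same checks would sit in front of the real S3 calls; only the body of each method changes. Running `authorize` on the list prefix as well as on object keys is what keeps a tenant from listing `docs/` as a whole.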

Required Tests (Gate)

  • Workspace verification commands
  • Unit tests:
    • config parsing/validation
    • key generation stability
  • Gated integration tests (MinIO):
    • put/get roundtrip
    • list by prefix
    • tenant isolation (cannot read other tenant prefix)
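
One way to gate the MinIO integration tests is to make each test return early unless an opt-in variable is set. The helper below is a sketch under that assumption; the variable name `S3_INTEGRATION_TESTS` is hypothetical and not part of the configuration contract.

```rust
/// Returns the S3 endpoint only when integration tests are explicitly
/// enabled. `lookup` abstracts the environment so the gate itself can
/// be unit-tested without touching process state.
pub fn integration_endpoint<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("S3_INTEGRATION_TESTS").as_deref() {
        Some("1") | Some("true") => lookup("S3_ENDPOINT").filter(|v| !v.is_empty()),
        _ => None,
    }
}
```

An integration test would call `integration_endpoint(|name| std::env::var(name).ok())` first and return immediately on `None`, so a plain `cargo test --workspace` stays green on machines without MinIO.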

Milestone 2: Control UI Integration (Upload/Download Flows)

Dependencies

  • Milestone 1

Goal

Make document workflows usable from the Control UI without leaking credentials.

Tasks

  • Add Control API endpoints for signed URLs (recommended) or streamed proxy:
    • create upload URL (PUT)
    • create download URL (GET)
  • Implement Control UI flows for a first document type:
    • upload
    • list
    • download
  • Ensure correlation/trace propagation on Control API operations.
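
The signed-URL endpoints could validate their input before handing it to an S3 SDK for signing. The sketch below covers only that validation step; the operation names, the 5-minute default, and the 15-minute cap are illustrative assumptions, and the signing itself is intentionally left out.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Put,
    Get,
}

/// Validated input for a signer; contains no credentials.
#[derive(Debug, PartialEq, Eq)]
pub struct PresignRequest {
    pub method: Method,
    pub key: String,
    pub ttl: Duration,
}

// Hypothetical limits: short-lived URLs reduce the impact of a leaked link.
const DEFAULT_TTL: Duration = Duration::from_secs(300);
const MAX_TTL: Duration = Duration::from_secs(900);

pub fn presign_request(
    operation: &str,
    key: &str,
    ttl_secs: Option<u64>,
) -> Result<PresignRequest, String> {
    // Only the two flows named in the plan are accepted.
    let method = match operation {
        "upload" => Method::Put,
        "download" => Method::Get,
        other => return Err(format!("unsupported operation: {:?}", other)),
    };
    if key.is_empty() || key.starts_with('/') {
        return Err("key must be a non-empty relative object key".to_string());
    }
    if ttl_secs == Some(0) {
        return Err("ttl must be greater than zero".to_string());
    }
    // Fall back to the default and clamp anything above the cap.
    let ttl = ttl_secs
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_TTL)
        .min(MAX_TTL);
    Ok(PresignRequest {
        method,
        key: key.to_string(),
        ttl,
    })
}
```

Because the Control API builds and signs the URL server-side, the Control UI only ever sees the resulting URL and never the access keys.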

Required Tests (Gate)

  • Workspace verification commands
  • Control UI unit tests for routing/component render stability
  • Gated end-to-end checklist (local):
    • upload appears in list
    • download returns expected bytes

Milestone 3: Production Rollout (Hetzner)

Dependencies

  • Milestone 2

Goal

Deploy document storage on Hetzner S3-compatible backend with production-grade secret handling.

Tasks

  • Provision buckets and lifecycle policies (docs bucket):
    • retention rules appropriate to documents
    • access policy scoped to required actions
  • Swarm deployment:
    • add secrets for access keys
    • configure Control API with endpoint/region/bucket/prefix
  • Rollback plan:
    • switch to a fallback bucket, or to a MinIO instance running in production, if needed

Required Tests (Gate)

  • Workspace verification commands
  • Production smoke runbook:
    • upload/list/download for a tenant
    • verify objects exist under expected prefixes

Milestone 4 (Optional): Observability Storage on S3 (Loki + Tempo)

Dependencies

  • Milestone 3

Goal

Store logs and traces in S3-compatible storage (MinIO locally; Hetzner in production).

Tasks

  • Loki:
    • add S3 config variant and compose overlay
    • validate log query and bucket objects
  • Tempo:
    • add S3 config variant and compose overlay
    • validate traces and bucket objects

Required Tests (Gate)

  • Workspace verification commands
  • Gated local validation:
    • Loki writes objects to bucket/prefix after ingest
    • Tempo writes objects to bucket/prefix after ingest

Workspace Verification Commands

  • cargo fmt --check
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test --workspace
  • cd control/ui && npm ci && npm run lint && npm run typecheck && npm run test && npm run build