cloudlysis/S3_PLAN.md
Vlad Durnea 5992044b7e
docs: restructure S3 plan into dep-ordered milestones; make S3 mandatory for document storage
2026-03-30 14:48:24 +03:00


# S3-Compatible Object Storage Plan (Hetzner in Prod, MinIO Locally)
## Principles
- S3-compatible object storage is mandatory for platform document storage in every environment:
  - Local development uses MinIO.
  - Production uses Hetzner Object Storage (S3 API compatible).
- Each milestone is stop-the-line gated:
  - All tasks completed
  - All milestone tests pass
  - Workspace verification commands pass
- Secrets are never committed and never logged:
  - Access keys via Swarm secrets in production
  - `.env` or compose env in local dev
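
One way to back the "never logged" rule in code is to wrap secret values in a type whose `Debug` and `Display` output is redacted. The sketch below is illustrative only: the `Secret` type and its `expose` method are hypothetical names, not part of the existing codebase.

```rust
use std::fmt;

/// Hypothetical wrapper for secret values such as `S3_SECRET_ACCESS_KEY`.
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    /// Explicit accessor, so every read of the raw value is easy to find in review.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

// Debug and Display never print the underlying value, so an accidental
// `{:?}` or `{}` in a log statement cannot leak it.
impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

impl fmt::Display for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("<redacted>")
    }
}
```

With this shape, a config struct that holds `Secret` fields can derive `Debug` without exposing credentials.
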
## Goals
- Introduce a single, shared S3-compatible configuration surface for the platform.
- Make document storage always backed by S3 (no filesystem fallback for documents).
- Keep the implementation incremental and test-gated per milestone.
- Optionally expand to observability object storage after document storage is stable.
## Definitions
### Document Storage
“Documents” are versioned blobs the platform needs to store and retrieve reliably:
- Deployment bundles and artifacts
- Definitions/manifests (projection programs, saga/effects definitions, schema bundles)
- Exported audit/log bundles, diagnostics, or snapshots that are not part of the primary KV/MDBX state

Document storage must support:
- Tenant-scoped namespaces (prefixes)
- Content-addressed or versioned keys (immutability preferred)
- Listing by prefix for admin workflows
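
As an illustration of these requirements, the sketch below builds tenant-scoped, content-addressed keys. The layout `<prefix>/<tenant>/<doc-type>/<sha256-hex>` and the function names are assumptions made for the example; the actual key scheme is decided in Milestone 1. The content hash is passed in as a hex string because computing SHA-256 would need a crate outside the standard library.

```rust
/// Returns true for identifiers that are safe to embed as a single key segment.
fn is_safe_segment(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Prefix under which all documents of one tenant and doc type live.
/// Listing by this prefix supports the admin workflows mentioned above.
pub fn tenant_prefix(base: &str, tenant: &str, doc_type: &str) -> Option<String> {
    if !is_safe_segment(tenant) || !is_safe_segment(doc_type) {
        return None;
    }
    // Normalize the base prefix so the result has no leading or doubled slashes.
    let base = base.trim_matches('/');
    if base.is_empty() {
        Some(format!("{tenant}/{doc_type}/"))
    } else {
        Some(format!("{base}/{tenant}/{doc_type}/"))
    }
}

/// Immutable, content-addressed key: the same bytes always map to the same key.
/// `sha256_hex` is expected to be 64 lowercase hex characters.
pub fn document_key(
    base: &str,
    tenant: &str,
    doc_type: &str,
    sha256_hex: &str,
) -> Option<String> {
    let is_hash = sha256_hex.len() == 64
        && sha256_hex
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !is_hash {
        return None;
    }
    Some(format!("{}{}", tenant_prefix(base, tenant, doc_type)?, sha256_hex))
}
```

Rejecting tenant or doc-type values that contain `/` or `.` keeps one tenant's identifier from producing a key under another tenant's prefix.
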
## Configuration Contract (Platform-Wide)
### Common Settings
- `S3_ENDPOINT` (Hetzner: HTTPS endpoint; MinIO: `http://minio:9000`)
- `S3_REGION` (must be set: request signing needs a region value, even on S3-compatible providers)
- `S3_ACCESS_KEY_ID` (secret)
- `S3_SECRET_ACCESS_KEY` (secret)
- `S3_FORCE_PATH_STYLE` (`true/false`)
- `S3_INSECURE` (`true/false`, only allowed for local MinIO)
### Buckets and Prefixes
- `S3_BUCKET_DOCS` (required everywhere)
- `S3_PREFIX_DOCS` (default `docs/`)

Optional (later milestones):
- `S3_BUCKET_LOKI`, `S3_PREFIX_LOKI`
- `S3_BUCKET_TEMPO`, `S3_PREFIX_TEMPO`
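
The contract above could be validated at startup roughly as follows. This is a minimal sketch, not the Control API implementation: `S3Config`, `ConfigError`, and `from_map` are hypothetical names, treating `S3_REGION` as strictly required is an assumption, and so is the rule that a plain `http://` endpoint is rejected unless `S3_INSECURE=true`. Only the documents bucket and prefix are covered.

```rust
use std::collections::HashMap;

// Debug is deliberately not derived here, so credentials cannot be
// printed by accident through `{:?}`.
pub struct S3Config {
    pub endpoint: String,
    pub region: String,
    pub access_key_id: String,
    pub secret_access_key: String,
    pub force_path_style: bool,
    pub insecure: bool,
    pub bucket_docs: String,
    pub prefix_docs: String,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    InvalidBool(&'static str),
    InsecureEndpoint,
}

/// Reads a variable that must be present and non-empty.
fn required(env: &HashMap<String, String>, name: &'static str) -> Result<String, ConfigError> {
    match env.get(name).map(|v| v.trim()) {
        Some(v) if !v.is_empty() => Ok(v.to_string()),
        _ => Err(ConfigError::Missing(name)),
    }
}

/// Reads an optional `true`/`false` flag; unset or empty means `false`.
fn flag(env: &HashMap<String, String>, name: &'static str) -> Result<bool, ConfigError> {
    match env.get(name).map(|v| v.trim()) {
        None | Some("") | Some("false") => Ok(false),
        Some("true") => Ok(true),
        Some(_) => Err(ConfigError::InvalidBool(name)),
    }
}

impl S3Config {
    pub fn from_map(env: &HashMap<String, String>) -> Result<Self, ConfigError> {
        let endpoint = required(env, "S3_ENDPOINT")?;
        let insecure = flag(env, "S3_INSECURE")?;
        // Assumption: plain HTTP is accepted only with S3_INSECURE=true.
        if endpoint.starts_with("http://") && !insecure {
            return Err(ConfigError::InsecureEndpoint);
        }
        Ok(S3Config {
            endpoint,
            region: required(env, "S3_REGION")?,
            access_key_id: required(env, "S3_ACCESS_KEY_ID")?,
            secret_access_key: required(env, "S3_SECRET_ACCESS_KEY")?,
            force_path_style: flag(env, "S3_FORCE_PATH_STYLE")?,
            insecure,
            bucket_docs: required(env, "S3_BUCKET_DOCS")?,
            // Falls back to the documented default prefix.
            prefix_docs: env
                .get("S3_PREFIX_DOCS")
                .cloned()
                .unwrap_or_else(|| "docs/".to_string()),
        })
    }
}
```

In a service, the map could be built with `std::env::vars().collect()`; taking a map rather than reading the process environment directly keeps the validation easy to unit test.
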
## Target Architecture
### Local Development
- MinIO is part of the local stack for parity.
- Control API is the document gateway:
  - Upload/download via signed URLs or streamed proxy endpoints
  - Metadata stored in existing storage/KV (document index) or derived from key scheme
### Production
- Hetzner Object Storage provides S3-compatible bucket(s).
- Credentials and bucket details injected via Swarm secrets and stack env.
## Development Plan (Milestones by Dependency)
## Milestone 0: S3 Contract + Local MinIO Baseline
### Dependencies
- None
### Goal
Provide a consistent local S3-compatible endpoint and stable bucket naming to unblock higher milestones.
### Tasks
- [ ] Add MinIO to local development stack:
  - [ ] Add `minio` service to compose (API + console)
  - [ ] Add `minio-init` job to create required buckets
- [ ] Define standard bucket/prefix defaults for local dev:
  - [ ] `S3_BUCKET_DOCS=cloudlysis-docs`
  - [ ] `S3_PREFIX_DOCS=docs/`
- [ ] Document local workflow to enable MinIO-backed document storage.
### Required Tests (Gate)
- [ ] Workspace verification commands
- [ ] Local manual verification checklist:
  - [ ] `cloudlysis-docs` bucket exists
  - [ ] credentials work from a container in the compose network
## Milestone 1: Document Storage API (Control API)
### Dependencies
- Milestone 0
### Goal
Make document storage a first-class platform API and require it in all environments.
### Tasks
- [ ] Add an S3 client module to Control API:
  - [ ] parse config from env with strict validation (endpoint, bucket, keys)
  - [ ] support path-style and TLS/insecure options
- [ ] Implement document primitives:
  - [ ] Put (upload) and Get (download)
  - [ ] List by prefix (tenant + doc-type)
  - [ ] Delete (admin-only) if needed
- [ ] Decide and document a key scheme:
  - [ ] tenant-scoped prefix
  - [ ] immutable keys preferred (content hash + metadata)
- [ ] Add authz rules for document operations (deny-by-default, tenant-scoped).
### Required Tests (Gate)
- [ ] Workspace verification commands
- [ ] Unit tests:
  - [ ] config parsing/validation
  - [ ] key generation stability
- [ ] Gated integration tests (MinIO):
  - [ ] put/get roundtrip
  - [ ] list by prefix
  - [ ] tenant isolation (cannot read other tenant prefix)
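
The deny-by-default, tenant-scoped rule and the tenant isolation test could be expressed along these lines. This is a sketch under assumptions: `DocOp`, `Caller`, and `is_allowed` are hypothetical names, keys are assumed to follow a `<base-prefix><tenant>/...` layout with a base prefix ending in `/`, and tenant identifiers are assumed to be validated elsewhere.

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum DocOp {
    Put,
    Get,
    List,
    Delete,
}

pub struct Caller<'a> {
    pub tenant: &'a str,
    pub is_admin: bool,
}

/// Deny-by-default check: returns true only when a rule explicitly allows
/// the operation. `key_or_prefix` is the object key (Put/Get/Delete) or the
/// listing prefix (List).
pub fn is_allowed(caller: &Caller, op: DocOp, key_or_prefix: &str, base_prefix: &str) -> bool {
    // Everything a caller may touch lives under `<base_prefix><tenant>/`.
    // The trailing slash matters: tenant "acme" must not match "acme2/...".
    let scope = format!("{}{}/", base_prefix, caller.tenant);
    if caller.tenant.is_empty() || !key_or_prefix.starts_with(&scope) {
        return false;
    }
    match op {
        DocOp::Put | DocOp::Get | DocOp::List => true,
        // Delete is admin-only, matching the task list above.
        DocOp::Delete => caller.is_admin,
    }
}
```

A unit test can cover this check directly, while the gated MinIO integration test confirms the same behavior end to end.
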
## Milestone 2: Control UI Integration (Upload/Download Flows)
### Dependencies
- Milestone 1
### Goal
Make document workflows usable from the Control UI without leaking credentials.
### Tasks
- [ ] Add Control API endpoints for signed URLs (recommended) or streamed proxy:
  - [ ] create upload URL (PUT)
  - [ ] create download URL (GET)
- [ ] Implement Control UI flows for a first document type:
  - [ ] upload
  - [ ] list
  - [ ] download
- [ ] Ensure correlation/trace propagation on Control API operations.
### Required Tests (Gate)
- [ ] Workspace verification commands
- [ ] Control UI unit tests for routing/component render stability
- [ ] Gated end-to-end checklist (local):
  - [ ] upload appears in list
  - [ ] download returns expected bytes
## Milestone 3: Production Rollout (Hetzner)
### Dependencies
- Milestone 2
### Goal
Deploy document storage on Hetzner S3-compatible backend with production-grade secret handling.
### Tasks
- [ ] Provision buckets and lifecycle policies (docs bucket):
  - [ ] retention rules appropriate to documents
  - [ ] access policy scoped to required actions
- [ ] Swarm deployment:
  - [ ] add secrets for access keys
  - [ ] configure Control API with endpoint/region/bucket/prefix
- [ ] Rollback plan:
  - [ ] switch to a fallback bucket or MinIO-on-prod if needed
### Required Tests (Gate)
- [ ] Workspace verification commands
- [ ] Production smoke runbook:
  - [ ] upload/list/download for a tenant
  - [ ] verify objects exist under expected prefixes
## Milestone 4 (Optional): Observability Storage on S3 (Loki + Tempo)
### Dependencies
- Milestone 3
### Goal
Store logs and traces in S3-compatible storage (MinIO locally; Hetzner in production).
### Tasks
- [ ] Loki:
  - [ ] add S3 config variant and compose overlay
  - [ ] validate log query and bucket objects
- [ ] Tempo:
  - [ ] add S3 config variant and compose overlay
  - [ ] validate traces and bucket objects
### Required Tests (Gate)
- [ ] Workspace verification commands
- [ ] Gated local validation:
  - [ ] Loki writes objects to bucket/prefix after ingest
  - [ ] Tempo writes objects to bucket/prefix after ingest
## Workspace Verification Commands
- `cargo fmt --check`
- `cargo clippy --workspace --all-targets -- -D warnings`
- `cargo test --workspace`
- `cd control/ui && npm ci && npm run lint && npm run typecheck && npm run test && npm run build`