docs: add docs folder (architecture, developer, usage); update README; wire probe TTL cache + concurrency notes into docs

2026-03-30 14:32:47 +03:00
parent 90c307016d
commit e9a0142396
12 changed files with 203 additions and 805 deletions

18
docs/README.md Normal file

@@ -0,0 +1,18 @@
# Cloudlysis Documentation
## Sections
- Architecture
  - docs/architecture/overview.md
  - docs/architecture/transport.md
- Developer
  - docs/developer/setup.md
  - docs/developer/testing.md
- Usage
  - docs/usage/quickstart.md
  - docs/usage/api.md
  - docs/usage/nats.md
## Conventions
- The HTTP edge remains JSON over REST, served by the Gateway.
- Internal RPC uses gRPC between Gateway and nodes (Aggregate, Projection, Runner).
- Async backbone uses NATS JetStream and KV with standardized subjects, headers, and stream/consumer policies.


@@ -0,0 +1,21 @@
# Architecture Overview
## Monorepo
- Rust workspace: aggregate, projection, runner, gateway, control/api, shared
- Frontend: control/ui
- Infra: docker, observability, swarm
## Data Flow
- Clients → Gateway (HTTP/JSON)
- Gateway ↔ Nodes (gRPC)
- Nodes ↔ NATS (JetStream + KV)
## Services
- Aggregate: command handling + event sourcing; publishes events to JetStream
- Projection: materialized views; consumes aggregate events; exposes QueryService (gRPC)
- Runner: workflow/saga engine + effects/outbox; exposes RunnerAdmin (gRPC)
- Gateway: edge, authn/z, routing to nodes, admin entry points
## Observability
- /health, /ready, /metrics on all services
- Correlation and tracing propagated across HTTP, gRPC, and NATS


@@ -0,0 +1,23 @@
# Transport Contracts
## Context
- Tenant: HTTP x-tenant-id; NATS tenant-id
- Correlation: HTTP x-correlation-id; NATS x-correlation-id + correlation-id
- Trace: HTTP traceparent; NATS traceparent + trace-id
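
The header mapping above can be sketched in a few lines of Rust. This is a minimal illustration, not the code in `shared`: the `RequestContext` type and both function names are hypothetical, and it assumes the NATS `trace-id` header carries the trace id field of the W3C `traceparent` value (`version-traceid-parentid-flags`).

```rust
/// Request context carried across hops. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct RequestContext {
    pub tenant_id: String,
    pub correlation_id: String,
    pub traceparent: String,
}

/// Extracts the 32-hex-digit trace id from a `traceparent` value.
/// Lenient sketch: checks field lengths and hex digits, returns `None` otherwise.
pub fn trace_id(traceparent: &str) -> Option<&str> {
    let mut parts = traceparent.split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_id = parts.next()?;
    let flags = parts.next()?;
    let is_hex = |s: &str| s.bytes().all(|b| b.is_ascii_hexdigit());
    let ok = version.len() == 2
        && trace_id.len() == 32
        && parent_id.len() == 16
        && flags.len() == 2
        && is_hex(version)
        && is_hex(trace_id)
        && is_hex(parent_id)
        && is_hex(flags)
        // An all-zero trace id is invalid.
        && trace_id.bytes().any(|b| b != b'0');
    if ok {
        Some(trace_id)
    } else {
        None
    }
}

/// Maps the context to the NATS header names listed above.
/// `trace-id` is only added when the traceparent parses.
pub fn nats_headers(ctx: &RequestContext) -> Vec<(&'static str, String)> {
    let mut headers = vec![
        ("tenant-id", ctx.tenant_id.clone()),
        ("x-correlation-id", ctx.correlation_id.clone()),
        ("correlation-id", ctx.correlation_id.clone()),
        ("traceparent", ctx.traceparent.clone()),
    ];
    if let Some(id) = trace_id(&ctx.traceparent) {
        headers.push(("trace-id", id.to_string()));
    }
    headers
}
```

Writing the correlation id under both names keeps consumers working whichever header they read.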
## Internal RPC (gRPC)
- Aggregate: CommandService (submit commands)
- Projection: QueryService (execute queries)
- Runner: RunnerAdmin (drain, status, reload)
- All calls set deadlines and propagate context metadata
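
As a rough sketch of what "set deadlines" can mean in practice: the Gateway fixes one absolute deadline per inbound request and hands each downstream call only the time that is left. The `Deadline` type and `grpc_timeout_value` helper below are hypothetical. gRPC libraries normally encode the `grpc-timeout` header themselves, so the helper only shows the wire format (an integer of at most 8 digits followed by a unit, here `m` for milliseconds).

```rust
use std::time::{Duration, Instant};

/// Absolute deadline for one request. Hypothetical helper type.
#[derive(Debug, Clone, Copy)]
pub struct Deadline(Instant);

impl Deadline {
    /// Deadline that expires `budget` from now.
    pub fn after(budget: Duration) -> Self {
        Deadline(Instant::now() + budget)
    }

    /// Time left for downstream calls, or `None` once the deadline has passed.
    pub fn remaining(&self) -> Option<Duration> {
        let left = self.0.saturating_duration_since(Instant::now());
        if left.is_zero() {
            None
        } else {
            Some(left)
        }
    }
}

/// Encodes a timeout as a `grpc-timeout` header value in milliseconds.
/// The value is clamped to 8 digits, the maximum the protocol allows.
pub fn grpc_timeout_value(timeout: Duration) -> String {
    let ms = timeout.as_millis().min(99_999_999);
    format!("{ms}m")
}
```

A caller would check `remaining()` before each hop and fail fast with a deadline-exceeded error when it returns `None`.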
## NATS JetStream
- Streams:
  - AGGREGATE_EVENTS: tenant.*.aggregate.*.*
  - WORKFLOW_COMMANDS, WORKFLOW_EVENTS
- Producers set headers: tenant-id, Nats-Msg-Id, x-correlation-id/correlation-id, traceparent/trace-id
- Consumers use AckPolicy::Explicit, bounded ack_wait, max_deliver, max_ack_pending
## Routing
- Gateway routes per-tenant to shards for Aggregate/Projection/Runner
- Routing tables hot-reload atomically
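
One common way to get an atomic hot-reload is to treat each routing table as an immutable snapshot and swap a shared pointer to it. The sketch below uses only the standard library. `RoutingTable`, `Router`, and the shard addresses are hypothetical, and the Gateway's actual mechanism may differ.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Immutable snapshot mapping a tenant id to a shard address.
#[derive(Debug, Default)]
pub struct RoutingTable {
    shards: HashMap<String, String>,
}

impl RoutingTable {
    pub fn new(entries: &[(&str, &str)]) -> Self {
        let shards = entries
            .iter()
            .map(|(tenant, addr)| (tenant.to_string(), addr.to_string()))
            .collect();
        RoutingTable { shards }
    }

    pub fn shard_for(&self, tenant: &str) -> Option<&str> {
        self.shards.get(tenant).map(String::as_str)
    }
}

/// Holds the current table; `reload` replaces it in one step.
#[derive(Default)]
pub struct Router {
    current: RwLock<Arc<RoutingTable>>,
}

impl Router {
    /// Cheap snapshot. A request keeps the table it loaded even if a
    /// reload happens while it is in flight.
    pub fn snapshot(&self) -> Arc<RoutingTable> {
        self.current.read().unwrap().clone()
    }

    /// Swaps in a fully built table. Readers see the old table or the
    /// new one, never a partially updated mix.
    pub fn reload(&self, next: RoutingTable) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

Building the new table before taking the write lock keeps the lock held only for the pointer swap.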

26
docs/developer/setup.md Normal file

@@ -0,0 +1,26 @@
# Developer Setup
## Prerequisites
- Rust toolchain (stable)
- Node.js (LTS) for control/ui
- Docker (optional) for local stack
## Build
```bash
cargo build
cd control/ui && npm ci && npm run build
```
## Workspace Verification
```bash
cargo fmt --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
cd control/ui && npm ci && npm run lint && npm run typecheck && npm run test && npm run build
```
## Environment
- Gateway: routing config loaded from a file or from NATS KV
- Projection: PROJECTION_GRPC_ADDR
- Runner: RUNNER_GRPC_ADDR
- NATS: URLs via service-specific settings

27
docs/developer/testing.md Normal file

@@ -0,0 +1,27 @@
# Testing
## Unit and Integration
```bash
cargo test --workspace
```
## Gated Tests (require external services)
- Runner NATS:
```bash
RUNNER_TEST_NATS_URL=nats://127.0.0.1:4222 cargo test -p runner -- --ignored
```
- Projection NATS:
```bash
PROJECTION_TEST_NATS_URL=nats://127.0.0.1:4222 cargo test -p projection -- --ignored
```
- Docker-based gates:
```bash
cargo test -p gateway -- --ignored
```
## Control UI
```bash
cd control/ui
npm ci
npm run test
```

38
docs/usage/api.md Normal file

@@ -0,0 +1,38 @@
# Usage: API Examples
## Projection Query via Gateway (HTTP → gRPC)
```bash
# Send the body as JSON; curl's -d defaults to a form-encoded content type.
curl -sS -X POST \
  -H "content-type: application/json" \
  -H "x-tenant-id: tenant-a" \
  -H "x-correlation-id: demo" \
  -H "traceparent: 00-00000000000000000000000000000001-0000000000000001-01" \
  http://localhost:8080/v1/query/User \
  -d '{"uqf":"{\"eq\":{\"id\":\"u1\"}}"}'
```
## Projection Query via gRPC (direct, internal)
```bash
grpcurl -d '{"tenant_id":"tenant-a","view_type":"User","uqf":"{}"}' \
-H 'x-tenant-id: tenant-a' \
-H 'x-correlation-id: demo' \
-H 'traceparent: 00-00000000000000000000000000000001-0000000000000001-01' \
-plaintext localhost:9090 projection.gateway.v1.QueryService/ExecuteQuery
```
## Aggregate Command via Gateway (HTTP → gRPC)
```bash
# Send the body as JSON; curl's -d defaults to a form-encoded content type.
curl -sS -X POST \
  -H "content-type: application/json" \
  -H "x-tenant-id: tenant-a" \
  -H "x-correlation-id: demo" \
  -H "traceparent: 00-00000000000000000000000000000001-0000000000000001-01" \
  http://localhost:8080/v1/aggregate/BankAccount/command \
  -d '{"id":"acc-1","command_type":"Open","payload":{"owner":"Alice"}}'
```
## Runner Admin via Gateway (HTTP → gRPC)
```bash
# Quote the URL so the shell does not treat "?" as a glob character.
curl -sS -X POST \
  -H "x-tenant-id: tenant-a" \
  -H "authorization: Bearer <token>" \
  "http://localhost:8080/admin/runner/drain?wait_ms=0"
```

17
docs/usage/nats.md Normal file

@@ -0,0 +1,17 @@
# NATS Reference
## Subjects
- Aggregate events: tenant.<tenant>.aggregate.<type>.<id>
- Workflow commands/events: shared helpers define exact formats
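
To make the aggregate subject layout concrete, here is a small sketch that builds a subject and checks it against a stream filter such as `tenant.*.aggregate.*.*`. Both function names are hypothetical and are not the helpers in `shared`. The matcher follows NATS wildcard rules (`*` matches exactly one token, a trailing `>` matches one or more) and ignores edge cases such as empty tokens.

```rust
/// Builds `tenant.<tenant>.aggregate.<type>.<id>`.
/// Returns `None` if any token is empty or contains a character that
/// would change the subject's shape ('.', '*', '>', whitespace).
pub fn aggregate_subject(tenant: &str, agg_type: &str, id: &str) -> Option<String> {
    let valid = |t: &str| {
        !t.is_empty()
            && !t
                .chars()
                .any(|c| c == '.' || c == '*' || c == '>' || c.is_whitespace())
    };
    if [tenant, agg_type, id].iter().all(|t| valid(*t)) {
        Some(format!("tenant.{tenant}.aggregate.{agg_type}.{id}"))
    } else {
        None
    }
}

/// Token-wise subject match against a filter.
/// Assumes `>` only appears as the last filter token.
pub fn subject_matches(filter: &str, subject: &str) -> bool {
    let mut f = filter.split('.');
    let mut s = subject.split('.');
    loop {
        match (f.next(), s.next()) {
            (Some(">"), Some(_)) => return true,
            (Some("*"), Some(_)) => {}
            (Some(a), Some(b)) if a == b => {}
            (None, None) => return true,
            _ => return false,
        }
    }
}
```

Rejecting dots in the tenant or id keeps every aggregate subject at exactly five tokens, which is what the `tenant.*.aggregate.*.*` filter relies on.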
## Headers (Producers)
- tenant-id: required
- Nats-Msg-Id: idempotency key (event_id, command_id, etc.)
- x-correlation-id and correlation-id
- traceparent and trace-id
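
`Nats-Msg-Id` works as an idempotency key because JetStream discards a message whose id it has already seen within the stream's duplicate window. The toy model below only illustrates that behaviour so the effect of a producer retry is easy to see. `DedupWindow` is a hypothetical type, not the server's implementation, and the window length is an arbitrary example value.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Toy model of a duplicate window keyed by `Nats-Msg-Id`.
pub struct DedupWindow {
    window: Duration,
    seen: HashMap<String, Instant>,
}

impl DedupWindow {
    pub fn new(window: Duration) -> Self {
        DedupWindow {
            window,
            seen: HashMap::new(),
        }
    }

    /// Returns `true` if the message is accepted, `false` if the same id
    /// was already accepted within the window (a retry or duplicate).
    pub fn accept(&mut self, msg_id: &str, now: Instant) -> bool {
        // Forget ids that have aged out of the window.
        let window = self.window;
        self.seen
            .retain(|_, at| now.saturating_duration_since(*at) < window);
        if self.seen.contains_key(msg_id) {
            return false;
        }
        self.seen.insert(msg_id.to_string(), now);
        true
    }
}
```

A producer that republishes `event_id` after a timeout is therefore safe as long as the retry lands inside the window. Later duplicates have to be handled by idempotent consumers.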
## Consumers
- AckPolicy::Explicit
- ack_wait: bounded timeout
- max_deliver: bounded
- max_ack_pending: aligned with concurrency
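
A sketch of how these consumer bounds might be derived in one place. `ConsumerPolicy` is a hypothetical plain struct rather than a NATS client type. It assumes explicit acks and reads "aligned with concurrency" as `max_ack_pending` equal to the number of worker tasks, which is one reasonable interpretation and not a confirmed rule.

```rust
use std::time::Duration;

/// Bounds applied to a consumer that uses explicit acks.
#[derive(Debug, Clone, PartialEq)]
pub struct ConsumerPolicy {
    /// How long the server waits for an ack before redelivering.
    pub ack_wait: Duration,
    /// Maximum delivery attempts per message.
    pub max_deliver: i64,
    /// Maximum unacknowledged messages in flight.
    pub max_ack_pending: i64,
}

impl ConsumerPolicy {
    /// Derives the policy from the worker count so the server never
    /// hands out more messages than the workers can hold.
    pub fn for_workers(
        workers: u32,
        ack_wait: Duration,
        max_deliver: i64,
    ) -> Result<Self, String> {
        if workers == 0 {
            return Err("workers must be at least 1".to_string());
        }
        if ack_wait.is_zero() {
            return Err("ack_wait must be a bounded, non-zero timeout".to_string());
        }
        if max_deliver < 1 {
            return Err("max_deliver must be at least 1".to_string());
        }
        Ok(ConsumerPolicy {
            ack_wait,
            max_deliver,
            max_ack_pending: i64::from(workers),
        })
    }
}
```

The resulting values would then be copied into the client library's consumer configuration alongside `AckPolicy::Explicit`.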

26
docs/usage/quickstart.md Normal file

@@ -0,0 +1,26 @@
# Quick Start
## Compose
```bash
docker compose up -d --build
```
Full stack with observability:
```bash
docker compose -f docker-compose.yml -f observability/docker-compose.yml up -d --build
```
## Local Dev (minimal)
- Start NATS locally
- Run services:
```bash
# Each command runs a long-lived service and blocks; use one terminal per service.
cargo run -p aggregate
cargo run -p projection
cargo run -p runner
cargo run -p gateway
```
## Verify
- Gateway: GET /health, /ready, /metrics
- Aggregate: GET /health, /ready, /metrics
- Projection: GET /health, /ready, /metrics
- Runner: GET /health, /ready, /metrics