Monorepo consolidation: workspace, shared types, transport plans, docker/swam assets
109
projection/prd.md
Normal file
@@ -0,0 +1,109 @@
The **Projection** is the "Read Side" of your CQRS (Command Query Responsibility Segregation) architecture. While Aggregates focus on **writing** valid data, Projections focus on **reading** and **formatting** that data for the end-user or application.

In your framework, Projections are **event-driven views** that transform the stream of facts from **NATS JetStream** into highly optimized state in the `edge-storage` `KvStore`, queryable via the embedded `query-engine` (UQF).

---

### 🧱 Component: Projection (Read Model)

**Definition:**

A Projection is a standalone Rust-based container that consumes Events from **NATS JetStream** and incrementally updates one or more "Read Models" in `edge-storage`. Its sole purpose is to provide a high-performance, pre-computed view of the system state that is optimized for specific queries, bypassing the need to rehydrate Aggregate state or replay event streams at query time.

**Multi-Tenancy:**

The Projection supports optional multi-tenancy via `tenant_id`. When enabled:

- **Subject Naming:** JetStream subjects include `tenant_id` (e.g., `tenant.<tenant_id>.aggregate.<aggregate_type>.<aggregate_id>`)
- **Storage Namespacing:** Views and checkpoints are namespaced by `tenant_id` to prevent cross-tenant reads
- **Query Isolation:** Queries are tenant-scoped (e.g., via an `x-tenant-id` header) and only scan tenant-prefixed keys
- **Backward Compatibility:** Deployments without multi-tenancy use a default/empty `tenant_id`

**Dependencies:**

* Core crates pulled from the custom Cargo registry:

```toml
[registries.madapes]
index = "sparse+https://git.madapes.com/api/packages/madapes/cargo/"
```

| Crate | Purpose |
|-------|---------|
| `edge-storage` | libmdbx-backed `KvStore` for durable view storage |
| `runtime-function` | Deterministic DAG execution for `project` programs |
| `edge-logger` | High-performance logging (UDS + Protobuf, Loki sink) |
| `query-engine` | UQF query support for filtering/querying view state |
| `async-nats` | NATS JetStream client for event consumption |

* Source code available at `../../madapes/`
* **Note:** This is a standalone container; it does not use `event-bus` or the gRPC `Consume`/`FetchBatch` APIs

#### 1. Core Responsibilities

* **Event Consumption:** Subscribes to one or more JetStream subjects (typically Aggregate event subjects) using a durable consumer, filtering with subject wildcards.
* **State Transformation:** Uses a `project` program (`runtime-function` DAG) to map an incoming event to a state change (e.g., `IncrementCounter`, `UpdateUserEmail`, `AddToList`).
* **Read Model Persistence:** Stores the resulting "View" in `edge-storage` `KvStore` as a JSON document, keyed by `view:{tenant_id}:{view_type}:{view_id}` (e.g., `view:tenant_a:UserDashboard:user_123`).
* **Query Serving:** Provides read access via `query-engine` UQF queries. The existing `KvStore::query()` integration performs prefix scans and applies UQF filters/sorts.
* **Checkpointing:** Tracks its stream position (JetStream stream sequence) in `edge-storage` `KvStore` (key: `checkpoint:{tenant_id}:{view_type}`) to resume correctly after a restart.
* **Safe Acknowledgement:** Acks JetStream messages only after the view update and checkpoint are durably committed.
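
The subject, view-key, and checkpoint-key formats named above can be captured in a few small helpers. This is a minimal sketch, not part of any madapes crate: the function names and the `"default"` tenant label are assumptions (the PRD only says "default/empty"), while the key and subject layouts follow the formats given in this document.

```rust
/// Tenant label used when multi-tenancy is disabled. The PRD only says
/// "default/empty", so the literal "default" is an assumption.
const DEFAULT_TENANT: &str = "default";

/// Falls back to the default tenant when no tenant is configured.
fn effective_tenant(tenant_id: Option<&str>) -> &str {
    match tenant_id {
        Some(t) if !t.is_empty() => t,
        _ => DEFAULT_TENANT,
    }
}

/// Subject filter covering every aggregate event of one tenant.
/// `>` is the NATS wildcard that matches one or more trailing tokens.
fn tenant_subject_filter(tenant_id: Option<&str>) -> String {
    format!("tenant.{}.aggregate.>", effective_tenant(tenant_id))
}

/// Key of a single View record: `view:{tenant_id}:{view_type}:{view_id}`.
fn view_key(tenant_id: Option<&str>, view_type: &str, view_id: &str) -> String {
    format!("view:{}:{}:{}", effective_tenant(tenant_id), view_type, view_id)
}

/// Prefix used for tenant-scoped scans over one view type.
fn view_prefix(tenant_id: Option<&str>, view_type: &str) -> String {
    format!("view:{}:{}:", effective_tenant(tenant_id), view_type)
}

/// Key of the checkpoint record: `checkpoint:{tenant_id}:{view_type}`.
fn checkpoint_key(tenant_id: Option<&str>, view_type: &str) -> String {
    format!("checkpoint:{}:{}", effective_tenant(tenant_id), view_type)
}
```

Building every key through one set of helpers keeps the tenant prefix consistent between writes, checkpoint reads, and prefix scans, which is what tenant isolation depends on.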
#### 2. The Lifecycle of a Projection Update

1. **Ingestion:** The Projection receives a JetStream message whose payload is a `FrameworkEnvelope` (or equivalent event envelope). It extracts the message metadata (at minimum, the JetStream **stream sequence**) used for idempotency.
2. **Context Loading:**
    * The Projection fetches the current "View" from `edge-storage` `KvStore` (e.g., `kv.get("view:tenant_a:UserDashboard:user_123")`).
3. **Transformation (`runtime-function`):**
    * It executes the `project` DAG program: `(current_view_state, incoming_event) → new_view_state`.
    * Where needed, it can also use `KvStore::query()` (with `query-engine` UQF) to perform cross-projection lookups while building the new state.
4. **Atomic Update:**
    * The Projection saves the `new_view_state` back to `edge-storage` `KvStore`.
    * **Critical:** It must save the **checkpoint** (JetStream stream sequence) as part of the same MDBX transaction (e.g., `kv.put_sync("checkpoint:tenant_a:UserDashboard", stream_sequence)`). This ensures crash-recovery correctness.
5. **Acknowledge:** After the transaction commits, the Projection acks the JetStream message so it will not be redelivered.
6. **Query Availability:** The updated state is immediately available for applications to query via `query-engine` UQF queries.
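
The steps above can be sketched end to end with an in-memory stand-in for the store. Everything here is illustrative: `MemKv`, `Event`, `Outcome`, `project`, and `handle` are hypothetical names, the view state is reduced to a single counter, and `MemKv::commit` merely imitates a single MDBX transaction. The real `edge-storage`, `runtime-function`, and `async-nats` APIs are not shown in this document and are not used.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the `edge-storage` `KvStore`.
#[derive(Default)]
struct MemKv {
    entries: HashMap<String, String>,
}

impl MemKv {
    fn get(&self, key: &str) -> Option<&String> {
        self.entries.get(key)
    }

    /// Applies all writes together, imitating one MDBX transaction.
    fn commit(&mut self, writes: Vec<(String, String)>) {
        for (key, value) in writes {
            self.entries.insert(key, value);
        }
    }
}

/// Simplified event: the stream sequence plus a counter delta.
struct Event {
    stream_sequence: u64,
    view_id: String,
    delta: i64,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Applied,
    Duplicate,
}

/// Stand-in for the `project` program:
/// (current_view_state, incoming_event) -> new_view_state.
fn project(current: Option<i64>, event: &Event) -> i64 {
    current.unwrap_or(0) + event.delta
}

/// Steps 1 to 4. The caller acks the message (step 5) only after this
/// returns; duplicates are acked too so they stop being redelivered.
fn handle(kv: &mut MemKv, tenant: &str, view_type: &str, event: &Event) -> Outcome {
    let checkpoint_key = format!("checkpoint:{}:{}", tenant, view_type);
    let last_sequence = kv
        .get(&checkpoint_key)
        .and_then(|raw| raw.parse::<u64>().ok())
        .unwrap_or(0);
    if event.stream_sequence <= last_sequence {
        return Outcome::Duplicate; // already reflected in the view
    }

    let view_key = format!("view:{}:{}:{}", tenant, view_type, event.view_id);
    let current = kv.get(&view_key).and_then(|raw| raw.parse::<i64>().ok());
    let new_state = project(current, event);

    // View and checkpoint are written in the same commit.
    kv.commit(vec![
        (view_key, new_state.to_string()),
        (checkpoint_key, event.stream_sequence.to_string()),
    ]);
    Outcome::Applied
}
```

Calling `handle` twice with the same event changes the view only once: the second call sees a checkpoint at or above the event's sequence and returns `Outcome::Duplicate`.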
#### 3. Technical Constraints & Guarantees

* **Eventual Consistency:** Projections are inherently "behind" the Aggregate. There is a delay, usually sub-second, between an event being committed and the Projection reflecting that change.
* **Idempotency:** Since JetStream provides **at-least-once** delivery, the Projection must use its stored **Checkpoint** (stream sequence) to ignore events it has already processed.
* **Disposable & Rebuildable:** Because JetStream is a durable log, Projections are "disposable." If a business requirement changes, you can delete a Projection's KV entries, create a new `runtime-function` program, and **replay** the entire history from JetStream (starting from sequence 1) to build a new view from scratch.
* **Read-Only:** Projections never produce events or commands. They are strictly "sinks" for data.

#### 4. Replay & Recovery Model

* **Catch-up Mode:** When a new Projection is deployed (no checkpoint exists), it starts from the beginning of the JetStream stream (sequence 1) and consumes as fast as possible until it reaches the tail.
* **Live Mode:** Once caught up, it continues consuming in real time using the same durable consumer, relying on JetStream acks/redelivery for reliability.
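
The choice of starting position and the distinction between the two modes reduce to simple arithmetic on the checkpoint. The sketch below is an illustration only: `Mode`, `resume_sequence`, and `mode` are hypothetical names, and the stream's last sequence is assumed to be available to the caller (for example from stream info).

```rust
#[derive(Debug, PartialEq)]
enum Mode {
    CatchUp,
    Live,
}

/// First stream sequence to consume: 1 for a new Projection with no
/// checkpoint, otherwise the sequence right after the checkpoint.
fn resume_sequence(checkpoint: Option<u64>) -> u64 {
    checkpoint.map_or(1, |sequence| sequence + 1)
}

/// The Projection is catching up while events at or below the stream's
/// last sequence are still unprocessed.
fn mode(checkpoint: Option<u64>, stream_last_sequence: u64) -> Mode {
    if resume_sequence(checkpoint) <= stream_last_sequence {
        Mode::CatchUp
    } else {
        Mode::Live
    }
}
```

Because both modes use the same durable consumer, `Mode` here would only drive things like metrics or batching limits, not a different consumption path.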
#### 5. Snapshots (Relationship to Aggregates)

The Projection does not require Aggregate snapshots to function, because its source of truth for changes is the JetStream event stream. However, snapshots are still relevant in two ways:

* **Aggregate Snapshots (Write Side):** Aggregates persist versioned snapshots in `edge-storage` `AggregateStore` to speed up Aggregate rehydration. These snapshots are not a read API for projections and should not be treated as a substitute for consuming events.
* **Projection State (Read Side):** A Projection’s stored View in `edge-storage` `KvStore` is effectively its own “snapshot” of the read model at a specific checkpoint (stream sequence).
    * **Fast Recovery:** On restart, the Projection loads `checkpoint:{tenant_id}:{view_type}`, resumes JetStream consumption from the next sequence, and continues updating existing View records in place. No replay is required unless the checkpoint is missing or the view schema/logic has changed.
    * **Optional Seeding:** For very large histories, a Projection may optionally seed an initial View state from a recent Aggregate snapshot or an external export, then set its checkpoint to a known JetStream stream sequence and continue consuming events forward from that point. This preserves incremental correctness while reducing rebuild time.

#### 6. Hot Provisioning (Rolling Scale + Rolling Upgrades)

Projections are designed to be provisioned and updated without downtime.

* **Hot Scale-Out:** Multiple Projection replicas can run concurrently per `tenant_id` and `view_type`. JetStream consumer configuration is used so that each event is delivered to only one replica within a replica set at a time; redeliveries remain possible, so processing must still be idempotent.
* **Hot Restart:** A restarted instance resumes from the persisted checkpoint and continues consumption; recovery time is proportional to the gap between the checkpoint and the stream tail.
* **Hot Upgrade (Projection Logic):** To change a `project` program safely:
    * Deploy a new Projection version under a new `view_type` (or `view_type` + version suffix) with its own checkpoint.
    * Backfill by consuming from sequence 1 (or from a chosen seed sequence) until caught up.
    * Switch query routing from the old view keys to the new view keys.
    * Retire old view data and checkpoint after the cutover.
* **In-Place Migration:** If the schema change is backward compatible, a Projection may evolve the stored View shape incrementally while processing events, but this requires strict versioning in the View payload.
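
The hot-upgrade steps (versioned `view_type`, backfill, cutover) can be sketched as a small router that only switches once the new version has caught up. This is a hedged illustration: `QueryRouter`, `versioned_view_type`, `lag`, the `_v{n}` suffix format, and the `max_lag` threshold are all assumptions, not names or policies defined by the framework.

```rust
/// Versioned view type such as "UserDashboard_v2".
/// The `_v{n}` suffix format is an assumption.
fn versioned_view_type(base: &str, version: u32) -> String {
    format!("{}_v{}", base, version)
}

/// Number of events the backfilling Projection still has to process.
fn lag(checkpoint: u64, stream_last_sequence: u64) -> u64 {
    stream_last_sequence.saturating_sub(checkpoint)
}

/// Hypothetical router deciding which view keys queries should read.
struct QueryRouter {
    base: String,
    active_version: u32,
}

impl QueryRouter {
    /// Key prefix that queries for this tenant should scan right now.
    fn active_prefix(&self, tenant_id: &str) -> String {
        let view_type = versioned_view_type(&self.base, self.active_version);
        format!("view:{}:{}:", tenant_id, view_type)
    }

    /// Switches to `candidate_version` only if its backfill is within
    /// `max_lag` events of the stream tail. Returns true on cutover.
    fn try_cutover(
        &mut self,
        candidate_version: u32,
        candidate_checkpoint: u64,
        stream_last_sequence: u64,
        max_lag: u64,
    ) -> bool {
        if lag(candidate_checkpoint, stream_last_sequence) <= max_lag {
            self.active_version = candidate_version;
            true
        } else {
            false
        }
    }
}
```

Old view data and the old checkpoint would be deleted only after `try_cutover` has returned `true` and queries are confirmed to read the new prefix.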
#### 7. Caveats & Operational Notes

* **Ordering Guarantees:** JetStream preserves ordering per stream, but if the Projection processes messages concurrently it can violate per-entity ordering. If ordering matters for a `view_id`, enforce per-key serialization in the Projection.
* **At-Least-Once Reality:** Redeliveries can happen (network splits, ack timeouts, restarts). The Projection must be idempotent via checkpoint checks and/or per-event dedupe keyed by stream sequence.
* **Ack Discipline:** Never ack before the MDBX transaction commits. Treat “view update + checkpoint update + ack” as one logical commit.
* **Poison Messages:** A single malformed event or incompatible schema can stall a durable consumer. Define a policy for retries, quarantine, and alerting (including whether to skip and record the failure).
* **Schema Evolution:** Projection logic must be able to handle old event versions or explicitly version the stream/subjects. Projection View schemas also need versioning if you support in-place migrations.
* **Backpressure & Lag:** Catch-up replays can saturate storage and CPU. Monitor consumer lag, redeliveries, and processing latency; apply limits (max in-flight, batching) to protect the node.
* **Rebuild Semantics:** Rebuilds must delete both View keys and checkpoints for the target `tenant_id`/`view_type`. Partial deletes can create “mixed era” views.
* **Cross-View Lookups:** Using `KvStore::query()` to join across projections is convenient but can amplify read load and introduce consistency anomalies between views. Prefer event-local computation when possible.
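
Two of the caveats above lend themselves to a short sketch: per-key serialization and a poison-message policy. One possible approach, shown here as an assumption rather than the framework's design, is to hash each `view_id` onto a fixed number of worker lanes so that events for the same key are always handled by the same lane in order, and to quarantine a message once its delivery count reaches a limit. `lane_for`, `FailureAction`, and `on_failure` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps a `view_id` to one of `lanes` worker lanes. Events for the same
/// key always land on the same lane, so a lane that processes its queue
/// sequentially preserves per-key ordering.
fn lane_for(view_id: &str, lanes: usize) -> usize {
    assert!(lanes > 0, "at least one lane is required");
    let mut hasher = DefaultHasher::new();
    view_id.hash(&mut hasher);
    (hasher.finish() % lanes as u64) as usize
}

#[derive(Debug, PartialEq)]
enum FailureAction {
    /// Leave the message unacked so it is redelivered.
    Retry,
    /// Record the failure elsewhere and stop retrying this message.
    Quarantine,
}

/// Decides what to do after a processing failure, based on how many
/// times the message has been delivered so far.
fn on_failure(delivery_count: u64, max_deliveries: u64) -> FailureAction {
    if delivery_count < max_deliveries {
        FailureAction::Retry
    } else {
        FailureAction::Quarantine
    }
}
```

Whether a quarantined message may be skipped past the checkpoint is a policy decision the PRD leaves open; this sketch only classifies the failure.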
#### 8. Data Structure (The View Envelope)

* `view_id`: The unique key for the record (e.g., `user_id`). Used in the `KvStore` key: `view:{tenant_id}:{view_type}:{view_id}`.
* `view_type`: The name of the projection (e.g., `active_users_list`).
* `last_event_sequence`: The checkpoint (JetStream stream sequence) of the last event processed. Stored separately under `checkpoint:{tenant_id}:{view_type}`.
* `data`: The actual payload (JSON) optimized for the UI or API, stored as the `KvStore` value.
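
The envelope fields listed above map naturally onto a plain struct. This is a minimal sketch under stated assumptions: the struct and method names are hypothetical, `data` is held as raw JSON text to avoid assuming a particular JSON crate, and the key formats are the ones given in this document.

```rust
/// Hypothetical in-memory form of the View envelope.
#[derive(Debug, Clone, PartialEq)]
struct ViewEnvelope {
    /// Unique key of the record, e.g. a user id.
    view_id: String,
    /// Name of the projection, e.g. "active_users_list".
    view_type: String,
    /// JetStream stream sequence of the last event applied.
    last_event_sequence: u64,
    /// JSON payload stored as the `KvStore` value (kept as raw text here).
    data: String,
}

impl ViewEnvelope {
    /// `view:{tenant_id}:{view_type}:{view_id}`
    fn storage_key(&self, tenant_id: &str) -> String {
        format!("view:{}:{}:{}", tenant_id, self.view_type, self.view_id)
    }

    /// `checkpoint:{tenant_id}:{view_type}`, where the sequence is stored
    /// separately from the payload.
    fn checkpoint_key(&self, tenant_id: &str) -> String {
        format!("checkpoint:{}:{}", tenant_id, self.view_type)
    }
}
```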

---

### 💡 Key Distinction for your PRD:

In your framework, the **Projection** is where the "Distributed" part of the system becomes visible to the user.

* **Aggregates** are for **Consistency** (The Truth).
* **Projections** are for **Performance** (The Speed).