The Projection is the "Read Side" of your CQRS (Command Query Responsibility Segregation) architecture. While Aggregates focus on writing valid data, Projections focus on reading and formatting that data for the end-user or application.
In your framework, Projections are event-driven views that transform the stream of facts from NATS JetStream into highly optimized, queryable state in edge-storage KvStore, queryable via the embedded query-engine (UQF).
🧱 Component: Projection (Read Model)
Definition:
A Projection is a standalone Rust-based container that consumes Events from NATS JetStream and incrementally updates one or more "Read Models" in edge-storage. Its sole purpose is to provide a high-performance, pre-computed view of the system state that is optimized for specific queries, bypassing the need to rehydrate Aggregate state or replay event streams at query time.
Multi-Tenancy:
The Projection supports optional multi-tenancy via tenant_id. When enabled:
- Subject Naming: JetStream subjects include `tenant_id` (e.g., `tenant.<tenant_id>.aggregate.<aggregate_type>.<aggregate_id>`)
- Storage Namespacing: Views and checkpoints are namespaced by `tenant_id` to prevent cross-tenant reads
- Query Isolation: Queries are tenant-scoped (e.g., `x-tenant-id` header) and only scan tenant-prefixed keys
- Backward Compatibility: Deployments without multi-tenancy use a default/empty `tenant_id`
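The naming rules above can be illustrated with a few helpers. This is a minimal sketch rather than the framework's API: the function names and the `default` fallback tenant are assumptions made for illustration, and only the subject layout is taken from the text. The `*` wildcard matches exactly one subject token in NATS.

```rust
/// Hypothetical fallback used when multi-tenancy is disabled (an assumption;
/// the text only says a default/empty tenant_id is used).
const DEFAULT_TENANT: &str = "default";

/// Normalizes an optional tenant id to the value used in subjects and keys.
fn effective_tenant(tenant_id: Option<&str>) -> &str {
    match tenant_id {
        Some(t) if !t.is_empty() => t,
        _ => DEFAULT_TENANT,
    }
}

/// Builds the concrete subject for one aggregate instance.
fn event_subject(tenant_id: Option<&str>, aggregate_type: &str, aggregate_id: &str) -> String {
    format!(
        "tenant.{}.aggregate.{}.{}",
        effective_tenant(tenant_id),
        aggregate_type,
        aggregate_id
    )
}

/// Builds a consumer filter covering every instance of one aggregate type.
fn event_filter(tenant_id: Option<&str>, aggregate_type: &str) -> String {
    format!(
        "tenant.{}.aggregate.{}.*",
        effective_tenant(tenant_id),
        aggregate_type
    )
}
```

Tenant and aggregate identifiers should not contain `.`, since NATS treats it as the subject token separator.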
Dependencies:
- Core crates pulled from the custom Cargo registry:

      [registries.madapes]
      index = "sparse+https://git.madapes.com/api/packages/madapes/cargo/"

  | Crate | Purpose |
  | --- | --- |
  | `edge-storage` | libmdbx-backed `KvStore` for durable view storage |
  | `runtime-function` | Deterministic DAG execution for `project` programs |
  | `edge-logger` | High-performance logging (UDS + Protobuf, Loki sink) |
  | `query-engine` | UQF query support for filtering/querying view state |
  | `async-nats` | NATS JetStream client for event consumption |

- Source code available at `../../madapes/`
- Note: This is a standalone container; it does not use `event-bus` or gRPC `Consume`/`FetchBatch` APIs
1. Core Responsibilities
- Event Consumption: Subscribes to one or more JetStream subjects (typically Aggregate event subjects) using a durable consumer, filtering with subject wildcards.
- State Transformation: Uses a `project` program (`runtime-function` DAG) to map an incoming event to a state change (e.g., `IncrementCounter`, `UpdateUserEmail`, `AddToList`).
- Read Model Persistence: Stores the resulting "View" in `edge-storage` `KvStore` as a JSON document, keyed by `view:{tenant_id}:{view_type}:{view_id}` (e.g., `view:tenant_a:UserDashboard:user_123`).
- Query Serving: Provides read access via `query-engine` UQF queries. The existing `KvStore::query()` integration performs prefix scans and applies UQF filters/sorts.
- Checkpointing: Tracks its stream position (JetStream stream sequence) in `edge-storage` `KvStore` (key: `checkpoint:{tenant_id}:{view_type}`) to resume correctly after a restart.
- Safe Acknowledgement: Acks JetStream messages only after the view update and checkpoint are durably committed.
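The key layout used for Views and checkpoints can be captured in small builder functions. A minimal sketch, assuming hypothetical helper names (`view_key`, `checkpoint_key`, `view_prefix`); only the key formats themselves come from the text.

```rust
/// Key of a single View record: view:{tenant_id}:{view_type}:{view_id}.
fn view_key(tenant_id: &str, view_type: &str, view_id: &str) -> String {
    format!("view:{}:{}:{}", tenant_id, view_type, view_id)
}

/// Key of the checkpoint for one projection: checkpoint:{tenant_id}:{view_type}.
fn checkpoint_key(tenant_id: &str, view_type: &str) -> String {
    format!("checkpoint:{}:{}", tenant_id, view_type)
}

/// Prefix for scanning every View of one type within one tenant.
/// The trailing ':' is deliberate (see the note below the block).
fn view_prefix(tenant_id: &str, view_type: &str) -> String {
    format!("view:{}:{}:", tenant_id, view_type)
}
```

Ending the scan prefix with `:` keeps a scan for `UserDashboard` from also matching a view type such as `UserDashboardV2`.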
2. The Lifecycle of a Projection Update
- Ingestion: The Projection receives a JetStream message whose payload is a `FrameworkEnvelope` (or equivalent event envelope). It extracts the message metadata (at minimum, the JetStream stream sequence) used for idempotency.
- Context Loading:
  - The Projection fetches the current "View" from `edge-storage` `KvStore` (e.g., `kv.get("view:tenant_a:UserDashboard:user_123")`).
- Transformation (`runtime-function`):
  - It executes the `project` DAG program: `(current_view_state, incoming_event) → new_view_state`.
  - Alternatively, it can use `KvStore::query()` (with `query-engine` UQF) to perform cross-projection lookups to build the new state.
- Atomic Update:
  - The Projection saves the `new_view_state` back to `edge-storage` `KvStore`.
  - Critical: It must save the checkpoint (JetStream stream sequence) as part of the same MDBX transaction (e.g., `kv.put_sync("checkpoint:tenant_a:UserDashboard", stream_sequence)`). This ensures crash-recovery correctness.
- Acknowledge: After the transaction commits, the Projection acks the JetStream message so it will not be redelivered.
- Query Availability: The updated state is immediately available for applications to query via `query-engine` UQF queries.
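The steps above can be sketched end to end with an in-memory stand-in for the store. This is an illustration under stated assumptions, not the framework's implementation: `MemStore`, `Event`, `Outcome`, `handle_event`, and `append_payload` are hypothetical names, the map replaces `edge-storage` `KvStore`, and `commit` only models the single MDBX transaction. Acking the JetStream message is left to the caller, after `handle_event` returns.

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the KvStore (hypothetical; the real API may differ).
#[derive(Default)]
struct MemStore {
    entries: BTreeMap<String, String>,
}

impl MemStore {
    fn get(&self, key: &str) -> Option<&String> {
        self.entries.get(key)
    }

    /// Applies all writes together, modelling one storage transaction.
    fn commit(&mut self, writes: Vec<(String, String)>) {
        for (key, value) in writes {
            self.entries.insert(key, value);
        }
    }
}

/// Minimal event shape: the stream sequence plus what the view needs.
struct Event {
    stream_sequence: u64,
    view_id: String,
    payload: String,
}

/// Result of handling one message; the caller acks in both cases.
#[derive(Debug, PartialEq)]
enum Outcome {
    Applied,
    Duplicate,
}

/// Example `project` function: appends the payload to the current view.
fn append_payload(current: Option<&str>, event: &Event) -> String {
    match current {
        Some(previous) => format!("{},{}", previous, event.payload),
        None => event.payload.clone(),
    }
}

fn handle_event<F>(store: &mut MemStore, tenant: &str, view_type: &str, event: &Event, project: F) -> Outcome
where
    F: Fn(Option<&str>, &Event) -> String,
{
    let checkpoint_key = format!("checkpoint:{}:{}", tenant, view_type);
    let view_key = format!("view:{}:{}:{}", tenant, view_type, event.view_id);

    // Idempotency: skip anything at or below the stored checkpoint.
    let checkpoint: u64 = store
        .get(&checkpoint_key)
        .and_then(|raw| raw.parse().ok())
        .unwrap_or(0);
    if event.stream_sequence <= checkpoint {
        return Outcome::Duplicate;
    }

    // Context loading and transformation.
    let current = store.get(&view_key).map(|raw| raw.as_str());
    let new_view = project(current, event);

    // Atomic update: the view and the checkpoint are written together.
    store.commit(vec![
        (view_key, new_view),
        (checkpoint_key, event.stream_sequence.to_string()),
    ]);
    Outcome::Applied
}
```

A redelivered event returns `Outcome::Duplicate` and leaves the view untouched, which is why it is still safe to ack it.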
3. Technical Constraints & Guarantees
- Eventual Consistency: Projections are inherently "behind" the Aggregate. There is a delay, usually sub-second, between an event being committed and the Projection reflecting that change.
- Idempotency: Since JetStream provides at-least-once delivery, the Projection must use its stored Checkpoint (stream sequence) to ignore events it has already processed.
- Disposable & Rebuildable: Because JetStream is a durable log, Projections are "disposable." If a business requirement changes, you can delete a Projection's KV entries, create a new `runtime-function` program, and replay the entire history from JetStream (starting from sequence 1) to build a new view from scratch.
- Read-Only: Projections never produce events or commands. They are strictly "sinks" for data.
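Because the log is the source of truth, changing the projection logic amounts to replaying the same events through a different function. A minimal sketch, assuming an in-memory slice in place of the JetStream stream and a simplified numeric view; `LoggedEvent` and `rebuild` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// One logged event: (stream sequence, view id, amount). Hypothetical shape.
type LoggedEvent = (u64, &'static str, i64);

/// Replays the whole log from the start and returns the rebuilt views
/// together with the checkpoint reached (0 if the log is empty).
fn rebuild<F>(log: &[LoggedEvent], project: F) -> (BTreeMap<String, i64>, u64)
where
    F: Fn(i64, i64) -> i64,
{
    let mut views: BTreeMap<String, i64> = BTreeMap::new();
    let mut checkpoint = 0;
    for &(sequence, view_id, amount) in log {
        // A view that does not exist yet starts from 0.
        let current = views.get(view_id).copied().unwrap_or(0);
        views.insert(view_id.to_string(), project(current, amount));
        checkpoint = sequence;
    }
    (views, checkpoint)
}
```

Calling `rebuild` once with a summing closure and once with a closure that keeps the maximum yields two different views from the same history, without touching the log.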
4. Replay & Recovery Model
- Catch-up Mode: When a new Projection is deployed (no checkpoint exists), it starts from the beginning of the JetStream stream (sequence 1) and consumes as fast as possible until it reaches the tail.
- Live Mode: Once caught up, it continues consuming in real time using the same durable consumer, relying on JetStream acks/redelivery for reliability.
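The choice of starting position and the distinction between the two modes reduce to two small decisions. A minimal sketch, assuming the checkpoint has already been read from storage and the stream's last sequence is known; `Mode`, `start_sequence`, and `mode` are hypothetical names, and how the sequence is handed to the JetStream consumer is outside this sketch.

```rust
#[derive(Debug, PartialEq)]
enum Mode {
    CatchUp,
    Live,
}

/// First stream sequence to consume: 1 for a new Projection,
/// otherwise the sequence right after the stored checkpoint.
fn start_sequence(checkpoint: Option<u64>) -> u64 {
    match checkpoint {
        Some(sequence) => sequence + 1,
        None => 1,
    }
}

/// The Projection is catching up while unread events remain in the stream.
fn mode(next_sequence: u64, stream_last_sequence: u64) -> Mode {
    if next_sequence <= stream_last_sequence {
        Mode::CatchUp
    } else {
        Mode::Live
    }
}
```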
5. Snapshots (Relationship to Aggregates)
The Projection does not require Aggregate snapshots to function, because its source of truth for changes is the JetStream event stream. However, snapshots are still relevant in two ways:
- Aggregate Snapshots (Write Side): Aggregates persist versioned snapshots in `edge-storage` `AggregateStore` to speed up Aggregate rehydration. These snapshots are not a read API for projections and should not be treated as a substitute for consuming events.
- Projection State (Read Side): A Projection’s stored View in `edge-storage` `KvStore` is effectively its own “snapshot” of the read model at a specific checkpoint (stream sequence).
- Fast Recovery: On restart, the Projection loads `checkpoint:{tenant_id}:{view_type}`, resumes JetStream consumption from the next sequence, and continues updating existing View records in place. No replay is required unless the checkpoint is missing or the view schema/logic has changed.
- Optional Seeding: For very large histories, a Projection may optionally seed an initial View state from a recent Aggregate snapshot or an external export, then set its checkpoint to a known JetStream stream sequence and continue consuming events forward from that point. This preserves incremental correctness while reducing rebuild time.
6. Hot Provisioning (Rolling Scale + Rolling Upgrades)
Projections are designed to be provisioned and updated without downtime.
- Hot Scale-Out: Multiple Projection replicas can run concurrently per `tenant_id` and `view_type`. JetStream consumer configuration is used to ensure each event is processed by exactly one replica within a replica set.
- Hot Restart: A restarted instance resumes from the persisted checkpoint and continues consumption; recovery time is proportional to the gap between the checkpoint and the stream tail.
- Hot Upgrade (Projection Logic): To change a `project` program safely:
  - Deploy a new Projection version under a new `view_type` (or `view_type` + version suffix) with its own checkpoint.
  - Backfill by consuming from sequence 1 (or from a chosen seed sequence) until caught up.
  - Switch query routing from the old view keys to the new view keys.
  - Retire old view data and checkpoint after the cutover.
- In-Place Migration: If the schema change is backward compatible, a Projection may evolve the stored View shape incrementally while processing events, but this requires strict versioning in the View payload.
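The Hot Upgrade steps can be sketched as a router that only switches to the new view keys once the new version has caught up. This is a minimal sketch under assumptions: the `_v{n}` suffix format, `QueryRouter`, and `cut_over` are hypothetical, and the checkpoint and stream tail are passed in rather than read from storage or JetStream.

```rust
/// Builds a versioned view_type such as "UserDashboard_v2".
/// The suffix format is an assumption made for this example.
fn versioned_view_type(base: &str, version: u32) -> String {
    format!("{}_v{}", base, version)
}

/// Decides which view keys queries are served from.
struct QueryRouter {
    active_view_type: String,
}

impl QueryRouter {
    /// Key prefix that queries for this tenant should scan.
    fn key_prefix(&self, tenant_id: &str) -> String {
        format!("view:{}:{}:", tenant_id, self.active_view_type)
    }

    /// Switches to `candidate` only if its checkpoint has reached the
    /// stream tail; returns whether the cutover happened.
    fn cut_over(&mut self, candidate: &str, candidate_checkpoint: u64, stream_last_sequence: u64) -> bool {
        if candidate_checkpoint < stream_last_sequence {
            return false;
        }
        self.active_view_type = candidate.to_string();
        true
    }
}
```

Old view data and its checkpoint are retired only after `cut_over` has returned `true`, so queries never point at a half-built view.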
7. Caveats & Operational Notes
- Ordering Guarantees: JetStream preserves ordering per stream, but if the Projection processes messages concurrently it can violate per-entity ordering. If ordering matters for a `view_id`, enforce per-key serialization in the Projection.
- At-Least-Once Reality: Redeliveries can happen (network splits, ack timeouts, restarts). The Projection must be idempotent via checkpoint checks and/or per-event dedupe keyed by stream sequence.
- Ack Discipline: Never ack before the MDBX transaction commits. Treat “view update + checkpoint update + ack” as one logical commit.
- Poison Messages: A single malformed event or incompatible schema can stall a durable consumer. Define a policy for retries, quarantine, and alerting (including whether to skip and record the failure).
- Schema Evolution: Projection logic must be able to handle old event versions or explicitly version the stream/subjects. Projection View schemas also need versioning if you support in-place migrations.
- Backpressure & Lag: Catch-up replays can saturate storage and CPU. Monitor consumer lag, redeliveries, and processing latency; apply limits (max in-flight, batching) to protect the node.
- Rebuild Semantics: Rebuilds must delete both View keys and checkpoints for the target `tenant_id`/`view_type`. Partial deletes can create “mixed era” views.
- Cross-View Lookups: Using `KvStore::query()` to join across projections is convenient but can amplify read load and introduce consistency anomalies between views. Prefer event-local computation when possible.
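Per-key serialization, mentioned under Ordering Guarantees, is commonly achieved by routing every event for a given `view_id` to the same worker, so that events for one key are never processed concurrently. A minimal sketch of that routing, assuming a fixed pool of workers; `worker_for` is a hypothetical helper and the worker loop itself is omitted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps a view_id to a worker index in 0..worker_count.
/// The same view_id always maps to the same worker within one process,
/// which preserves per-key ordering while other keys run in parallel.
fn worker_for(view_id: &str, worker_count: usize) -> usize {
    assert!(worker_count > 0, "worker_count must be positive");
    let mut hasher = DefaultHasher::new();
    view_id.hash(&mut hasher);
    (hasher.finish() % worker_count as u64) as usize
}
```

Changing `worker_count` remaps keys, so a resize should wait until in-flight events have drained.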
8. Data Structure (The View Envelope)
- `view_id`: The unique key for the record (e.g., `user_id`). Used in the KvStore key `view:{tenant_id}:{view_type}:{view_id}`.
- `view_type`: The name of the projection (e.g., `active_users_list`).
- `last_event_sequence`: The checkpoint (JetStream stream sequence) of the last event processed. Stored separately in `checkpoint:{tenant_id}:{view_type}`.
- `data`: The actual payload (JSON) optimized for the UI or API, stored as the KvStore value.
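The envelope fields map naturally onto a struct that yields two storage entries, one for the View and one for the checkpoint. A minimal sketch: the struct name, the method names, and the decimal-string encoding of the checkpoint are assumptions, and `data` is kept as an opaque JSON string to avoid depending on a serialization crate.

```rust
#[derive(Debug, Clone, PartialEq)]
struct ViewEnvelope {
    view_id: String,
    view_type: String,
    last_event_sequence: u64,
    /// JSON payload, stored verbatim as the KvStore value.
    data: String,
}

impl ViewEnvelope {
    /// (key, value) pair for the View record itself.
    fn view_entry(&self, tenant_id: &str) -> (String, String) {
        (
            format!("view:{}:{}:{}", tenant_id, self.view_type, self.view_id),
            self.data.clone(),
        )
    }

    /// (key, value) pair for the separately stored checkpoint.
    /// Encoding the sequence as a decimal string is an assumption.
    fn checkpoint_entry(&self, tenant_id: &str) -> (String, String) {
        (
            format!("checkpoint:{}:{}", tenant_id, self.view_type),
            self.last_event_sequence.to_string(),
        )
    }
}
```

Both entries are intended to be written in the same transaction, in line with the Atomic Update step.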
💡 Key Distinction for your PRD:
In your framework, the Projection is where the "Distributed" part of the system becomes visible to the user.
- Aggregates are for Consistency (The Truth).
- Projections are for Performance (The Speed).