cloudlysis/scripts/billing/README_ROLLBACK.md
Vlad Durnea 2595e7f1c5
feat(billing): implement tenant subscription entitlements system (milestones 0-6)
2026-03-30 18:41:23 +03:00


# Rollback Plan: Billing Enforced Gating
This document outlines the emergency procedure for disabling subscription-based entitlement gating in the Cloudlysis Control Plane.
## Symptoms
- Tenants receiving `402 Payment Required` errors even with valid active subscriptions.
- `JobEngine` refusing valid configuration updates due to incorrect resource limit enforcement.
- S3 Document Storage being inaccessible for authorized Pro/Enterprise tenants.
## Emergency Rollback Steps
### 1. Disable Global Enforcement
The quickest way to restore service is to set the enforcement environment variable to `false`. This preserves all billing data and synchronization logic but bypasses the `402 Payment Required` blocks.
```yaml
# In your Swarm stack file (e.g. control-plane.yml):
services:
  control-api:
    environment:
      - CONTROL_BILLING_ENFORCEMENT_ENABLED=false
```
### 2. Deploy the Update
Deploy the stack to apply the change:
```bash
docker stack deploy -c control-plane.yml control
```
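Before moving on, it can help to confirm that the running service spec actually carries the new value. The sketch below assumes the stack name `control` and service key `control-api` from the examples above, which Swarm would combine into the service name `control_control-api`. The helper function names are illustrative, not part of the project.

```bash
#!/usr/bin/env bash
# Assumed service name: <stack>_<service> as Swarm derives it from the examples above.
SERVICE="${SERVICE:-control_control-api}"

# Print the enforcement flag as recorded in the service spec (empty if unset).
enforcement_flag() {
  docker service inspect "$SERVICE" \
    --format '{{range .Spec.TaskTemplate.ContainerSpec.Env}}{{println .}}{{end}}' \
    | grep '^CONTROL_BILLING_ENFORCEMENT_ENABLED=' || true
}

# Succeed only if the flag is explicitly set to false.
enforcement_disabled() {
  [ "$(enforcement_flag)" = "CONTROL_BILLING_ENFORCEMENT_ENABLED=false" ]
}
```

After sourcing the snippet, `enforcement_disabled && echo "flag applied"` reports whether the spec was updated, and `docker service ps "$SERVICE"` shows whether the replacement tasks have reached the `Running` state.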
### 3. Verify System State
Confirm that tenants can now perform previously blocked operations (e.g., uploading documents or updating deployment configurations).
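One way to script this check is to probe a previously blocked endpoint and classify the HTTP status. This is only a sketch: `PROBE_URL`, `TOKEN`, and the bearer-token header are hypothetical placeholders, since this document does not specify the API routes or authentication scheme.

```bash
#!/usr/bin/env bash
# Map an HTTP status code to a verdict for the rollback check.
classify_status() {
  case "$1" in
    402) echo "still-gated" ;;    # enforcement is still blocking the request
    2??) echo "ok" ;;             # request went through
    *)   echo "other-failure" ;;  # unrelated error; investigate separately
  esac
}

# Probe one endpoint. PROBE_URL and TOKEN are hypothetical and must be supplied.
probe() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $TOKEN" "$PROBE_URL")"
  classify_status "$code"
}
```

A result of `still-gated` after the redeploy suggests the service has not picked up the new flag yet.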
## Forensic Analysis
Once the system is stable, perform the following:
1. **Check Reconciliation Logs**: Look for `failed to fetch subscription` or `failed to apply reconciled billing event`.
2. **Verify Metrics**: Check `billing_webhook_requests_total{status="error"}` in Prometheus.
3. **Audit Drift**: Compare the `CONTROL_BILLING_STATE_PATH` file content against the Stripe Dashboard for the affected `tenant_id`.
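
The first two checks can be sketched in shell as follows. The service name `control_control-api` is inferred from the stack examples above, and `PROM_URL` is an assumed Prometheus address. The drift audit is left manual because the state file format is not described here.

```bash
#!/usr/bin/env bash
SERVICE="${SERVICE:-control_control-api}"       # assumed Swarm service name
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed Prometheus address

# Keep only the reconciliation failure lines from log text on stdin.
reconcile_errors() {
  grep -E 'failed to fetch subscription|failed to apply reconciled billing event' || true
}

# Ask the Prometheus HTTP API for the current webhook error counters (JSON).
webhook_errors() {
  curl -sG "$PROM_URL/api/v1/query" \
    --data-urlencode 'query=billing_webhook_requests_total{status="error"}'
}
```

For example, `docker service logs --since 1h "$SERVICE" 2>&1 | reconcile_errors` lists recent reconciliation failures, and `webhook_errors` returns the matching counter series.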
## Recovery
To re-enable gating (after the root cause is resolved):
1. Set `CONTROL_BILLING_ENFORCEMENT_ENABLED=true`.
2. Redeploy the stack.
3. Monitor the reconciliation logs and `billing_webhook_requests_total{status="error"}` for 30 minutes.
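
The monitoring window can be scripted with a small polling loop. This is a generic sketch, not project tooling: `watch_for` is a hypothetical helper that reruns a check command at a fixed interval and stops at the first failure.

```bash
#!/usr/bin/env bash
# Run a check command every INTERVAL seconds for DURATION seconds.
# Usage: watch_for DURATION INTERVAL COMMAND [ARGS...]
# Returns non-zero as soon as the check fails.
watch_for() {
  local duration="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + duration ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    "$@" || { echo "check failed: $*" >&2; return 1; }
    sleep "$interval"
  done
}
```

For a 30-minute window with one check per minute, call `watch_for 1800 60 your_check`, where `your_check` is any command that exits non-zero when tenants are being blocked again.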