madbase/docs/NODE_TEMPLATES.md
Vlad Durnea cffdf8af86
wip:milestone 0 fixes
2026-03-15 12:35:42 +02:00


Node Templates - Quick Reference

Complete guide to MadBase node templates for Hetzner Cloud deployment.

Template Overview

| Template | Pillar | Zone | Min Plan | Cost/Mo | Use Case | Services |
|----------|--------|------|----------|---------|----------|----------|
| system-node | System | Public | CX21 | €6.94 | Cluster Root | |
| proxy-api-node | Proxy / API | Public | CX11 | €3.69 | Scalable Ingress | |
| worker-node | Worker | Private | CX11 | €3.69 | Horizontal scaling | |
| db-node | DB / State | Private | CX21 | €6.94 | Production database HA | |
| worker-db-combo | Mixed | | CX31 | €14.21 | Smaller deployments | Worker + PostgreSQL + etcd + HAProxy |
| worker-monitor-combo | Mixed | | CX21 | €6.94 | Cost-optimized | Worker + VictoriaMetrics + Loki |
| all-in-one | Unified | | CX41 | €25.60 | Development/MVP | All services on one node |

Templates with a Mixed or Unified pillar are composite templates (they mix multiple service types).


Pure Templates (Single Service Type)

1. Database Node (db-node.yaml)

Best for: Production deployments requiring database HA

Server: CX21 (4GB RAM, 2 vCPU)

Services:

  • PostgreSQL 15 with Patroni (auto-failover)
  • etcd (distributed consensus)
  • HAProxy (connection pooling + read/write splitting)

Scaling: 3-7 nodes (odd number for quorum)
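
The odd-number recommendation follows from majority-quorum arithmetic, which etcd (Raft) relies on. The sketch below is illustrative only; the function names are not part of MadBase. It shows why a fourth node adds cost without adding fault tolerance.

```python
def quorum_size(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` voting members."""
    if nodes < 1:
        raise ValueError("cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

A 3-node and a 4-node cluster both tolerate a single failure, so the even-sized cluster pays for an extra server with no gain in availability.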

When to use:

  • Production traffic >1000 req/min
  • Need database auto-failover
  • Want separate database cluster

2. Worker Node (worker-node.yaml)

Best for: Horizontal scaling of API workers

Server: CX11 (2GB RAM, 1 vCPU)

Services:

  • MadBase Worker (API processing)
  • vmagent (metrics collection)

Scaling: 1-20 nodes

Auto-scaling rules:

  • Scale up: CPU > 70%
  • Scale down: CPU < 20%
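
The rules above can be read as a simple threshold controller. The following is a minimal sketch, assuming the CPU figure is an average across worker nodes and that scaling moves one node at a time; neither detail is stated in the template, and `desired_worker_count` is a hypothetical helper.

```python
SCALE_UP_CPU = 70.0     # percent, from the auto-scaling rules
SCALE_DOWN_CPU = 20.0   # percent, from the auto-scaling rules
MIN_NODES, MAX_NODES = 1, 20  # from "Scaling: 1-20 nodes"


def desired_worker_count(current: int, avg_cpu_percent: float) -> int:
    """Return the worker count after applying one scaling step."""
    if avg_cpu_percent > SCALE_UP_CPU:
        target = current + 1   # assumption: add a single node per step
    elif avg_cpu_percent < SCALE_DOWN_CPU:
        target = current - 1   # assumption: remove a single node per step
    else:
        target = current       # inside the 20-70% band: no change
    # Never leave the 1-20 node range.
    return max(MIN_NODES, min(MAX_NODES, target))
```

The wide gap between the two thresholds acts as hysteresis: a load that hovers around one threshold does not cause the cluster to flap between sizes.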

When to use:

  • Need to scale workers independently
  • Separate database cluster already exists
  • Production deployments

3. Control Plane Node (control-plane-node.yaml)

Best for: Management UI and APIs

Server: CX11 (2GB RAM, 1 vCPU)

Services:

  • Gateway Proxy (port 8080)
  • Control Plane API (port 8001)
  • Grafana (port 3030)
  • Keepalived (HA with floating IP)

Scaling: 1-2 nodes (HA mode)

When to use:

  • Need web UI for server management
  • Want to provision servers via API
  • Production deployments

4. Monitoring Node (monitoring-node.yaml)

Best for: Centralized metrics and logging

Server: CX11 (2GB RAM, 1 vCPU)

Services:

  • VictoriaMetrics (metrics database)
  • Loki (log aggregation)
  • Alertmanager (optional)

Scaling: 1-2 nodes (can be HA)

When to use:

  • Production deployments
  • Want centralized monitoring
  • Need log aggregation

Composite Templates (Mix Multiple Service Types)

5. Worker + Database Combo (worker-db-combo.yaml)

Best for: 2-3 server deployments with database and worker on same node

Server: CX31 (8GB RAM, 2 vCPU)

Services:

  • PostgreSQL 15 with Patroni
  • etcd
  • HAProxy
  • MadBase Worker
  • vmagent

Why use this:

  • Cost savings (€6.94 vs €10.63 for separate nodes)
  • Simpler architecture for smaller deployments
  • Easy to scale later

Scaling: 1-2 nodes

Upgrade path: When CPU > 60% or RAM > 70%, migrate to dedicated db-node + worker-node

Deployment example:

Server 1 (worker-db-combo): PostgreSQL + Worker
Server 2 (control-plane): Proxy + Control + Grafana
Server 3 (monitoring): VictoriaMetrics + Loki

6. Worker + Monitoring Combo (worker-monitor-combo.yaml)

Best for: Cost-optimized deployments with monitoring on worker node

Server: CX21 (4GB RAM, 2 vCPU)

Services:

  • MadBase Worker
  • VictoriaMetrics
  • Loki
  • vmagent
  • Promtail

Why use this:

  • Save €3.69/mo (no dedicated monitoring node)
  • Monitoring co-located with worker
  • Good for 2-3 server deployments

Scaling: 1-3 nodes

When to upgrade:

  • Worker CPU > 60% (monitoring competes for resources)
  • Need to scale workers horizontally

Deployment example:

Server 1 (worker-monitor-combo): Worker + VictoriaMetrics + Loki
Server 2 (db-node): PostgreSQL + etcd + HAProxy
Server 3 (control-plane): Proxy + Control + Grafana

7. All-in-One (all-in-one.yaml)

Best for: Development, testing, or MVP deployments

Server: CX41 (16GB RAM, 4 vCPU)

Services: ALL (PostgreSQL, etcd, HAProxy, Redis, MinIO, Workers, Proxy, Control, VictoriaMetrics, Loki, Grafana)

Why use this:

  • Simplest deployment
  • Single server for everything
  • Great for development/testing

When to upgrade:

  • Production traffic > 100 req/min
  • CPU usage > 70% sustained
  • Need HA for database

Monitoring Stack: VictoriaMetrics + Loki

How It Works

┌──────────────┐         ┌──────────────┐         ┌──────────────┐
│   Node 1     │         │   Node 2     │         │   Node 3     │
│              │         │              │         │              │
│ ┌──────────┐ │         │ ┌──────────┐ │         │ ┌──────────┐ │
│ │ vmagent  │─┼─────────┼─│ vmagent  │─┼─────────┼─│ vmagent  │─┼──┐
│ └──────────┘ │         │ └──────────┘ │         │ └──────────┘ │  │
│   Scans:     │         │   Scans:     │         │   Scans:     │  │
│   - worker   │         │   - worker   │         │   - db       │  │
│   - system   │         │   - system   │         │   - system   │  │
└──────────────┘         └──────────────┘         └──────────────┘  │
                                                                       │
                                                                       ▼
                                                    ┌───────────────────────┐
                                                    │  VictoriaMetrics      │
                                                    │  Port: 8428           │
                                                    │  Type: Metrics DB     │
                                                    └───────────┬───────────┘
                                                                │
                                                                ▼
                                                    ┌───────────────────────┐
                                                    │      Grafana          │
                                                    │  Port: 3030           │
                                                    │  Queries VM + Loki    │
                                                    └───────────────────────┘

┌──────────────┐         ┌──────────────┐
│   Node 1     │         │   Node 2     │
│              │         │              │
│ ┌──────────┐ │         │ ┌──────────┐ │
│ │ Promtail │─┼─────────┼─│ Promtail │─┼───┐
│ └──────────┘ │         │ └──────────┘ │   │
│   Reads:     │         │   Reads:     │   │
│   - logs/*   │         │   - logs/*   │   │
└──────────────┘         └──────────────┘   │
                                             │
                                             ▼
                                  ┌───────────────────────┐
                                  │         Loki          │
                                  │  Port: 3100           │
                                  │  Type: Log Aggregation│
                                  └───────────┬───────────┘
                                              │
                                              ▼
                                  ┌───────────────────────┐
                                  │      Grafana          │
                                  │  LogQL Queries        │
                                  └───────────────────────┘

Components

VictoriaMetrics (Metrics Database)

Purpose: Store and query time-series metrics

Location:

  • Dedicated monitoring-node (recommended)
  • worker-monitor-combo (cost-optimized)
  • all-in-one (development)

Data Flow:

  1. vmagent on each node scrapes metrics every 15s
  2. Metrics sent to VictoriaMetrics via remote write
  3. VictoriaMetrics stores metrics with 10x compression
  4. Grafana queries VictoriaMetrics for dashboards

Metrics Collected:

  • Worker: Request rate, error rate, latency, queue depth
  • PostgreSQL: Connections, transactions, replication lag
  • System: CPU, memory, disk, network
  • HAProxy: Connection count, response time

Storage Requirements:

  • ~1GB per million time series per day (compressed)
  • Default retention: 30 days
  • RAM: Minimal, scales with active queries
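
For capacity planning, the storage figures above reduce to a single multiplication. This is a rough sketch that takes the "~1GB per million time series per day" rule of thumb at face value; real usage depends on scrape interval and label churn, and `estimate_metrics_storage_gb` is a hypothetical helper.

```python
GB_PER_MILLION_SERIES_PER_DAY = 1.0  # rule of thumb quoted above
DEFAULT_RETENTION_DAYS = 30          # default retention quoted above


def estimate_metrics_storage_gb(active_series: int,
                                retention_days: int = DEFAULT_RETENTION_DAYS) -> float:
    """Approximate compressed disk usage for the given series count."""
    if active_series < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    millions = active_series / 1_000_000
    return millions * GB_PER_MILLION_SERIES_PER_DAY * retention_days


if __name__ == "__main__":
    # Example: 50,000 active series kept for the default 30 days.
    print(f"{estimate_metrics_storage_gb(50_000):.2f} GB")
```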

Loki (Log Aggregation)

Purpose: Store and query logs

Location:

  • Dedicated monitoring-node (recommended)
  • worker-monitor-combo (cost-optimized)
  • all-in-one (development)

Data Flow:

  1. Promtail on each node tails log files
  2. Logs sent to Loki via HTTP API
  3. Loki indexes logs by labels (service, level, host)
  4. Grafana queries Loki using LogQL

Logs Collected:

  • Worker: /var/log/madbase/worker.log
  • PostgreSQL: /var/log/postgresql/*.log
  • System: /var/log/syslog

Storage Requirements:

  • ~10% of raw log size (with compression)
  • Default retention: 30 days
  • RAM: Minimal, scales with active queries

vmagent (Metrics Collector)

Purpose: Scrape metrics and send to VictoriaMetrics

Location: Runs on EVERY node

Port: 8429 (local debug endpoint)

Configuration: config/vmagent.yml

Scrape Targets:

  • Worker: localhost:8002/metrics
  • Patroni: localhost:8008/metrics
  • Node Exporter: localhost:9100/metrics
  • HAProxy: localhost:7000/metrics
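
vmagent reads Prometheus-compatible scrape configuration, so the targets above map onto `scrape_configs` entries. The sketch below builds such a structure in Python and prints it as JSON (which is also valid YAML). The job names are illustrative assumptions, and the shipped config/vmagent.yml may be organized differently.

```python
import json

# Targets listed in this document; job names are hypothetical.
SCRAPE_TARGETS = {
    "worker": "localhost:8002",
    "patroni": "localhost:8008",
    "node_exporter": "localhost:9100",
    "haproxy": "localhost:7000",
}


def build_scrape_config(targets: dict, interval: str = "15s") -> dict:
    """Build a Prometheus-style scrape config with one job per target."""
    return {
        "global": {"scrape_interval": interval},
        "scrape_configs": [
            {
                "job_name": job,
                "metrics_path": "/metrics",
                "static_configs": [{"targets": [address]}],
            }
            for job, address in targets.items()
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_scrape_config(SCRAPE_TARGETS), indent=2))
```

On a node that does not run a given service (for example, no Patroni on a pure worker node), the corresponding entry would simply be left out of the targets map.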

Resource Usage:

  • CPU: <5% of 1 core
  • Memory: ~50MB

Promtail (Log Collector)

Purpose: Tail log files and send to Loki

Location: Runs on EVERY node

Configuration: config/promtail.yml

Log Sources:

  • /var/log/madbase/worker.log (worker logs)
  • /var/log/postgresql/*.log (database logs)
  • /var/log/syslog (system logs)

Resource Usage:

  • CPU: <2% of 1 core
  • Memory: ~30MB

Grafana Integration

Grafana connects to both VictoriaMetrics and Loki:

Example Dashboard Query:

```yaml
# Panel 1: Request Rate (Metrics)
Query: rate(http_requests_total[5m])

# Panel 2: Error Rate (Metrics)
Query: rate(http_requests_total{status=~"5.."}[5m])

# Panel 3: Recent Errors (Logs)
Query: {level="error"} | line_format "{{.message}}"

# Panel 4: Trace Request by ID (Logs)
Query: {trace_id="abc123"} |= "timeout"
```

### Deployment Scenarios

#### Scenario 1: Dedicated Monitoring Node (Production)

```yaml
servers:
  - name: server1
    template: control-plane-node
    plan: CX11
  - name: server2
    template: db-node
    plan: CX21
  - name: server3
    template: worker-node
    plan: CX11
  - name: server4
    template: monitoring-node  # Dedicated monitoring
    plan: CX11
```

**Cost**: €18.01/mo (4 servers)
**Best for**: Production with >1000 req/min

#### Scenario 2: Worker + Monitoring Combo (Cost-Optimized)

```yaml
servers:
  - name: server1
    template: control-plane-node
    plan: CX11
  - name: server2
    template: db-node
    plan: CX21
  - name: server3
    template: worker-monitor-combo  # Combined
    plan: CX21
```

**Cost**: €17.57/mo (3 servers)
**Best for**: Cost-optimized production with <1000 req/min

#### Scenario 3: All-in-One (Development)

```yaml
servers:
  - name: dev-server
    template: all-in-one
    plan: CX41
```

**Cost**: €25.60/mo (1 server)
**Best for**: Development, testing, MVP


## Deployment Examples

### Example 1: Small Production (3 servers)

```yaml
Server 1 (CX21 - €6.94):
  Template: worker-db-combo
  Services: PostgreSQL + Worker

Server 2 (CX11 - €3.69):
  Template: control-plane-node
  Services: Proxy + Control + Grafana

Server 3 (CX11 - €3.69):
  Template: worker-monitor-combo
  Services: Worker + VictoriaMetrics + Loki

Total: €14.32/mo
```


### Example 2: Medium Production (4 servers)

```yaml
Server 1 (CX21 - €6.94):
  Template: db-node
  Services: PostgreSQL + etcd + HAProxy

Server 2 (CX11 - €3.69):
  Template: worker-node
  Services: Worker + vmagent

Server 3 (CX11 - €3.69):
  Template: control-plane-node
  Services: Proxy + Control + Grafana

Server 4 (CX11 - €3.69):
  Template: monitoring-node
  Services: VictoriaMetrics + Loki

Total: €18.01/mo
```

### Example 3: Large Production (6 servers)

```yaml
Server 1-3 (CX21 - €6.94 each):
  Template: db-node
  Services: PostgreSQL cluster (3 nodes)

Server 4-5 (CX11 - €3.69 each):
  Template: worker-node
  Services: Workers (2 nodes)

Server 6 (CX11 - €3.69):
  Template: control-plane-node
  Services: Proxy + Control + Grafana + VictoriaMetrics + Loki

Total: €31.89/mo
```
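
Monthly totals for any server mix can be derived from the per-plan prices in the Template Overview. The helper below is a small sketch for checking such sums; `monthly_cost` is a hypothetical name, and the prices are copied from this document rather than fetched from Hetzner.

```python
# Per-plan prices in EUR/month, as listed in the Template Overview.
PLAN_PRICE_EUR = {"CX11": 3.69, "CX21": 6.94, "CX31": 14.21, "CX41": 25.60}


def monthly_cost(plans: list) -> float:
    """Sum plan prices in whole cents to avoid float rounding drift."""
    cents = sum(round(PLAN_PRICE_EUR[plan] * 100) for plan in plans)
    return cents / 100


if __name__ == "__main__":
    # Server mix from Example 1: one CX21 and two CX11.
    print(f"€{monthly_cost(['CX21', 'CX11', 'CX11']):.2f}/mo")
```

An unknown plan name raises `KeyError`, which is usually preferable to silently pricing it at zero.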


---

## Template Selection Guide

**Start with these questions**:

1. **What's your budget?**
   - €15/mo → Use composite templates
   - €25/mo → Use pure templates

2. **What's your traffic?**
   - <100 req/min → all-in-one
   - <1000 req/min → worker-db-combo
   - >1000 req/min → pure templates

3. **Do you need database HA?**
   - Yes → db-node (3 nodes minimum)
   - No → worker-db-combo

4. **Do you need centralized monitoring?**
   - Yes → monitoring-node or worker-monitor-combo
   - No → Skip (use worker vmagent only)
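
The questions above can be folded into a small decision helper. This is only one possible reading of the guide: how the answers combine (for instance, that database HA forces pure templates) and the treatment of exactly 1000 req/min are assumptions, and `suggest_templates` is not part of the Control Plane API.

```python
from typing import List


def suggest_templates(req_per_min: int, need_db_ha: bool,
                      central_monitoring: bool) -> List[str]:
    """Return a list of template names, one entry per server."""
    # Assumption: HA or high traffic implies the pure-template layout.
    if need_db_ha or req_per_min >= 1000:
        db_nodes = 3 if need_db_ha else 1  # HA needs 3 db-nodes minimum
        templates = ["db-node"] * db_nodes + ["worker-node"]
        if central_monitoring:
            templates.append("monitoring-node")
        return templates
    if req_per_min < 100:
        # all-in-one already bundles the monitoring stack.
        return ["all-in-one"]
    templates = ["worker-db-combo"]
    if central_monitoring:
        templates.append("worker-monitor-combo")
    return templates
```

For example, `suggest_templates(500, False, True)` yields a two-server layout built from the composite templates, while any request for database HA yields at least three `db-node` servers.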

---

## Control Plane API Integration

Templates are used by the Control Plane API to provision servers:

```http
POST /api/v1/servers
Content-Type: application/json

{
  "name": "worker-1",
  "template": "worker-node",
  "hetzner_plan": "CX11",
  "region": "fsn1",
  "features": ["worker", "monitoring"],
  "environment": "production"
}
```

Response:

```json
{
  "server_id": "abc123",
  "status": "provisioning",
  "ip_address": "167.235.123.45",
  "services": [
    {"name": "worker", "port": 8002},
    {"name": "vmagent", "port": 8429}
  ]
}
```
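
From a script, the provisioning call could be prepared as follows. This sketch builds the request with the standard library and does not send it. The base URL is a placeholder (port 8001 is the Control Plane API port listed earlier), and any authentication the API may require is not shown because it is not described here.

```python
import json
from urllib.request import Request


def build_provision_request(base_url: str, name: str, template: str,
                            plan: str, region: str, features: list,
                            environment: str) -> Request:
    """Prepare a POST /api/v1/servers request with a JSON body."""
    payload = {
        "name": name,
        "template": template,
        "hetzner_plan": plan,
        "region": region,
        "features": features,
        "environment": environment,
    }
    return Request(
        url=f"{base_url}/api/v1/servers",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_provision_request(
        "http://control-plane.example:8001",  # hypothetical host
        "worker-1", "worker-node", "CX11", "fsn1",
        ["worker", "monitoring"], "production",
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req), then json.load() the response.
```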


---

## Resource Profiles

Each service can be tuned with resource profiles:

```yaml
minimal:
  cpu_limit: "0.5"
  memory_limit: "512Mi"

balanced:
  cpu_limit: "2"
  memory_limit: "2Gi"

cpu_intensive:
  cpu_limit: "4"
  memory_limit: "4Gi"
```

Default profiles are assigned in templates but can be overridden:

```http
POST /api/v1/servers

{
  "template": "worker-node",
  "overrides": {
    "worker": {
      "resource_profile": "cpu_intensive"
    }
  }
}
```
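
How an override might resolve to concrete limits can be sketched as a lookup followed by a merge. The profile values below come from this document; the per-template default assignments in `TEMPLATE_DEFAULTS` are invented for illustration, since the actual defaults live in the template files.

```python
PROFILES = {
    "minimal": {"cpu_limit": "0.5", "memory_limit": "512Mi"},
    "balanced": {"cpu_limit": "2", "memory_limit": "2Gi"},
    "cpu_intensive": {"cpu_limit": "4", "memory_limit": "4Gi"},
}

# Hypothetical defaults; the real assignments are defined per template.
TEMPLATE_DEFAULTS = {
    "worker-node": {"worker": "balanced", "vmagent": "minimal"},
}


def resolve_limits(template: str, overrides: dict = None) -> dict:
    """Apply per-service profile overrides on top of template defaults."""
    assignments = dict(TEMPLATE_DEFAULTS[template])
    for service, settings in (overrides or {}).items():
        if service not in assignments:
            raise KeyError(f"{template} has no service named {service!r}")
        profile = settings.get("resource_profile")
        if profile is not None:
            assignments[service] = profile
    # Unknown profile names raise KeyError here.
    return {svc: dict(PROFILES[name]) for svc, name in assignments.items()}
```

With the override from the request above, `resolve_limits("worker-node", {"worker": {"resource_profile": "cpu_intensive"}})` gives the worker 4 CPUs and 4Gi while vmagent keeps its default profile.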

Next Steps

  1. Choose template based on budget and traffic
  2. Provision servers via Control Plane API or Hetzner CLI
  3. Configure monitoring (vmagent + promtail)
  4. Verify health with Grafana dashboards
  5. Scale up/down as needed

For more details, see:

  • STORAGE_CONFIGURATION.md - Storage backend setup
  • QUICKSTART_HETZNER_STORAGE.md - Hetzner Bucket Storage guide
  • 4SERVER_DEPLOYMENT_GUIDE.md - Multi-server deployment