Deployment

Production guide for running Tenzro Network nodes, covering Docker and Kubernetes deployment, monitoring, backup and recovery, and security best practices.

Note: This deployment guide describes the production architecture for Tenzro Network nodes.

System Requirements

Validator Node

  • CPU: 8+ cores (16+ recommended)
  • RAM: 32 GB (64 GB recommended)
  • Storage: 1 TB NVMe SSD (fast I/O critical)
  • Network: 1 Gbps symmetrical, low latency
  • TEE: Intel TDX / AMD SEV-SNP (optional, 2x leader weight)

Model Provider Node

  • CPU: 32+ cores
  • RAM: 128 GB+
  • GPU: NVIDIA H100 / A100 (for large models)
  • Storage: 2 TB+ NVMe SSD (model storage)
  • Network: 10 Gbps

Light Client

  • CPU: 2+ cores
  • RAM: 4 GB
  • Storage: 50 GB SSD
  • Network: 100 Mbps
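Before provisioning, you can sanity-check a host against the validator baseline. A minimal sketch for Linux hosts; the thresholds mirror the validator row above, so adjust them for the role you are deploying:

```shell
# Compare this host against the validator baseline (8+ cores, 32 GB RAM,
# 1 TB of fast storage). Warnings only; adjust thresholds per node role.
cores=$(nproc)
ram_gb=$(awk '/MemTotal/ {printf "%d", $2 / 1048576}' /proc/meminfo)
disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
echo "cores=$cores ram=${ram_gb}GB free_disk=${disk_gb}GB"
[ "$cores" -ge 8 ]      || echo "WARN: fewer than 8 CPU cores"
[ "$ram_gb" -ge 32 ]    || echo "WARN: less than 32 GB RAM"
[ "$disk_gb" -ge 1000 ] || echo "WARN: less than 1 TB free on /"
```

The check reads `/proc/meminfo` and GNU `df`, so it assumes a Linux host with coreutils.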

Docker Deployment

Dockerfile

FROM rust:1.75 AS builder

WORKDIR /build
COPY . .

RUN cargo build --release --bin tenzro-node

FROM debian:bookworm-slim

RUN apt-get update && apt-get install -y \
    ca-certificates \
    libssl3 \
    librocksdb-dev \
    && rm -rf /var/lib/apt/lists/*

COPY --from=builder /build/target/release/tenzro-node /usr/local/bin/

EXPOSE 9000 8545 8080

ENTRYPOINT ["tenzro-node"]

docker-compose.yml

version: '3.8'

services:
  tenzro-validator:
    build: .
    container_name: tenzro-validator
    restart: unless-stopped
    ports:
      - "9000:9000"   # P2P
      - "8545:8545"   # JSON-RPC
      - "8080:8080"   # Web Verify API
    volumes:
      - ./data:/data
    environment:
      - RUST_LOG=info
    command:
      - --role=validator
      - --listen-addr=/ip4/0.0.0.0/tcp/9000
      - --rpc-addr=0.0.0.0:8545
      - --data-dir=/data
      - --boot-nodes=/ip4/your-bootstrap-ip/tcp/9000/p2p/12D3KooW...

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana

volumes:
  prometheus-data:
  grafana-data:
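Compose can also restart the node automatically when the Web Verify API stops responding. A sketch of a `healthcheck` block for the `tenzro-validator` service, assuming the `/health` endpoint described under Health Checks below and that `curl` is installed in the image (the slim Debian base above does not include it by default):

```yaml
# Add under services.tenzro-validator in docker-compose.yml.
healthcheck:
  test: ["CMD", "curl", "-fsS", "http://localhost:8080/health"]
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 60s
```

`start_period` gives the node time to open its ports before failed probes count against it.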

Kubernetes Deployment

validator-deployment.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tenzro-validator
spec:
  serviceName: tenzro-validator
  replicas: 1
  selector:
    matchLabels:
      app: tenzro-validator
  template:
    metadata:
      labels:
        app: tenzro-validator
    spec:
      containers:
      - name: tenzro-node
        image: tenzro/tenzro-node:latest
        ports:
        - containerPort: 9000
          name: p2p
        - containerPort: 8545
          name: rpc
        - containerPort: 8080
          name: verify-api
        volumeMounts:
        - name: data
          mountPath: /data
        resources:
          requests:
            memory: "32Gi"
            cpu: "8"
          limits:
            memory: "64Gi"
            cpu: "16"
        env:
        - name: RUST_LOG
          value: "info"
        args:
        - --role=validator
        - --listen-addr=/ip4/0.0.0.0/tcp/9000
        - --rpc-addr=0.0.0.0:8545
        - --data-dir=/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Ti
      storageClassName: fast-ssd
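The same health endpoint can drive Kubernetes probes. A sketch to add under the `tenzro-node` container spec, assuming the `/health` endpoint on port 8080 described under Health Checks below:

```yaml
# Add under the tenzro-node container in validator-deployment.yaml.
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
```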

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: tenzro-validator
spec:
  # LoadBalancer exposes all three ports; in production, restrict RPC and
  # the Verify API per the security notes below (e.g. a separate internal
  # Service or an ingress with IP allow-lists).
  type: LoadBalancer
  selector:
    app: tenzro-validator
  ports:
  - name: p2p
    port: 9000
    targetPort: 9000
  - name: rpc
    port: 8545
    targetPort: 8545
  - name: verify-api
    port: 8080
    targetPort: 8080

Reverse Proxy (Caddy)

# Caddyfile
# Note: rate_limit is not built into Caddy; it requires the third-party
# caddy-ratelimit plugin. The 100 events/s figure is illustrative.
rpc.tenzro.example.com {
    rate_limit {
        zone rpc_zone {
            key    {remote_host}
            events 100
            window 1s
        }
    }
    reverse_proxy localhost:8545
}

verify.tenzro.example.com {
    reverse_proxy localhost:8080
}

Monitoring

Prometheus Configuration

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'tenzro-node'
    static_configs:
      # localhost:9091 works when Prometheus runs on the node host itself.
      # Inside the docker-compose stack above, use the service name
      # instead, e.g. tenzro-validator:9091.
      - targets: ['localhost:9091']
    metrics_path: '/metrics'

Key Metrics

  • tenzro_block_height — Current block height
  • tenzro_peer_count — Connected peers
  • tenzro_mempool_size — Pending transactions
  • tenzro_consensus_round — Consensus round number
  • tenzro_vm_execution_time — VM execution latency
  • tenzro_storage_reads — RocksDB read operations
  • tenzro_storage_writes — RocksDB write operations
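These metrics map naturally onto Prometheus alert rules. An illustrative `alerts.yml` (the thresholds are examples, not recommendations); wire it in via `rule_files` in `prometheus.yml`:

```yaml
# alerts.yml — example rules built on the metrics above.
groups:
  - name: tenzro
    rules:
      - alert: TenzroNoPeers
        expr: tenzro_peer_count == 0
        for: 5m
        annotations:
          summary: "Node has no connected peers"
      - alert: TenzroChainStalled
        expr: increase(tenzro_block_height[10m]) == 0
        for: 10m
        annotations:
          summary: "Block height has not advanced for 10 minutes"
```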

Backup and Recovery

Create Snapshot

# Create snapshot
tenzro-cli node snapshot create \
  --output ./snapshot-$(date +%Y%m%d).tar.gz

# Upload to cloud storage
aws s3 cp ./snapshot-*.tar.gz s3://tenzro-backups/
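Scheduled snapshots accumulate quickly. A retention sketch in POSIX shell (the directory layout and the keep-7 policy are illustrative) that can run after the snapshot command above, e.g. from the same cron job:

```shell
# Prune old snapshots, keeping only the newest $2 archives in $1.
# Relies on the snapshot-*.tar.gz naming used above.
prune_snapshots() {
  dir=$1
  keep=$2
  ls -1t "$dir"/snapshot-*.tar.gz 2>/dev/null | tail -n +$((keep + 1)) |
  while read -r old; do
    rm -f -- "$old"
  done
}

# Example: keep the 7 most recent snapshots in ./snapshots
prune_snapshots ./snapshots 7
```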

Restore from Snapshot

# Download snapshot
aws s3 cp s3://tenzro-backups/snapshot-20260320.tar.gz ./

# Restore
tenzro-cli node snapshot restore \
  --input ./snapshot-20260320.tar.gz \
  --data-dir ~/.tenzro

Security Best Practices

  • Firewall: Only expose P2P port (9000) publicly. Restrict RPC (8545) to trusted IPs.
  • TLS: Use reverse proxy (Caddy/nginx) with TLS for RPC/API endpoints.
  • Key Management: Store validator keys in hardware security modules (HSM) or TEE.
  • Updates: Subscribe to security advisories, apply patches promptly.
  • Monitoring: Alert on unusual peer count, block height lag, high resource usage.
  • DDoS Protection: Use Cloudflare or similar for public endpoints.

Logging

# Set log level
export RUST_LOG=tenzro_node=info,tenzro_consensus=debug

# Log to file
tenzro-node ... 2>&1 | tee -a tenzro-node.log

# Structured JSON logs
export RUST_LOG_FORMAT=json
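`tee -a` grows the log file without bound, so pair file logging with rotation. An illustrative logrotate policy; the path is an assumption, match it to wherever you write the log:

```
# /etc/logrotate.d/tenzro-node
/var/log/tenzro/tenzro-node.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
    copytruncate
}
```

`copytruncate` matters here because `tee` keeps the file descriptor open across rotations.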

Health Checks

# Liveness check
curl http://localhost:8080/health

# Readiness check
curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'

# Expected response: peer count > 0
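The readiness check returns the peer count as a hex quantity (`"result":"0x..."`, per JSON-RPC convention). A small POSIX-shell helper to turn that response into a number you can script against; the `sed` parsing assumes the compact JSON shape shown above (use `jq` instead if it is available):

```shell
# Extract the hex peer count from a net_peerCount response on stdin
# and print it as a decimal integer.
peer_count() {
  hex=$(sed -n 's/.*"result":"0x\([0-9a-fA-F]*\)".*/\1/p')
  printf '%d\n' "$((0x${hex:-0}))"
}

# Example with a canned response; in practice pipe the curl output in:
#   curl -s ... | peer_count
resp='{"jsonrpc":"2.0","result":"0xa","id":1}'
n=$(printf '%s' "$resp" | peer_count)
echo "peers: $n"   # prints "peers: 10"
[ "$n" -gt 0 ] || echo "WARN: node has no peers"
```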

Troubleshooting

Node Not Syncing

  • Check peer count: tenzro-cli info
  • Verify bootstrap nodes are reachable
  • Check firewall allows port 9000 inbound
  • Review logs for connection errors

High Memory Usage

  • Reduce RocksDB cache: --storage-cache-size 512MB
  • Lower mempool size: --mempool-max-size 5000
  • Check for memory leaks in logs

RPC Timeout

  • Increase timeout: --rpc-timeout 60000
  • Check CPU/disk I/O saturation
  • Consider scaling to dedicated RPC nodes