Kubernetes

Tenzro Network's production testnet runs on Google Kubernetes Engine (GKE) with StatefulSets for validators, Deployments for RPC nodes, and Caddy as a reverse proxy with automatic Let's Encrypt TLS. This guide covers Kubernetes manifests, deployment strategies, and operational best practices.

Architecture Overview

The testnet deployment consists of:

3 Validator Nodes — StatefulSet with persistent volumes

1 RPC Node — Deployment serving JSON-RPC, Web API, MCP, and A2A

Caddy Reverse Proxy — LoadBalancer service with auto-TLS

Services — ClusterIP for internal communication, LoadBalancer for external access

Kubernetes Cluster — Managed Kubernetes in your cloud provider region

Kubernetes Cluster Configuration

Production deployments use managed Kubernetes services (GKE, EKS, AKS) with node pools optimized for validator and RPC workloads.

# Example: Create cluster with your cloud provider CLI
# Shown for GKE with gcloud; EKS and AKS have equivalents (eksctl, az aks)
gcloud container clusters create your-tenzro-cluster \
  --region your-region \
  --num-nodes 4 \
  --machine-type recommended-instance-type \
  --disk-type pd-ssd \
  --disk-size 100 \
  --enable-autoscaling \
  --min-nodes 3 \
  --max-nodes 6

# Get cluster credentials (adds a kubectl context)
gcloud container clusters get-credentials your-tenzro-cluster \
  --region your-region

# Verify cluster access
kubectl cluster-info
kubectl get nodes

Validator StatefulSet

Validators use StatefulSets for stable network identities and persistent storage. Each validator gets a unique PersistentVolumeClaim for RocksDB data.

# validator-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tenzro-validator
  namespace: default
spec:
  serviceName: tenzro-validator
  replicas: 3
  selector:
    matchLabels:
      app: tenzro-validator
  template:
    metadata:
      labels:
        app: tenzro-validator
    spec:
      containers:
      - name: tenzro-node
        image: your-registry/tenzro-node:latest
        imagePullPolicy: Always
        args:
        - --role
        - validator
        - --listen-addr
        - /ip4/0.0.0.0/tcp/9000
        - --rpc-addr
        - 0.0.0.0:8545
        - --data-dir
        - /data
        ports:
        - name: p2p
          containerPort: 9000
          protocol: TCP
        - name: rpc
          containerPort: 8545
          protocol: TCP
        - name: api
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: "4Gi"
            cpu: "2000m"
          limits:
            memory: "8Gi"
            cpu: "4000m"
        volumeMounts:
        - name: data
          mountPath: /data
        env:
        - name: RUST_LOG
          value: "info"
        - name: TENZRO_CHAIN_ID
          value: "1337"
        livenessProbe:
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: pd-ssd
      resources:
        requests:
          storage: 100Gi

Validator Service

Headless service for stable DNS names (tenzro-validator-0, tenzro-validator-1, tenzro-validator-2) and P2P discovery.

# validator-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-validator
  namespace: default
spec:
  clusterIP: None  # Headless service
  selector:
    app: tenzro-validator
  ports:
  - name: p2p
    port: 9000
    targetPort: 9000
    protocol: TCP
  - name: rpc
    port: 8545
    targetPort: 8545
    protocol: TCP
  - name: api
    port: 8080
    targetPort: 8080
    protocol: TCP
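Because the service is headless, each StatefulSet pod gets a predictable FQDN of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. A minimal sketch of constructing those names (assuming the `default` namespace and the default `cluster.local` domain), e.g. to build P2P bootstrap addresses:

```python
# Stable DNS names the headless service provides for each StatefulSet replica:
# <pod-name>.<service>.<namespace>.svc.<cluster-domain>
def validator_fqdn(ordinal: int,
                   service: str = "tenzro-validator",
                   namespace: str = "default",
                   cluster_domain: str = "cluster.local") -> str:
    pod = f"{service}-{ordinal}"  # StatefulSet pods are named <service>-<ordinal>
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"

# libp2p-style multiaddrs for P2P bootstrap (port 9000, per the StatefulSet)
peers = [f"/dns4/{validator_fqdn(i)}/tcp/9000" for i in range(3)]
for p in peers:
    print(p)
```

These names stay stable across pod restarts, which is why validators can reference each other by DNS rather than pod IP.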

RPC Deployment

RPC nodes are stateless and use Deployments for easy scaling. They serve JSON-RPC, Web API, MCP, and A2A protocol servers.

# rpc-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tenzro-rpc
  template:
    metadata:
      labels:
        app: tenzro-rpc
    spec:
      containers:
      - name: tenzro-node
        image: your-registry/tenzro-node:latest
        imagePullPolicy: Always
        args:
        - --role
        - light-client
        - --rpc-addr
        - 0.0.0.0:8545
        - --mcp-addr
        - 0.0.0.0:3001
        - --a2a-addr
        - 0.0.0.0:3002
        - --data-dir
        - /data
        ports:
        - name: rpc
          containerPort: 8545
          protocol: TCP
        - name: api
          containerPort: 8080
          protocol: TCP
        - name: mcp
          containerPort: 3001
          protocol: TCP
        - name: a2a
          containerPort: 3002
          protocol: TCP
        resources:
          requests:
            memory: "2Gi"
            cpu: "1000m"
          limits:
            memory: "4Gi"
            cpu: "2000m"
        volumeMounts:
        - name: data
          mountPath: /data
        env:
        - name: RUST_LOG
          value: "info"
        livenessProbe:
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
      volumes:
      - name: data
        emptyDir: {}  # RPC nodes don't need persistence

RPC Service

ClusterIP service for internal routing. External access goes through Caddy reverse proxy.

# rpc-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: tenzro-rpc
  ports:
  - name: rpc
    port: 8545
    targetPort: 8545
    protocol: TCP
  - name: api
    port: 8080
    targetPort: 8080
    protocol: TCP
  - name: mcp
    port: 3001
    targetPort: 3001
    protocol: TCP
  - name: a2a
    port: 3002
    targetPort: 3002
    protocol: TCP

Caddy Reverse Proxy

Caddy provides automatic HTTPS with Let's Encrypt, reverse proxying to RPC nodes, and routing for all public endpoints: rpc.tenzro.network, api.tenzro.network, mcp.tenzro.network, a2a.tenzro.network.
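Once Caddy routes rpc.tenzro.network to the RPC service, clients POST standard JSON-RPC 2.0 bodies to it. A hedged stdlib sketch of building such a request (`eth_blockNumber` is used here as an illustrative Ethereum-style method; substitute whatever methods the Tenzro RPC actually exposes):

```python
import json
import urllib.request

def jsonrpc_request(method: str, params: list, req_id: int = 1) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 POST request for the public RPC endpoint."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": req_id}
    return urllib.request.Request(
        "https://rpc.tenzro.network",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = jsonrpc_request("eth_blockNumber", [])
# urllib.request.urlopen(req) would send it; here we only inspect the payload
body = json.loads(req.data)
print(body["method"])
```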

Caddy ConfigMap

# caddy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: caddy-config
  namespace: default
data:
  Caddyfile: |
    # JSON-RPC endpoint
    rpc.tenzro.network {
      reverse_proxy tenzro-rpc:8545
      log {
        output stdout
      }
    }

    # Web API and faucet
    api.tenzro.network {
      reverse_proxy tenzro-rpc:8080
      log {
        output stdout
      }
    }

    # MCP server
    mcp.tenzro.network {
      reverse_proxy tenzro-rpc:3001
      log {
        output stdout
      }
    }

    # A2A protocol server
    a2a.tenzro.network {
      reverse_proxy tenzro-rpc:3002
      log {
        output stdout
      }
    }

Caddy Deployment

# caddy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: caddy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: caddy
  template:
    metadata:
      labels:
        app: caddy
    spec:
      containers:
      - name: caddy
        image: caddy:2-alpine
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        volumeMounts:
        - name: config
          mountPath: /etc/caddy
        - name: data
          mountPath: /data
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
      volumes:
      - name: config
        configMap:
          name: caddy-config
      - name: data
        persistentVolumeClaim:
          claimName: caddy-data

Caddy Service

LoadBalancer service with an external IP, exposing ports 80 and 443. Port 80 stays open so Caddy can answer Let's Encrypt HTTP-01 challenges and redirect plain HTTP to HTTPS.

# caddy-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: caddy
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: caddy
  ports:
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP
  - name: https
    port: 443
    targetPort: 443
    protocol: TCP

---
# caddy-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: caddy-data
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 10Gi

ConfigMaps

Store configuration as ConfigMaps for easy updates without rebuilding images. Genesis block configuration, network parameters, and bootstrap nodes can be defined here.

# tenzro-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tenzro-config
  namespace: default
data:
  chain_id: "1337"
  network_fee: "0.005"
  genesis_supply: "1000000000000000000000000000"  # 1B TNZO
  faucet_allocation: "10000000000000000000000000"  # 10M TNZO
  faucet_amount: "100000000000000000000"  # 100 TNZO per request
  faucet_cooldown: "86400"  # 24 hours
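The base-unit strings above assume 18 decimal places per TNZO; the inline comments translate them to whole-token amounts. A quick sanity check of that arithmetic:

```python
DECIMALS = 18
ONE_TNZO = 10 ** DECIMALS

def to_tnzo(base_units: str) -> int:
    """Convert a base-unit string from the ConfigMap to whole TNZO."""
    return int(base_units) // ONE_TNZO

genesis = to_tnzo("1000000000000000000000000000")     # 1B TNZO
faucet_alloc = to_tnzo("10000000000000000000000000")  # 10M TNZO
per_request = to_tnzo("100000000000000000000")        # 100 TNZO
cooldown_hours = 86400 // 3600                        # 24 hours

print(genesis, faucet_alloc, per_request, cooldown_hours)
```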

Secrets

Store sensitive data like validator keys and API credentials in Kubernetes Secrets. Never commit secrets to version control.

# Create secret from file
kubectl create secret generic tenzro-validator-keys \
  --from-file=validator-0.key \
  --from-file=validator-1.key \
  --from-file=validator-2.key

# Create secret from literal
kubectl create secret generic tenzro-api-secret \
  --from-literal=api-key=your-api-key-here

# View secrets (base64 encoded)
kubectl get secret tenzro-validator-keys -o yaml

# Mount the secret in a pod spec (fragment)
spec:
  containers:
  - name: tenzro-node
    volumeMounts:
    - name: keys
      mountPath: /keys
      readOnly: true
  volumes:
  - name: keys
    secret:
      secretName: tenzro-validator-keys
      defaultMode: 0400
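Note that `kubectl get secret ... -o yaml` shows values base64-encoded, not encrypted — anyone with read access to the Secret can decode them. A small sketch of the encoding Kubernetes applies (the command-line equivalent of decoding is `base64 -d`):

```python
import base64

# Kubernetes stores each secret value base64-encoded in the `data` field
raw = b"your-api-key-here"
encoded = base64.b64encode(raw).decode()
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw)
```

For real protection, enable encryption at rest for Secrets and restrict access with RBAC.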

Deployment Commands

Initial Deployment

# Apply all manifests
kubectl apply -f deploy/kubernetes/

# Or apply individually in order
kubectl apply -f validator-statefulset.yaml
kubectl apply -f validator-service.yaml
kubectl apply -f rpc-deployment.yaml
kubectl apply -f rpc-service.yaml
kubectl apply -f caddy-configmap.yaml
kubectl apply -f caddy-deployment.yaml
kubectl apply -f caddy-service.yaml

# Wait for pods to be ready
kubectl wait --for=condition=ready pod -l app=tenzro-validator --timeout=300s
kubectl wait --for=condition=ready pod -l app=tenzro-rpc --timeout=120s

# Check pod status
kubectl get pods -o wide

# Check services
kubectl get services

Scaling Operations

# Scale validators (requires genesis update)
kubectl scale statefulset tenzro-validator --replicas=5

# Scale RPC nodes (horizontal scaling)
kubectl scale deployment tenzro-rpc --replicas=3

# Autoscale RPC based on CPU
kubectl autoscale deployment tenzro-rpc \
  --min=1 --max=5 --cpu-percent=70

# Check autoscaler status
kubectl get hpa
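The autoscaler follows Kubernetes' standard HPA rule: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds. A worked sketch with the 70% CPU target configured above:

```python
import math

def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float = 70.0,
                     min_replicas: int = 1, max_replicas: int = 5) -> int:
    """Kubernetes HPA scaling formula, clamped to the configured bounds."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

# Two RPC pods averaging 105% CPU: ceil(2 * 105 / 70) = 3
print(desired_replicas(2, 105.0))
# One pod at 20% CPU stays at the minimum of 1
print(desired_replicas(1, 20.0))
```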

Updates and Rollouts

# Update to new image version
kubectl set image statefulset/tenzro-validator \
  tenzro-node=your-registry/tenzro-node:v0.2.0

kubectl set image deployment/tenzro-rpc \
  tenzro-node=your-registry/tenzro-node:v0.2.0

# Watch rollout status
kubectl rollout status statefulset/tenzro-validator
kubectl rollout status deployment/tenzro-rpc

# Rollback to previous version
kubectl rollout undo statefulset/tenzro-validator
kubectl rollout undo deployment/tenzro-rpc

# View rollout history
kubectl rollout history statefulset/tenzro-validator

Monitoring and Logs

Log Access

# View logs for all validators
kubectl logs -l app=tenzro-validator --tail=100 -f

# View logs for specific validator
kubectl logs tenzro-validator-0 -f

# View logs for RPC nodes
kubectl logs -l app=tenzro-rpc --tail=100 -f

# View logs for Caddy
kubectl logs -l app=caddy -f

# View logs from previous container (after crash)
kubectl logs tenzro-validator-0 --previous

# Stream logs from all containers
kubectl logs -f --selector app=tenzro-validator --all-containers=true

Resource Monitoring

# Check resource usage
kubectl top nodes
kubectl top pods

# Describe pod for events and conditions
kubectl describe pod tenzro-validator-0

# Check persistent volume usage
kubectl get pvc
kubectl describe pvc data-tenzro-validator-0

# Port forward for debugging
kubectl port-forward tenzro-validator-0 8545:8545

# Execute commands in pod
kubectl exec -it tenzro-validator-0 -- /bin/sh

Backup and Recovery

Volume Snapshots

Most managed Kubernetes services support the VolumeSnapshot API for backing up persistent volumes (a CSI driver and a VolumeSnapshotClass must be installed). Create snapshots before major upgrades.

# volume-snapshot.yaml — create a VolumeSnapshot (apply with kubectl apply -f)
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: validator-0-snapshot
spec:
  volumeSnapshotClassName: pd-ssd-snapshot
  source:
    persistentVolumeClaimName: data-tenzro-validator-0

# restored-pvc.yaml — restore the snapshot into a new PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-tenzro-validator-0-restored
spec:
  dataSource:
    name: validator-0-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 100Gi

Manual Backup

# Backup validator data to local machine
kubectl exec tenzro-validator-0 -- tar czf - /data > validator-0-backup.tar.gz

# Restore from backup
cat validator-0-backup.tar.gz | kubectl exec -i tenzro-validator-0 -- tar xzf - -C /

Troubleshooting

Pod Crashes

# Check pod events
kubectl describe pod tenzro-validator-0

# Check logs from crashed container
kubectl logs tenzro-validator-0 --previous

# Check resource constraints
kubectl top pod tenzro-validator-0

# Check node capacity
kubectl describe node <node-name>

Network Issues

# Test service connectivity
kubectl run curl-test --image=curlimages/curl -i --rm --restart=Never \
  -- curl http://tenzro-rpc:8545

# Check DNS resolution
kubectl run dns-test --image=busybox -i --rm --restart=Never \
  -- nslookup tenzro-rpc

# Check service endpoints
kubectl get endpoints tenzro-rpc

Storage Issues

# Check PVC status
kubectl get pvc

# Describe PVC for events
kubectl describe pvc data-tenzro-validator-0

# Check PV (persistent volume)
kubectl get pv

# Resize PVC (if storage class supports it)
kubectl patch pvc data-tenzro-validator-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

Production Checklist

✓ StatefulSets with PersistentVolumeClaims for validators

✓ Resource requests and limits configured

✓ Liveness and readiness probes enabled

✓ Secrets for sensitive data (never in ConfigMaps)

✓ Network policies for pod-to-pod communication

✓ Horizontal Pod Autoscaler for RPC nodes

✓ VolumeSnapshots for backup and disaster recovery

✓ Caddy reverse proxy with automatic TLS

✓ DNS configured for public endpoints

✓ Monitoring and alerting (Prometheus/Grafana)