Kubernetes
Tenzro Network's production testnet runs on Google Kubernetes Engine (GKE) with StatefulSets for validators, Deployments for RPC nodes, and Caddy as a reverse proxy with automatic Let's Encrypt TLS. This guide covers Kubernetes manifests, deployment strategies, and operational best practices.
Architecture Overview
The testnet deployment consists of:
3 Validator Nodes — StatefulSet with persistent volumes
1 RPC Node — Deployment serving JSON-RPC, Web API, MCP, and A2A
Caddy Reverse Proxy — LoadBalancer service with auto-TLS
Services — ClusterIP for internal communication, LoadBalancer for external access
Kubernetes Cluster — Managed Kubernetes in your cloud provider region
Kubernetes Cluster Configuration
Production deployments use managed Kubernetes services (GKE, EKS, AKS) with node pools optimized for validator and RPC workloads.
# Example: create a GKE cluster (adjust for your provider: gcloud, eksctl, az)
# Note: clusters are created with the cloud CLI, not kubectl
gcloud container clusters create your-tenzro-cluster \
  --region your-region \
  --num-nodes 4 \
  --machine-type recommended-instance-type \
  --disk-type pd-ssd \
  --disk-size 100 \
  --enable-autoscaling \
  --min-nodes 3 \
  --max-nodes 6

# Get cluster credentials (writes the context into your kubeconfig)
gcloud container clusters get-credentials your-tenzro-cluster --region your-region

# Verify cluster access
kubectl cluster-info
kubectl get nodes
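To keep validators on their own node pool (as suggested above), the pod spec can pin them with a nodeSelector and a matching toleration. A minimal sketch; the `workload` label and `dedicated` taint names are illustrative assumptions, not part of the Tenzro manifests:

```yaml
# Sketch: pod-spec fragment pinning validator pods to a dedicated node pool.
# Assumes the pool is labeled workload=validator and tainted dedicated=validator:NoSchedule.
spec:
  nodeSelector:
    workload: validator        # matches the label on the validator node pool
  tolerations:
    - key: dedicated
      operator: Equal
      value: validator
      effect: NoSchedule       # permits scheduling onto the tainted pool
```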
Validator StatefulSet
Validators use StatefulSets for stable network identities and persistent storage. Each validator gets a unique PersistentVolumeClaim for RocksDB data.
# validator-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tenzro-validator
  namespace: default
spec:
  serviceName: tenzro-validator
  replicas: 3
  selector:
    matchLabels:
      app: tenzro-validator
  template:
    metadata:
      labels:
        app: tenzro-validator
    spec:
      containers:
        - name: tenzro-node
          image: your-registry/tenzro-node:latest
          imagePullPolicy: Always
          args:
            - --role
            - validator
            - --listen-addr
            - /ip4/0.0.0.0/tcp/9000
            - --rpc-addr
            - 0.0.0.0:8545
            - --data-dir
            - /data
          ports:
            - name: p2p
              containerPort: 9000
              protocol: TCP
            - name: rpc
              containerPort: 8545
              protocol: TCP
            - name: api
              containerPort: 8080
              protocol: TCP
          resources:
            requests:
              memory: "4Gi"
              cpu: "2000m"
            limits:
              memory: "8Gi"
              cpu: "4000m"
          volumeMounts:
            - name: data
              mountPath: /data
          env:
            - name: RUST_LOG
              value: "info"
            - name: TENZRO_CHAIN_ID
              value: "1337"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: pd-ssd
        resources:
          requests:
            storage: 100Gi

Validator Service
A headless service gives each validator a stable DNS name (tenzro-validator-0, tenzro-validator-1, tenzro-validator-2) for P2P discovery.
# validator-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-validator
  namespace: default
spec:
  clusterIP: None # Headless service
  selector:
    app: tenzro-validator
  ports:
    - name: p2p
      port: 9000
      targetPort: 9000
      protocol: TCP
    - name: rpc
      port: 8545
      targetPort: 8545
      protocol: TCP
    - name: api
      port: 8080
      targetPort: 8080
      protocol: TCP

RPC Deployment
RPC nodes are stateless and use Deployments for easy scaling. They serve JSON-RPC, Web API, MCP, and A2A protocol servers.
# rpc-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tenzro-rpc
  template:
    metadata:
      labels:
        app: tenzro-rpc
    spec:
      containers:
        - name: tenzro-node
          image: your-registry/tenzro-node:latest
          imagePullPolicy: Always
          args:
            - --role
            - light-client
            - --rpc-addr
            - 0.0.0.0:8545
            - --mcp-addr
            - 0.0.0.0:3001
            - --a2a-addr
            - 0.0.0.0:3002
            - --data-dir
            - /data
          ports:
            - name: rpc
              containerPort: 8545
              protocol: TCP
            - name: api
              containerPort: 8080
              protocol: TCP
            - name: mcp
              containerPort: 3001
              protocol: TCP
            - name: a2a
              containerPort: 3002
              protocol: TCP
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          volumeMounts:
            - name: data
              mountPath: /data
          env:
            - name: RUST_LOG
              value: "info"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
      volumes:
        - name: data
          emptyDir: {} # RPC nodes don't need persistence

RPC Service
ClusterIP service for internal routing. External access goes through Caddy reverse proxy.
# rpc-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: tenzro-rpc
  ports:
    - name: rpc
      port: 8545
      targetPort: 8545
      protocol: TCP
    - name: api
      port: 8080
      targetPort: 8080
      protocol: TCP
    - name: mcp
      port: 3001
      targetPort: 3001
      protocol: TCP
    - name: a2a
      port: 3002
      targetPort: 3002
      protocol: TCP

Caddy Reverse Proxy
Caddy provides automatic HTTPS with Let's Encrypt, reverse proxying to RPC nodes, and routing for all public endpoints: rpc.tenzro.network, api.tenzro.network, mcp.tenzro.network, a2a.tenzro.network.
Caddy ConfigMap
# caddy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: caddy-config
  namespace: default
data:
  Caddyfile: |
    # JSON-RPC endpoint
    rpc.tenzro.network {
      reverse_proxy tenzro-rpc:8545
      log {
        output stdout
      }
    }

    # Web API and faucet
    api.tenzro.network {
      reverse_proxy tenzro-rpc:8080
      log {
        output stdout
      }
    }

    # MCP server
    mcp.tenzro.network {
      reverse_proxy tenzro-rpc:3001
      log {
        output stdout
      }
    }

    # A2A protocol server
    a2a.tenzro.network {
      reverse_proxy tenzro-rpc:3002
      log {
        output stdout
      }
    }

Caddy Deployment
# caddy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: caddy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: caddy
  template:
    metadata:
      labels:
        app: caddy
    spec:
      containers:
        - name: caddy
          image: caddy:2-alpine
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
          volumeMounts:
            - name: config
              mountPath: /etc/caddy
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
      volumes:
        - name: config
          configMap:
            name: caddy-config
        - name: data
          persistentVolumeClaim:
            claimName: caddy-data

Caddy Service
LoadBalancer service with external IP for HTTPS traffic on ports 80 and 443.
# caddy-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: caddy
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: caddy
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
---
# caddy-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: caddy-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 10Gi

ConfigMaps
Store configuration as ConfigMaps for easy updates without rebuilding images. Genesis block configuration, network parameters, and bootstrap nodes can be defined here.
# tenzro-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tenzro-config
  namespace: default
data:
  chain_id: "1337"
  network_fee: "0.005"
  genesis_supply: "1000000000000000000000000000" # 1B TNZO
  faucet_allocation: "10000000000000000000000000" # 10M TNZO
  faucet_amount: "100000000000000000000" # 100 TNZO per request
  faucet_cooldown: "86400" # 24 hours
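One way to consume these values without rebuilding images is to source container environment variables from the ConfigMap. A sketch, wiring the `chain_id` key into the `TENZRO_CHAIN_ID` variable the validator StatefulSet already sets:

```yaml
# Sketch: env fragment for the tenzro-node container, replacing the
# hard-coded TENZRO_CHAIN_ID value with a reference to tenzro-config.
env:
  - name: TENZRO_CHAIN_ID
    valueFrom:
      configMapKeyRef:
        name: tenzro-config
        key: chain_id
```

Updating the ConfigMap then only requires a pod restart, not a new image.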
Secrets
Store sensitive data like validator keys and API credentials in Kubernetes Secrets. Never commit secrets to version control.
# Create secret from file
kubectl create secret generic tenzro-validator-keys \
  --from-file=validator-0.key \
  --from-file=validator-1.key \
  --from-file=validator-2.key

# Create secret from literal
kubectl create secret generic tenzro-api-secret \
  --from-literal=api-key=your-api-key-here

# View secrets (base64 encoded)
kubectl get secret tenzro-validator-keys -o yaml

# Use secret in pod
spec:
  containers:
    - name: tenzro-node
      volumeMounts:
        - name: keys
          mountPath: /keys
          readOnly: true
  volumes:
    - name: keys
      secret:
        secretName: tenzro-validator-keys
        defaultMode: 0400

Deployment Commands
Initial Deployment
# Apply all manifests
kubectl apply -f deploy/kubernetes/

# Or apply individually in order
kubectl apply -f validator-statefulset.yaml
kubectl apply -f validator-service.yaml
kubectl apply -f rpc-deployment.yaml
kubectl apply -f rpc-service.yaml
kubectl apply -f caddy-configmap.yaml
kubectl apply -f caddy-deployment.yaml
kubectl apply -f caddy-service.yaml

# Wait for pods to be ready
kubectl wait --for=condition=ready pod -l app=tenzro-validator --timeout=300s
kubectl wait --for=condition=ready pod -l app=tenzro-rpc --timeout=120s

# Check pod status
kubectl get pods -o wide

# Check services
kubectl get services
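If the manifests live in one directory, a Kustomization file is a lightweight way to record the set of resources and apply them with `kubectl apply -k`. A sketch, assuming the file names used above:

```yaml
# deploy/kubernetes/kustomization.yaml (sketch)
# Apply with: kubectl apply -k deploy/kubernetes/
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - validator-statefulset.yaml
  - validator-service.yaml
  - rpc-deployment.yaml
  - rpc-service.yaml
  - caddy-configmap.yaml
  - caddy-deployment.yaml
  - caddy-service.yaml
```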
Scaling Operations
# Scale validators (requires genesis update)
kubectl scale statefulset tenzro-validator --replicas=5

# Scale RPC nodes (horizontal scaling)
kubectl scale deployment tenzro-rpc --replicas=3

# Autoscale RPC based on CPU
kubectl autoscale deployment tenzro-rpc \
  --min=1 --max=5 --cpu-percent=70

# Check autoscaler status
kubectl get hpa
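The `kubectl autoscale` command above creates a HorizontalPodAutoscaler object; the declarative equivalent, which can live alongside the other manifests, looks like this sketch:

```yaml
# rpc-hpa.yaml (sketch): declarative form of the kubectl autoscale command above
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tenzro-rpc
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```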
Updates and Rollouts
# Update to new image version
kubectl set image statefulset/tenzro-validator \
  tenzro-node=your-registry/tenzro-node:v0.2.0
kubectl set image deployment/tenzro-rpc \
  tenzro-node=your-registry/tenzro-node:v0.2.0

# Watch rollout status
kubectl rollout status statefulset/tenzro-validator
kubectl rollout status deployment/tenzro-rpc

# Rollback to previous version
kubectl rollout undo statefulset/tenzro-validator
kubectl rollout undo deployment/tenzro-rpc

# View rollout history
kubectl rollout history statefulset/tenzro-validator
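For validators, a staged rollout reduces the blast radius of a bad image: the StatefulSet RollingUpdate strategy supports a partition, so only pods with an ordinal at or above the partition are updated. A sketch of the spec fragment:

```yaml
# Sketch: canary-style rollout for the validator StatefulSet.
# With partition: 2, only tenzro-validator-2 receives the new image;
# after verifying it, lower the partition to 0 to roll out the rest.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2
```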
Monitoring and Logs
Log Access
# View logs for all validators
kubectl logs -l app=tenzro-validator --tail=100 -f

# View logs for specific validator
kubectl logs tenzro-validator-0 -f

# View logs for RPC nodes
kubectl logs -l app=tenzro-rpc --tail=100 -f

# View logs for Caddy
kubectl logs -l app=caddy -f

# View logs from previous container (after crash)
kubectl logs tenzro-validator-0 --previous

# Stream logs from all containers
kubectl logs -f --selector app=tenzro-validator --all-containers=true
Resource Monitoring
# Check resource usage
kubectl top nodes
kubectl top pods

# Describe pod for events and conditions
kubectl describe pod tenzro-validator-0

# Check persistent volume usage
kubectl get pvc
kubectl describe pvc data-tenzro-validator-0

# Port forward for debugging
kubectl port-forward tenzro-validator-0 8545:8545

# Execute commands in pod
kubectl exec -it tenzro-validator-0 -- /bin/sh
Backup and Recovery
Volume Snapshots
Managed Kubernetes supports VolumeSnapshots for backing up persistent volumes. Create snapshots before major upgrades.
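The `volumeSnapshotClassName` referenced by a VolumeSnapshot must exist in the cluster. A sketch for GKE's persistent-disk CSI driver; the `driver` value is an assumption that applies to GKE only and must be adjusted for other providers:

```yaml
# pd-ssd-snapshot class (sketch) -- adjust `driver` for your CSI provisioner
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: pd-ssd-snapshot
driver: pd.csi.storage.gke.io
deletionPolicy: Retain   # keep the underlying snapshot if the object is deleted
```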
# Create VolumeSnapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: validator-0-snapshot
spec:
  volumeSnapshotClassName: pd-ssd-snapshot
  source:
    persistentVolumeClaimName: data-tenzro-validator-0
---
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-tenzro-validator-0-restored
spec:
  dataSource:
    name: validator-0-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 100Gi

Manual Backup
# Backup validator data to local machine
kubectl exec tenzro-validator-0 -- tar czf - /data > validator-0-backup.tar.gz

# Restore from backup
cat validator-0-backup.tar.gz | kubectl exec -i tenzro-validator-0 -- tar xzf - -C /
Troubleshooting
Pod Crashes
# Check pod events
kubectl describe pod tenzro-validator-0

# Check logs from crashed container
kubectl logs tenzro-validator-0 --previous

# Check resource constraints
kubectl top pod tenzro-validator-0

# Check node capacity
kubectl describe node <node-name>
Network Issues
# Test service connectivity
kubectl run curl-test --image=curlimages/curl -i --rm --restart=Never \
  -- curl http://tenzro-rpc:8545

# Check DNS resolution
kubectl run dns-test --image=busybox -i --rm --restart=Never \
  -- nslookup tenzro-rpc

# Check service endpoints
kubectl get endpoints tenzro-rpc
Storage Issues
# Check PVC status
kubectl get pvc

# Describe PVC for events
kubectl describe pvc data-tenzro-validator-0

# Check PV (persistent volume)
kubectl get pv

# Resize PVC (if storage class supports it)
kubectl patch pvc data-tenzro-validator-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

Production Checklist
✓ StatefulSets with PersistentVolumeClaims for validators
✓ Resource requests and limits configured
✓ Liveness and readiness probes enabled
✓ Secrets for sensitive data (never in ConfigMaps)
✓ Network policies for pod-to-pod communication
✓ Horizontal Pod Autoscaler for RPC nodes
✓ VolumeSnapshots for backup and disaster recovery
✓ Caddy reverse proxy with automatic TLS
✓ DNS configured for public endpoints
✓ Monitoring and alerting (Prometheus/Grafana)
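The network-policy item above can start from a sketch like the following, which limits inbound validator traffic to P2P connections from other Tenzro pods (label names match the manifests in this guide; the exact rules you need depend on your monitoring and probe setup):

```yaml
# Sketch: restrict validator ingress to P2P traffic from Tenzro pods.
# Kubelet health probes are typically unaffected by NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenzro-validator-ingress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: tenzro-validator
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: tenzro-validator
        - podSelector:
            matchLabels:
              app: tenzro-rpc
      ports:
        - protocol: TCP
          port: 9000
```

Note that NetworkPolicy requires a CNI plugin that enforces it (enabled by default on some managed clusters, opt-in on others).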