Kubernetes
Tenzro Network's production testnet runs on Google Kubernetes Engine (GKE) with StatefulSets for validators, Deployments for RPC nodes, and Caddy as a reverse proxy with automatic Let's Encrypt TLS. This guide covers Kubernetes manifests, deployment strategies, and operational best practices.
Architecture Overview
The testnet deployment consists of:
3 Validator Nodes — StatefulSet with persistent volumes
1 RPC Node — Deployment serving JSON-RPC, Web API, MCP, and A2A
Caddy Reverse Proxy — LoadBalancer service with auto-TLS
Services — ClusterIP for internal communication, LoadBalancer for external access
Kubernetes Cluster — Managed Kubernetes in your cloud provider region
Kubernetes Cluster Configuration
Production deployments use managed Kubernetes services (GKE, EKS, AKS) with node pools optimized for validator and RPC workloads.
# Example: Create a cluster with your cloud provider's CLI
# (shown for GKE; adjust for eksctl or az on other providers)
gcloud container clusters create your-tenzro-cluster \
  --region your-region \
  --num-nodes 4 \
  --machine-type recommended-instance-type \
  --disk-type pd-ssd \
  --disk-size 100 \
  --enable-autoscaling \
  --min-nodes 3 \
  --max-nodes 6
# Fetch cluster credentials into your kubeconfig
gcloud container clusters get-credentials your-tenzro-cluster --region your-region
# Verify cluster access
kubectl cluster-info
kubectl get nodes
Validator StatefulSet
Validators use StatefulSets for stable network identities and persistent storage. Each validator gets a unique PersistentVolumeClaim for RocksDB data.
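For each replica, the StatefulSet controller turns the volumeClaimTemplates entry into a dedicated PVC named <template>-<statefulset>-<ordinal>. A quick sketch of the names to expect from kubectl get pvc:

```shell
# PVC names derived from volumeClaimTemplates:
# <template-name>-<statefulset-name>-<ordinal>
for i in 0 1 2; do
  echo "data-tenzro-validator-$i"
done
```

These are the names referenced later in the backup and troubleshooting commands (e.g. data-tenzro-validator-0).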
# validator-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tenzro-validator
  namespace: default
spec:
  serviceName: tenzro-validator
  replicas: 3
  selector:
    matchLabels:
      app: tenzro-validator
  template:
    metadata:
      labels:
        app: tenzro-validator
    spec:
      containers:
        - name: tenzro-node
          image: your-registry/tenzro-node:latest
          imagePullPolicy: Always
          args:
            - --role
            - validator
            - --listen-addr
            - /ip4/0.0.0.0/tcp/9000
            - --rpc-addr
            - 0.0.0.0:8545
            - --data-dir
            - /data
          ports:
            - name: p2p
              containerPort: 9000
              protocol: TCP
            - name: rpc
              containerPort: 8545
              protocol: TCP
            - name: api
              containerPort: 8080
              protocol: TCP
          resources:
            requests:
              memory: "4Gi"
              cpu: "2000m"
            limits:
              memory: "8Gi"
              cpu: "4000m"
          volumeMounts:
            - name: data
              mountPath: /data
          env:
            - name: RUST_LOG
              value: "info"
            - name: TENZRO_CHAIN_ID
              value: "1337"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: pd-ssd
        resources:
          requests:
            storage: 100Gi
Validator Service
Headless service for stable DNS names (tenzro-validator-0, tenzro-validator-1, tenzro-validator-2) and P2P discovery.
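Because the service is headless, cluster DNS resolves each pod individually; the per-pod names follow <pod>.<service>.<namespace>.svc.<cluster-domain> (the default namespace and cluster.local domain are assumed here):

```shell
# Stable per-pod addresses validators can use as P2P peer targets
for i in 0 1 2; do
  echo "tenzro-validator-$i.tenzro-validator.default.svc.cluster.local:9000"
done
```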
# validator-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-validator
  namespace: default
spec:
  clusterIP: None # Headless service
  selector:
    app: tenzro-validator
  ports:
    - name: p2p
      port: 9000
      targetPort: 9000
      protocol: TCP
    - name: rpc
      port: 8545
      targetPort: 8545
      protocol: TCP
    - name: api
      port: 8080
      targetPort: 8080
      protocol: TCP
RPC Deployment
RPC nodes are stateless and use Deployments for easy scaling. They serve JSON-RPC, Web API, MCP, and A2A protocol servers.
# rpc-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tenzro-rpc
  template:
    metadata:
      labels:
        app: tenzro-rpc
    spec:
      containers:
        - name: tenzro-node
          image: your-registry/tenzro-node:latest
          imagePullPolicy: Always
          args:
            - --role
            - light-client
            - --rpc-addr
            - 0.0.0.0:8545
            - --mcp-addr
            - 0.0.0.0:3001
            - --a2a-addr
            - 0.0.0.0:3002
            - --data-dir
            - /data
          ports:
            - name: rpc
              containerPort: 8545
              protocol: TCP
            - name: api
              containerPort: 8080
              protocol: TCP
            - name: mcp
              containerPort: 3001
              protocol: TCP
            - name: a2a
              containerPort: 3002
              protocol: TCP
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          volumeMounts:
            - name: data
              mountPath: /data
          env:
            - name: RUST_LOG
              value: "info"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
      volumes:
        - name: data
          emptyDir: {} # RPC nodes don't need persistence
RPC Service
ClusterIP service for internal routing. External access goes through Caddy reverse proxy.
# rpc-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tenzro-rpc
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: tenzro-rpc
  ports:
    - name: rpc
      port: 8545
      targetPort: 8545
      protocol: TCP
    - name: api
      port: 8080
      targetPort: 8080
      protocol: TCP
    - name: mcp
      port: 3001
      targetPort: 3001
      protocol: TCP
    - name: a2a
      port: 3002
      targetPort: 3002
      protocol: TCP
Caddy Reverse Proxy
Caddy provides automatic HTTPS with Let's Encrypt, reverse proxying to RPC nodes, and routing for all public endpoints: rpc.tenzro.network, api.tenzro.network, mcp.tenzro.network, a2a.tenzro.network.
Caddy ConfigMap
# caddy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: caddy-config
  namespace: default
data:
  Caddyfile: |
    # JSON-RPC endpoint
    rpc.tenzro.network {
        reverse_proxy tenzro-rpc:8545
        log {
            output stdout
        }
    }

    # Web API and faucet
    api.tenzro.network {
        reverse_proxy tenzro-rpc:8080
        log {
            output stdout
        }
    }

    # MCP server
    mcp.tenzro.network {
        reverse_proxy tenzro-rpc:3001
        log {
            output stdout
        }
    }

    # A2A protocol server
    a2a.tenzro.network {
        reverse_proxy tenzro-rpc:3002
        log {
            output stdout
        }
    }
Caddy Deployment
# caddy-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: caddy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: caddy
  template:
    metadata:
      labels:
        app: caddy
    spec:
      containers:
        - name: caddy
          image: caddy:2-alpine
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
          volumeMounts:
            - name: config
              mountPath: /etc/caddy
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
      volumes:
        - name: config
          configMap:
            name: caddy-config
        - name: data
          persistentVolumeClaim:
            claimName: caddy-data
Caddy Service
LoadBalancer service with external IP for HTTPS traffic on ports 80 and 443.
# caddy-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: caddy
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: caddy
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
---
# caddy-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: caddy-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 10Gi
ConfigMaps
Store configuration as ConfigMaps for easy updates without rebuilding images. Genesis block configuration, network parameters, and bootstrap nodes can be defined here.
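The faucet values below imply a fixed request budget; as a sanity check (using whole-TNZO units rather than the 10^18-scaled integers stored in the ConfigMap):

```shell
# 10M TNZO faucet allocation at 100 TNZO per request:
allocation=10000000
per_request=100
echo "max faucet requests: $((allocation / per_request))"
# → max faucet requests: 100000
```

The 86400-second cooldown then limits each address to one of those requests per day.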
# tenzro-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tenzro-config
  namespace: default
data:
  chain_id: "1337"
  network_fee: "0.005"
  genesis_supply: "1000000000000000000000000000" # 1B TNZO
  faucet_allocation: "10000000000000000000000000" # 10M TNZO
  faucet_amount: "100000000000000000000" # 100 TNZO per request
  faucet_cooldown: "86400" # 24 hours
Secrets
Store sensitive data like validator keys and API credentials in Kubernetes Secrets. Never commit secrets to version control.
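Keep in mind that Secret values are only base64-encoded, not encrypted; anyone who can read the Secret object can recover the plaintext, so restrict access with RBAC. The encoding round trip is trivial:

```shell
# base64 is a reversible encoding, not a cipher
encoded="$(printf '%s' 'your-api-key-here' | base64)"
printf '%s\n' "$encoded"            # eW91ci1hcGkta2V5LWhlcmU=
printf '%s' "$encoded" | base64 -d  # your-api-key-here
```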
# Create secret from file
kubectl create secret generic tenzro-validator-keys \
--from-file=validator-0.key \
--from-file=validator-1.key \
--from-file=validator-2.key
# Create secret from literal
kubectl create secret generic tenzro-api-secret \
--from-literal=api-key=your-api-key-here
# View secrets (base64 encoded)
kubectl get secret tenzro-validator-keys -o yaml
# Use the secret in a pod spec (fragment)
spec:
  containers:
    - name: tenzro-node
      volumeMounts:
        - name: keys
          mountPath: /keys
          readOnly: true
  volumes:
    - name: keys
      secret:
        secretName: tenzro-validator-keys
        defaultMode: 0400
Deployment Commands
Initial Deployment
# Apply all manifests
kubectl apply -f deploy/kubernetes/
# Or apply individually in order
kubectl apply -f validator-statefulset.yaml
kubectl apply -f validator-service.yaml
kubectl apply -f rpc-deployment.yaml
kubectl apply -f rpc-service.yaml
kubectl apply -f caddy-configmap.yaml
kubectl apply -f caddy-deployment.yaml
kubectl apply -f caddy-service.yaml
# Wait for pods to be ready
kubectl wait --for=condition=ready pod -l app=tenzro-validator --timeout=300s
kubectl wait --for=condition=ready pod -l app=tenzro-rpc --timeout=120s
# Check pod status
kubectl get pods -o wide
# Check services
kubectl get services
Scaling Operations
# Scale validators (requires genesis update)
kubectl scale statefulset tenzro-validator --replicas=5
# Scale RPC nodes (horizontal scaling)
kubectl scale deployment tenzro-rpc --replicas=3
# Autoscale RPC based on CPU
kubectl autoscale deployment tenzro-rpc \
--min=1 --max=5 --cpu-percent=70
# Check autoscaler status
kubectl get hpa
Updates and Rollouts
# Update to new image version
kubectl set image statefulset/tenzro-validator \
tenzro-node=your-registry/tenzro-node:v0.2.0
kubectl set image deployment/tenzro-rpc \
tenzro-node=your-registry/tenzro-node:v0.2.0
# Watch rollout status
kubectl rollout status statefulset/tenzro-validator
kubectl rollout status deployment/tenzro-rpc
# Rollback to previous version
kubectl rollout undo statefulset/tenzro-validator
kubectl rollout undo deployment/tenzro-rpc
# View rollout history
kubectl rollout history statefulset/tenzro-validator
Monitoring and Logs
Log Access
# View logs for all validators
kubectl logs -l app=tenzro-validator --tail=100 -f
# View logs for specific validator
kubectl logs tenzro-validator-0 -f
# View logs for RPC nodes
kubectl logs -l app=tenzro-rpc --tail=100 -f
# View logs for Caddy
kubectl logs -l app=caddy -f
# View logs from previous container (after crash)
kubectl logs tenzro-validator-0 --previous
# Stream logs from all containers
kubectl logs -f --selector app=tenzro-validator --all-containers=true
Resource Monitoring
# Check resource usage
kubectl top nodes
kubectl top pods
# Describe pod for events and conditions
kubectl describe pod tenzro-validator-0
# Check persistent volume usage
kubectl get pvc
kubectl describe pvc data-tenzro-validator-0
# Port forward for debugging
kubectl port-forward tenzro-validator-0 8545:8545
# Execute commands in pod
kubectl exec -it tenzro-validator-0 -- /bin/sh
Backup and Recovery
Volume Snapshots
Managed Kubernetes supports VolumeSnapshots for backing up persistent volumes. Create snapshots before major upgrades.
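Each validator owns its own PVC, so a pre-upgrade backup means one snapshot per ordinal. A small sketch that generates the manifests (names follow the conventions used in this section; the snapshot class is an assumption, adjust it to what your cluster provides):

```shell
# Emit one VolumeSnapshot manifest per validator PVC;
# pipe the output to `kubectl apply -f -` to create them all.
for i in 0 1 2; do
  cat <<EOF
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: validator-$i-snapshot
spec:
  volumeSnapshotClassName: pd-ssd-snapshot
  source:
    persistentVolumeClaimName: data-tenzro-validator-$i
EOF
done
```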
# Create VolumeSnapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: validator-0-snapshot
spec:
  volumeSnapshotClassName: pd-ssd-snapshot
  source:
    persistentVolumeClaimName: data-tenzro-validator-0
---
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-tenzro-validator-0-restored
spec:
  dataSource:
    name: validator-0-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  storageClassName: pd-ssd
  resources:
    requests:
      storage: 100Gi
Manual Backup
# Backup validator data to local machine
kubectl exec tenzro-validator-0 -- tar czf - /data > validator-0-backup.tar.gz
# Restore from backup
cat validator-0-backup.tar.gz | kubectl exec -i tenzro-validator-0 -- tar xzf - -C /
Troubleshooting
Pod Crashes
# Check pod events
kubectl describe pod tenzro-validator-0
# Check logs from crashed container
kubectl logs tenzro-validator-0 --previous
# Check resource constraints
kubectl top pod tenzro-validator-0
# Check node capacity
kubectl describe node <node-name>
Network Issues
# Test service connectivity
kubectl run curl-test --image=curlimages/curl -i --rm --restart=Never \
-- curl http://tenzro-rpc:8545
# Check DNS resolution
kubectl run dns-test --image=busybox -i --rm --restart=Never \
-- nslookup tenzro-rpc
# Check service endpoints
kubectl get endpoints tenzro-rpc
Storage Issues
# Check PVC status
kubectl get pvc
# Describe PVC for events
kubectl describe pvc data-tenzro-validator-0
# Check PV (persistent volume)
kubectl get pv
# Resize PVC (if storage class supports it)
kubectl patch pvc data-tenzro-validator-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
Production Checklist
✓ StatefulSets with PersistentVolumeClaims for validators
✓ Resource requests and limits configured
✓ Liveness and readiness probes enabled
✓ Secrets for sensitive data (never in ConfigMaps)
✓ Network policies for pod-to-pod communication
✓ Horizontal Pod Autoscaler for RPC nodes
✓ VolumeSnapshots for backup and disaster recovery
✓ Caddy reverse proxy with automatic TLS
✓ DNS configured for public endpoints
✓ Monitoring and alerting (Prometheus/Grafana)