Model Serving
Tenzro enables any node to serve AI models and earn TNZO for inference. Models are downloaded from HuggingFace Hub in GGUF format and served locally. Providers register their model endpoints on-chain, set per-token pricing, and the InferenceRouter directs user requests to the best available provider based on latency, price, or reputation.
Provider Flow
Earning flow: a user sends an inference request, the router selects your node, your model generates a response, the user pays per token, and TNZO settles to your wallet via a micropayment channel.
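Per-token settlement is simple arithmetic: earnings for one response are tokens generated times your per-token price. A minimal sketch with illustrative numbers (0.0001 TNZO/token matches the pricing example in Step 3; on-chain amounts use 18-decimal base units as in the provider-stats response later on, so the math is done in BigInt):

```typescript
// Sketch: per-token settlement math (illustrative numbers only).
// On-chain TNZO amounts use 18-decimal base units, so BigInt avoids float error.
const ONE_TNZO = 10n ** 18n;
const perTokenPrice = ONE_TNZO / 10000n; // 0.0001 TNZO per token
const tokensGenerated = 200n;            // e.g. a response capped at max_tokens: 200

const earned = perTokenPrice * tokensGenerated;
console.log(earned.toString()); // "20000000000000000" base units = 0.02 TNZO
```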
Step 1: Download a Model
The CLI downloads GGUF models from HuggingFace Hub with SHA-256 integrity verification via the hf-hub crate:
# Download a model from HuggingFace
tenzro-cli model download unsloth/gemma-3-270m-it-GGUF
# Download a specific quantization
tenzro-cli model download TheBloke/Mistral-7B-Instruct-v0.2-GGUF \
--filename mistral-7b-instruct-v0.2.Q4_K_M.gguf
# List downloaded models
tenzro-cli model list --local
# Check model integrity (SHA-256 verification)
tenzro-cli model info unsloth/gemma-3-270m-it-GGUF
Step 2: Serve the Model
# Serve a model locally
tenzro-cli model serve unsloth/gemma-3-270m-it-GGUF \
--port 8080 \
--ctx-size 4096 \
--gpu-layers 35
# The serve command:
# 1. Starts local inference server with the GGUF file
# 2. Calls tenzro_serveModel RPC to register the endpoint
# 3. Exposes OpenAI-compatible API at http://localhost:8080/v1
# Serve on the network (registers endpoint on-chain)
tenzro-cli model serve unsloth/gemma-3-270m-it-GGUF \
--remote # Registers endpoint via tenzro_serveModel RPC
# Stop serving
tenzro-cli model stop unsloth/gemma-3-270m-it-GGUF
Step 3: Register as Provider
# Register as a model provider (requires staking)
tenzro-cli provider register --role model-provider
# Stake TNZO (required for provider registration)
tenzro-cli stake deposit --amount 1000 --role model-provider
# Set pricing for your model
tenzro-cli provider pricing set \
--model gemma3-270m \
--per-token 0.0001 # TNZO per token
# Show current pricing
tenzro-cli provider pricing show
# Set availability schedule
tenzro-cli provider schedule set \
--timezone UTC \
--hours 0-24 # 24/7 availability
# Check provider status
tenzro-cli provider status
Step 4: Chat and Earn
# Users can now chat with your model
tenzro-cli chat --model gemma3-270m
# The chat command:
# 1. Tries local inference server first
# 2. Falls back to tenzro_chat RPC (routes to your provider)
# 3. Interactive REPL with /history and /load session management
# Or via RPC (for apps)
curl -X POST https://rpc.tenzro.network \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"method": "tenzro_chat",
"params": [{
"model": "gemma3-270m",
"messages": [
{"role": "user", "content": "Hello!"}
],
"max_tokens": 200
}],
"id": 1
}'
Inference Routing Strategies
The InferenceRouter selects the best provider for each request using configurable strategies:
| Strategy | Description |
|---|---|
| Price | Route to the cheapest provider |
| Latency | Route to the fastest provider (lowest measured latency) |
| Reputation | Route to the highest-rated provider |
| Weighted | Balanced routing across all factors |
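The strategies above amount to different scoring functions over the candidate providers. A minimal sketch of strategy-based selection (the Provider shape and the weighted-strategy coefficients are hypothetical, not the actual InferenceRouter code):

```typescript
// Sketch of strategy-based provider selection.
// Provider fields and weighted coefficients are illustrative assumptions.
interface Provider {
  address: string;
  pricePerToken: number; // TNZO per token
  latencyMs: number;     // measured latency
  reputation: number;    // 0..1, higher is better
}

type Strategy = "price" | "latency" | "reputation" | "weighted";

// Lower score wins under every strategy.
function score(p: Provider, strategy: Strategy): number {
  switch (strategy) {
    case "price":      return p.pricePerToken;
    case "latency":    return p.latencyMs;
    case "reputation": return 1 - p.reputation;
    case "weighted":   // balance all factors (illustrative weights)
      return 0.4 * p.pricePerToken * 1e4
           + 0.3 * (p.latencyMs / 1000)
           + 0.3 * (1 - p.reputation);
  }
}

function selectProvider(providers: Provider[], strategy: Strategy): Provider {
  return providers.reduce((best, p) =>
    score(p, strategy) < score(best, strategy) ? p : best);
}

const candidates: Provider[] = [
  { address: "0xaaa", pricePerToken: 0.0001,  latencyMs: 245, reputation: 0.95 },
  { address: "0xbbb", pricePerToken: 0.00008, latencyMs: 900, reputation: 0.70 },
];
console.log(selectProvider(candidates, "latency").address); // "0xaaa"
console.log(selectProvider(candidates, "price").address);   // "0xbbb"
```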
Provider Health Monitoring
The ProviderManager runs background health checks on all registered providers. Providers that fail health checks are temporarily removed from the routing pool (a circuit-breaker pattern). Health metrics include response time, error rate, and availability.
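The circuit-breaker behavior can be sketched as follows; the thresholds, cooldown, and state shape here are hypothetical assumptions, not the actual ProviderManager logic:

```typescript
// Sketch of a circuit breaker over provider health checks.
// FAILURE_THRESHOLD and COOLDOWN_MS are illustrative assumptions.
interface Health {
  consecutiveFailures: number;
  openedAt?: number; // timestamp (ms) when the circuit opened
}

const FAILURE_THRESHOLD = 3; // failures before removal from the routing pool
const COOLDOWN_MS = 60_000;  // how long the provider stays removed

function recordCheck(h: Health, ok: boolean, now: number): Health {
  if (ok) return { consecutiveFailures: 0 }; // healthy: close the circuit
  const failures = h.consecutiveFailures + 1;
  return failures >= FAILURE_THRESHOLD
    ? { consecutiveFailures: failures, openedAt: h.openedAt ?? now } // open
    : { consecutiveFailures: failures };
}

function isRoutable(h: Health, now: number): boolean {
  // Routable unless the circuit is open and still cooling down.
  return h.openedAt === undefined || now - h.openedAt >= COOLDOWN_MS;
}

// Three failed checks open the circuit; the provider leaves the routing
// pool until the cooldown elapses, then is retried.
let h: Health = { consecutiveFailures: 0 };
for (const t of [0, 1000, 2000]) h = recordCheck(h, false, t);
console.log(isRoutable(h, 2000));          // false: circuit open
console.log(isRoutable(h, 2000 + 60_000)); // true: cooldown elapsed, retry
```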
# Check your provider stats
curl -X POST https://rpc.tenzro.network \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"method": "tenzro_providerStats",
"params": ["0xYourAddress..."],
"id": 1
}'
# Response:
# {
# "result": {
# "models_served": 2,
# "total_inferences": 15420,
# "total_earned": "154200000000000000000",
# "uptime_percent": 99.7,
# "avg_latency_ms": 245,
# "staked": "10000000000000000000000"
# }
# }
Model Endpoints
# List all model endpoints on the network
tenzro-cli model endpoints
# Get details for a specific endpoint
tenzro-cli model endpoint --model gemma3-270m
# Via RPC
curl -X POST https://rpc.tenzro.network \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"method": "tenzro_listModelEndpoints",
"params": [],
"id": 1
}'
SDK Usage
import { ProviderClient } from "@tenzro/sdk";
const provider = new ProviderClient({
rpcUrl: "https://rpc.tenzro.network",
walletKey: process.env.PROVIDER_KEY,
});
// Register and serve
await provider.register({ role: "model_provider", stake: "10000" });
await provider.serveModel({
modelId: "gemma3-270m",
endpoint: "http://localhost:8080/v1",
pricing: { perToken: "0.0001" },
});
// Set availability
await provider.setSchedule({
timezone: "UTC",
hours: { start: 0, end: 24 },
});
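On the application side, the tenzro_chat RPC shown in Step 4 can be called without the CLI. A minimal sketch using fetch (Node 18+), mirroring the curl example; the request payload is taken from the RPC example above, while the response handling assumes a standard JSON-RPC 2.0 result/error envelope:

```typescript
// Sketch: calling tenzro_chat over JSON-RPC from an app,
// mirroring the curl example in Step 4.
function buildChatRequest(content: string) {
  return {
    jsonrpc: "2.0",
    method: "tenzro_chat",
    params: [{
      model: "gemma3-270m",
      messages: [{ role: "user", content }],
      max_tokens: 200,
    }],
    id: 1,
  };
}

async function chat(content: string): Promise<unknown> {
  const res = await fetch("https://rpc.tenzro.network", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(content)),
  });
  // Assumes a standard JSON-RPC 2.0 response envelope.
  const { result, error } = (await res.json()) as {
    result?: unknown;
    error?: { message: string };
  };
  if (error) throw new Error(`tenzro_chat failed: ${error.message}`);
  return result;
}

chat("Hello!").then(console.log).catch(console.error);
```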