Model Serving

Tenzro enables any node to serve AI models and earn TNZO for inference. Models are downloaded from HuggingFace Hub in GGUF format and served locally. Providers register their model endpoints on-chain, set per-token pricing, and the InferenceRouter directs user requests to the best available provider based on latency, price, or reputation.

Provider Flow

Earning flow: a user sends an inference request, the Router selects your node, your model generates a response, the user pays per token, and TNZO settles to your wallet via a micropayment channel.
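As a rough sketch of the per-token settlement math: the helper below is illustrative, not part of the SDK, and it assumes prices are quoted in the smallest TNZO unit with 18 decimal places (consistent with the base-unit balances shown in the provider stats later on).

```typescript
// Hypothetical helper: compute what a single request settles for.
// At 0.0001 TNZO/token with 18 decimals, one token = 10^14 base units.
const PER_TOKEN_BASE_UNITS = 100_000_000_000_000n; // 0.0001 TNZO

function settlementFor(tokensGenerated: bigint): bigint {
  // BigInt arithmetic avoids floating-point rounding on chain amounts.
  return tokensGenerated * PER_TOKEN_BASE_UNITS;
}

// A 200-token response at 0.0001 TNZO/token settles 0.02 TNZO:
const owed = settlementFor(200n); // 20_000_000_000_000_000n base units
```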

Step 1: Download a Model

The CLI downloads GGUF models from HuggingFace Hub with SHA-256 integrity verification via the hf-hub crate:

# Download a model from HuggingFace
tenzro-cli model download unsloth/gemma-3-270m-it-GGUF

# Download a specific quantization
tenzro-cli model download TheBloke/Mistral-7B-Instruct-v0.2-GGUF \
  --filename mistral-7b-instruct-v0.2.Q4_K_M.gguf

# List downloaded models
tenzro-cli model list --local

# Check model integrity (SHA-256 verification)
tenzro-cli model info unsloth/gemma-3-270m-it-GGUF

Step 2: Serve the Model

# Serve a model locally
tenzro-cli model serve unsloth/gemma-3-270m-it-GGUF \
  --port 8080 \
  --ctx-size 4096 \
  --gpu-layers 35

# The serve command:
# 1. Starts local inference server with the GGUF file
# 2. Calls tenzro_serveModel RPC to register the endpoint
# 3. Exposes OpenAI-compatible API at http://localhost:8080/v1

# Serve on the network (registers endpoint on-chain)
tenzro-cli model serve unsloth/gemma-3-270m-it-GGUF \
  --remote  # Registers endpoint via tenzro_serveModel RPC

# Stop serving
tenzro-cli model stop unsloth/gemma-3-270m-it-GGUF

Step 3: Register as Provider

# Register as a model provider (requires staking)
tenzro-cli provider register --role model-provider

# Stake TNZO (required for provider registration)
tenzro-cli stake deposit --amount 1000 --role model-provider

# Set pricing for your model
tenzro-cli provider pricing set \
  --model gemma3-270m \
  --per-token 0.0001  # TNZO per token

# Show current pricing
tenzro-cli provider pricing show

# Set availability schedule
tenzro-cli provider schedule set \
  --timezone UTC \
  --hours 0-24  # 24/7 availability

# Check provider status
tenzro-cli provider status

Step 4: Chat and Earn

# Users can now chat with your model
tenzro-cli chat --model gemma3-270m

# The chat command:
# 1. Tries local inference server first
# 2. Falls back to tenzro_chat RPC (routes to your provider)
# 3. Interactive REPL with /history and /load session management

# Or via RPC (for apps)
curl -X POST https://rpc.tenzro.network \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tenzro_chat",
    "params": [{
      "model": "gemma3-270m",
      "messages": [
        {"role": "user", "content": "Hello!"}
      ],
      "max_tokens": 200
    }],
    "id": 1
  }'

Inference Routing Strategies

The InferenceRouter selects the best provider for each request using configurable strategies:

Strategy     Description
Price        Route to the cheapest provider
Latency      Route to the fastest provider (lowest measured latency)
Reputation   Route to the highest-rated provider
Weighted     Balanced routing across all factors
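The Weighted strategy can be pictured as a scoring function over the three factors. The weights and normalization below are illustrative assumptions, not the InferenceRouter's actual formula:

```typescript
interface ProviderStats {
  pricePerToken: number; // TNZO per token
  latencyMs: number;     // measured round-trip latency
  reputation: number;    // 0..1, higher is better
}

// Hypothetical weighted score: cheaper and faster providers score
// higher, as do better-reputed ones. Weights are example values.
function weightedScore(p: ProviderStats, maxPrice: number, maxLatency: number): number {
  const priceScore = 1 - p.pricePerToken / maxPrice;
  const latencyScore = 1 - p.latencyMs / maxLatency;
  return 0.4 * priceScore + 0.4 * latencyScore + 0.2 * p.reputation;
}

// Route to the highest-scoring provider:
function pickProvider(providers: ProviderStats[]): ProviderStats {
  const maxPrice = Math.max(...providers.map(p => p.pricePerToken));
  const maxLatency = Math.max(...providers.map(p => p.latencyMs));
  return providers.reduce((best, p) =>
    weightedScore(p, maxPrice, maxLatency) > weightedScore(best, maxPrice, maxLatency)
      ? p
      : best);
}
```

Setting a weight to 1 and the others to 0 recovers the single-factor strategies (Price, Latency, Reputation) as special cases.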

Provider Health Monitoring

The ProviderManager runs background health checks on all registered providers. Providers that fail health checks are temporarily removed from the routing pool (circuit breaker pattern). Health metrics include response time, error rate, and availability:

# Check your provider stats
curl -X POST https://rpc.tenzro.network \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tenzro_providerStats",
    "params": ["0xYourAddress..."],
    "id": 1
  }'

# Response:
# {
#   "result": {
#     "models_served": 2,
#     "total_inferences": 15420,
#     "total_earned": "154200000000000000000",
#     "uptime_percent": 99.7,
#     "avg_latency_ms": 245,
#     "staked": "10000000000000000000000"
#   }
# }
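The circuit-breaker behavior described above can be sketched as follows; the threshold and reset logic are illustrative assumptions, not the ProviderManager's actual implementation:

```typescript
// Minimal circuit breaker: a provider that fails N consecutive health
// checks is removed from the routing pool until a probe succeeds.
class CircuitBreaker {
  private consecutiveFailures = 0;
  constructor(private readonly threshold: number = 3) {}

  recordSuccess(): void {
    this.consecutiveFailures = 0; // healthy again: close the circuit
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }

  // "Open" circuit = provider temporarily excluded from routing.
  isOpen(): boolean {
    return this.consecutiveFailures >= this.threshold;
  }
}

const breaker = new CircuitBreaker(3);
breaker.recordFailure();
breaker.recordFailure();
breaker.recordFailure(); // third consecutive failure opens the circuit
// breaker.isOpen() is now true; a later recordSuccess() closes it again
```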

Model Endpoints

# List all model endpoints on the network
tenzro-cli model endpoints

# Get details for a specific endpoint
tenzro-cli model endpoint --model gemma3-270m

# Via RPC
curl -X POST https://rpc.tenzro.network \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tenzro_listModelEndpoints",
    "params": [],
    "id": 1
  }'

SDK Usage

import { ProviderClient } from "@tenzro/sdk";

const provider = new ProviderClient({
  rpcUrl: "https://rpc.tenzro.network",
  walletKey: process.env.PROVIDER_KEY,
});

// Register and serve
await provider.register({ role: "model_provider", stake: "10000" });
await provider.serveModel({
  modelId: "gemma3-270m",
  endpoint: "http://localhost:8080/v1",
  pricing: { perToken: "0.0001" },
});

// Set availability
await provider.setSchedule({
  timezone: "UTC",
  hours: { start: 0, end: 24 },
});

Related Documentation

Models — Available models and registry
Inference — Making inference requests
Streaming Inference — Real-time token streaming
Micropayments — Per-token billing channels
MicroNode — Getting started as a provider