Model serving.

Run a provider. Register, serve, and earn TNZO per inference.

STATUS: Testnet
CRATE: tenzro-model
STABILITY: Stable
TYPE: Guide

Register

tenzro stake --role model-provider --amount 1000
tenzro provider register --models qwen3-0.6b,gemma3-medium

Serve

tenzro model serve qwen3-0.6b --device cpu
tenzro model serve gemma3-medium --device gpu --concurrency 4

LAN clustering

When a model is too large for one host, the node clusters automatically. It reads the GGUF header for layer count and hidden dimension, discovers LAN members from gossiped cluster announcements, splits the layers across them in proportion to each member's VRAM, and runs a layer-wise pipeline. No extra arguments are required: the node decides between single-host and cluster from the model shape and the reachable members.

# Auto: single host if it fits, cluster if it doesn't
tenzro model serve gemma3-large

# Force a split even when it fits one box (trades decode speed for memory)
tenzro model serve gemma3-large --cluster

# Never cluster; pin single-host
tenzro model serve gemma3-large --force-single

To preview the proposed layer split before serving, run tenzro cluster preview <id>. It shows the discovered members, the VRAM-weighted split, and any rejected members along with the reason (commit mismatch, unreachable data plane, insufficient VRAM).
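
As a rough illustration of what a VRAM-weighted split means, the sketch below assigns contiguous layer ranges in proportion to each member's VRAM. It is not the node's actual algorithm: the Member type, the split_layers function, and the largest-remainder rounding rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str        # hypothetical member label
    vram_gib: float  # usable VRAM the member reports


def split_layers(n_layers: int, members: list[Member]) -> dict[str, range]:
    """Assign contiguous layer ranges in proportion to each member's VRAM."""
    total = sum(m.vram_gib for m in members)
    if n_layers <= 0 or total <= 0:
        raise ValueError("need at least one layer and some VRAM")

    # Assumed rounding rule (largest remainder): floor each ideal share,
    # then give leftover layers to the largest fractional parts.
    ideal = [n_layers * m.vram_gib / total for m in members]
    counts = [int(x) for x in ideal]
    leftover = n_layers - sum(counts)
    by_fraction = sorted(
        range(len(members)),
        key=lambda i: ideal[i] - counts[i],
        reverse=True,
    )
    for i in by_fraction[:leftover]:
        counts[i] += 1

    # Contiguous ranges, so each member runs one stage of the pipeline.
    plan: dict[str, range] = {}
    start = 0
    for member, count in zip(members, counts):
        plan[member.name] = range(start, start + count)
        start += count
    return plan


plan = split_layers(48, [Member("a", 24), Member("b", 16), Member("c", 8)])
for name, layers in plan.items():
    print(name, layers.start, layers.stop)
```

Keeping each member's layers contiguous means activations cross the network once per stage boundary, which is the usual reason a layer-wise pipeline is split this way.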

Visibility

By default, a served model is announced to the network so any peer can route inference to it. To serve privately, pass --private: the model is registered locally without being gossiped and is reachable only over a direct or LAN connection.

tenzro model serve qwen3-0.6b --private

Health

A background health monitor polls every endpoint. Each success raises reputation by 1 and each failure lowers it by 5. The arithmetic saturates: the score never drops below 0 or rises above 1000. Reputation is persisted in CF_PROVIDERS.
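
The saturating update can be sketched as a clamped addition. The constants come from the rule above; the function and constant names are hypothetical and do not reflect the tenzro-model crate's API.

```python
REP_FLOOR, REP_CEILING = 0, 1000
SUCCESS_DELTA, FAILURE_DELTA = 1, -5


def update_reputation(score: int, success: bool) -> int:
    """Apply one result and clamp the score to [REP_FLOOR, REP_CEILING]."""
    delta = SUCCESS_DELTA if success else FAILURE_DELTA
    return max(REP_FLOOR, min(REP_CEILING, score + delta))


print(update_reputation(500, True))
print(update_reputation(500, False))
print(update_reputation(1000, True))   # already at the ceiling
print(update_reputation(3, False))     # clamps at the floor
```

Because a failure costs five times what a success earns, a provider needs five consecutive successes to recover from a single failure.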

Discovery

Every served model is listed at GET /v1/models with per-token pricing (in wei), context length, max output tokens, feature flags (streaming, usage-in-stream, MTP, provenance signing), and the provider's declared datacenter geography. All of these fields are derived from registry and gossip-announcement state; no separate listing configuration is needed. The response also carries a data_policy object declaring retention behavior: prompt and completion bodies are never written to disk.
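
A client might use a listing entry to bound a request's cost before sending it. The sketch below is only an illustration: the field names (price_per_token_wei, context_length, max_output_tokens) and every value are placeholders, not the real response schema, and a single flat per-token price is assumed.

```python
# Hypothetical listing entry. Field names and values are placeholders,
# not the actual /v1/models schema.
entry = {
    "id": "example-model",
    "price_per_token_wei": 2_000_000_000,
    "context_length": 8_192,
    "max_output_tokens": 1_024,
}


def estimate_cost_wei(entry: dict, prompt_tokens: int, output_tokens: int) -> int:
    """Upper-bound cost in wei, after checking the listed limits."""
    if output_tokens > entry["max_output_tokens"]:
        raise ValueError("requested output exceeds max_output_tokens")
    if prompt_tokens + output_tokens > entry["context_length"]:
        raise ValueError("prompt plus output exceeds context_length")
    # Wei amounts are integers; stay in int arithmetic to avoid rounding.
    return (prompt_tokens + output_tokens) * entry["price_per_token_wei"]


print(estimate_cost_wei(entry, prompt_tokens=1_000, output_tokens=500))
```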

Schedule

tenzro schedule set --start 09:00 --end 21:00 --tz UTC
tenzro schedule show
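
The schedule above presumably defines a daily serving window in the given timezone. A minimal sketch of such a window check follows, assuming the start is inclusive, the end is exclusive, and a window whose end precedes its start wraps past midnight. None of these semantics are confirmed by this guide, and in_window is a hypothetical helper.

```python
from datetime import datetime, time, timezone, tzinfo


def in_window(now: datetime, start: time, end: time,
              tz: tzinfo = timezone.utc) -> bool:
    """True if `now` falls inside the daily [start, end) window in `tz`.

    `now` should be timezone-aware; a naive datetime would be interpreted
    as system local time by astimezone().
    """
    local = now.astimezone(tz).time()
    if start <= end:
        return start <= local < end
    # Window wraps past midnight, for example 21:00 to 09:00.
    return local >= start or local < end


noon = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(in_window(noon, time(9, 0), time(21, 0)))
```

For a named zone rather than UTC, a zoneinfo.ZoneInfo instance can be passed as tz.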
