
Run a Model Provider Node and Serve gemma3-270m

Workflow · Intermediate · 30 min

This tutorial walks through the complete lifecycle of running a Tenzro node as a model provider: starting the Docker container, joining the testnet gossipsub mesh, downloading a small open-weights model (gemma3-270m), serving it on the network, and performing a real chat inference call. Every command below is executed against the live testnet — no mocks.

What you'll need
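
- Docker installed on a host with at least 4 GB of RAM and roughly 1 GB of free disk for the gemma3-270m weights
- curl and jq for the JSON-RPC calls below
- The IP address of a live testnet boot node (see step 1)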

1. Start the node as a model provider

Model provider nodes run the same binary as validators but with --role model-provider. The node joins the P2P mesh on port 9000 and exposes JSON-RPC on port 8545 and a Web API on port 8080. State is persisted in the volume mounted at /data, so you can restart the container without losing identity, wallet, or downloaded models.

docker run -d \
  --name tenzro-provider \
  --restart unless-stopped \
  -p 8545:8545 \
  -p 8080:8080 \
  -p 9000:9000 \
  -v /var/lib/tenzro:/data \
  us-central1-docker.pkg.dev/tenzro-infra/tenzro/tenzro-node:latest \
  --role model-provider \
  --data-dir /data \
  --listen-addr /ip4/0.0.0.0/tcp/9000 \
  --rpc-addr 0.0.0.0:8545 \
  --boot-nodes /ip4/BOOT_NODE_IP/tcp/9000

Replace BOOT_NODE_IP with a live peer — for the public testnet, the RPC node at rpc.tenzro.network exposes port 9000 on its internal multiaddr. A node with no boot nodes will start but never sync.
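
If peer_count stays at zero, check the container logs for dial errors (docker logs is standard Docker; the container name comes from the run command above):

# Tail the node logs to diagnose peer discovery
docker logs -f tenzro-provider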

2. Verify the node is healthy and synced

The Web API exposes a /api/status endpoint that reports the current role, block height, peer count, and uptime. Wait until peer_count is at least 1 and health is healthy.

# Wait for the node to sync and join the gossipsub mesh
curl -s http://localhost:8080/api/status | jq

# Expected output:
# {
#   "node_state": "running",
#   "role": "model-provider",
#   "health": "healthy",
#   "block_height": 12345,
#   "peer_count": 4,
#   "uptime_secs": 60
# }
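
If you're scripting the setup, a small poll loop (in the same spirit as the download poll in step 4) keeps you from racing ahead of sync. The field names match the response above:

# Block until the node reports at least one peer and a healthy status
until curl -s http://localhost:8080/api/status \
  | jq -e '.peer_count >= 1 and .health == "healthy"' >/dev/null; do
  echo "Waiting for peers..."
  sleep 5
done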

3. List available models

The tenzro_listModels RPC returns every model registered in the on-chain catalog, including fields like model_id, category, modality, downloaded, and serving. At first, your node will show downloaded: false for every model.

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tenzro_listModels",
    "params": {}
  }' | jq

4. Download gemma3-270m

Kick off a background download from HuggingFace Hub. gemma3-270m is about 550 MB of weights — small enough to run on 4 GB of RAM, and ideal for verifying the full serving pipeline without spending minutes on a multi-GB download.

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tenzro_downloadModel",
    "params": {
      "model_id": "gemma3-270m"
    }
  }'

The download runs asynchronously. Poll tenzro_listModels until downloaded: true:

# Poll until downloaded: true
while true; do
  STATUS=$(curl -s http://localhost:8545 \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{
      "jsonrpc":"2.0",
      "id":3,
      "method":"tenzro_listModels",
      "params":{}
    }' | jq -r '.result[] | select(.model_id=="gemma3-270m") | .downloaded')

  echo "Downloaded: $STATUS"
  [ "$STATUS" = "true" ] && break
  sleep 5
done

The node verifies the SHA-256 hash of every file after download. A corrupted download will fail integrity verification and be discarded automatically.
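
The check is automatic, but you can also spot-check the files from the host. A minimal sketch, assuming the weights land under the mounted volume at /var/lib/tenzro (the on-disk layout is an assumption; adjust the path to match your node):

# Hash downloaded files under the data volume (path layout is an assumption)
find /var/lib/tenzro -type f -path '*gemma3-270m*' -exec sha256sum {} \;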

5. Serve the model

Once downloaded, call tenzro_serveModel to load the weights into memory and publish a serving endpoint to the gossipsub mesh. Other peers on the network will see your endpoint within one gossip round (~1–3 seconds).

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 4,
    "method": "tenzro_serveModel",
    "params": {
      "model_id": "gemma3-270m",
      "max_concurrent": 4
    }
  }' | jq
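
Serving state is reflected in the same catalog you queried in step 3, so you can confirm the flag flipped (the result shape is assumed to match the polling loop in step 4):

# Confirm the catalog now shows serving: true
curl -s http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tenzro_listModels","params":{}}' \
  | jq '.result[] | select(.model_id=="gemma3-270m") | {downloaded, serving}'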

6. Confirm the endpoint is discoverable

Every node (including remote ones) can now see your endpoint via tenzro_listModelEndpoints. This is how autonomous agents discover who is serving what model.

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 5,
    "method": "tenzro_listModelEndpoints",
    "params": {}
  }' | jq

7. Chat with the model

Call tenzro_chat with a model_id and a single message string. The node routes the request to the local serving instance (or to a remote provider if you don't serve the model yourself). For OpenAI-style multi-turn messages arrays, use the MCP chat_completion tool or the /v1/chat/completions REST endpoint instead; a sketch of the REST form follows the sample response below.

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 6,
    "method": "tenzro_chat",
    "params": {
      "model_id": "gemma3-270m",
      "message": "What is 2+2?",
      "max_tokens": 64
    }
  }' | jq

A typical response:

{
  "jsonrpc": "2.0",
  "id": 6,
  "result": {
    "model_id": "gemma3-270m",
    "choices": [
      {
        "message": {
          "role": "assistant",
          "content": "2+2 equals 4."
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 8,
      "completion_tokens": 6,
      "total_tokens": 14
    }
  }
}
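
For multi-turn conversations, the /v1/chat/completions REST endpoint mentioned above accepts an OpenAI-style messages array. A minimal sketch, assuming the endpoint is served by the node's Web API on port 8080 and follows the usual OpenAI request shape (both are assumptions; the path comes from the intro to this step):

# OpenAI-compatible multi-turn request (host/port and body shape assumed)
curl http://localhost:8080/v1/chat/completions \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma3-270m",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "max_tokens": 64
  }' | jq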

8. Stop serving (optional)

To unload the model and stop advertising the endpoint:

curl http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tenzro_stopModel",
    "params": {
      "model_id": "gemma3-270m"
    }
  }'
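
You can confirm the endpoint was withdrawn by re-running the discovery call from step 6; gemma3-270m should no longer appear (the jq filter assumes the same result shape as before):

# The filtered output should now be empty
curl -s http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tenzro_listModelEndpoints","params":{}}' \
  | jq '.result[]? | select(.model_id=="gemma3-270m")'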

Beyond chat: multi-modal serving

The same node can serve six other modalities alongside chat, each with its own runtime, catalog, RPC method, and CLI command: forecast, vision, text embedding, segmentation, detection, and audio. Pick the per-modality tutorial that matches your workload.

The InferenceRouter reads each model's modality field from the registry and dispatches to the correct runtime: tenzro_chat stays LLM-only, while the per-modality methods route through the forecast, vision, text-embedding, segmentation, detection, and audio runtimes.
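
To see which modalities are actually present in your node's catalog, you can project the modality field out of the step-3 listing (result shape assumed to match the earlier polling loop):

# List the distinct modalities in the on-chain catalog
curl -s http://localhost:8545 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tenzro_listModels","params":{}}' \
  | jq '[.result[].modality] | unique'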

What's next

You now have a working model provider node. The next tutorials build on this foundation — adding multi-node provider networks, registering as a paid provider, and composing agents that discover and call your endpoint autonomously.