Tenzro
Tutorial — Run a node

Serve a model on Tenzro

Pull a model down from Hugging Face, expose it through the local llama.cpp runtime, then register the endpoint in the on-network catalog so others can route inference to you.
Level
Intermediate
Time
~25 min
Prerequisites
tenzro-node running locally, ~10 GB disk
Stack
CLI
01

Download a model from Hugging Face

The CLI uses hf-hub under the hood and verifies SHA-256 integrity.

tenzro model download qwen-3-0.6b
02

Start serving locally

The local runtime exposes an OpenAI-compatible HTTP surface for development.

tenzro model serve qwen-3-0.6b --listen 127.0.0.1:11434
03

Register as a provider

Bond a small stake, then publish the model endpoint, price, and SLA to the catalog.

tenzro provider register
tenzro model endpoint publish \
  --model qwen-3-0.6b \
  --price-per-token 0.0000025
04

Verify routing picks you up

Once the registration confirms, the inference router will start placing requests with you.

tenzro inference request qwen3-0.6b "what is the chain id?"
Related
← All tutorials