Tutorial — Run a node
Serve a model on Tenzro
Pull a model down from Hugging Face, expose it through the local llama.cpp runtime, then register the endpoint in the on-network catalog so others can route inference to you.
- Level
- Intermediate
- Time
- ~25 min
- Prerequisites
- tenzro-node running locally, ~10 GB disk
- Stack
- CLI
01
Download a model from Hugging Face
The CLI uses hf-hub under the hood and verifies SHA-256 integrity.
tenzro model download qwen-3-0.6b02
Start serving locally
The local runtime exposes an OpenAI-compatible HTTP surface for development.
tenzro model serve qwen-3-0.6b --listen 127.0.0.1:1143403
Register as a provider
Bond a small stake, then publish the model endpoint, price, and SLA to the catalog.
tenzro provider register
tenzro model endpoint publish \
--model qwen-3-0.6b \
--price-per-token 0.000002504
Verify routing picks you up
Once the registration confirms, the inference router will start placing requests with you.
tenzro inference request qwen3-0.6b "what is the chain id?"Related