
Tenzro Execution Layer

The runtime that knows how to run a model, not just which one to run

Tenzro's execution runtime serves AI models across a distributed provider network. Providers advertise what they can actually execute. The scheduler matches requests to the right provider, the right execution mode, and the right KV policy — based on the model's topology, the provider's hardware, the available artifacts, and the request's latency and trust constraints.

How it works

Every model in the Tenzro network has a manifest — a machine-readable description of its topology, the execution modes it supports, the artifacts that exist for it, and the constraints that apply when scheduling it. Every provider declares what its hardware can run and which execution modes it supports.

When a request arrives, the Capability Resolver checks what is actually available: which modes the model supports, which of those modes have healthy providers with the right artifacts, and which modes the request policy allows. The Execution Planner then produces a concrete plan — execution mode, coordinator, workers, KV profile, and fallback path.

The session is then executed by the appropriate runtime engine. Session ownership — the decode loop, KV state, sampling, output assembly — stays anchored to the coordinator. Execution work can be delegated: full remote execution, streamed weight loading, expert dispatch for MoE models.
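The ownership split above can be sketched as a coordinator-anchored session object. This is a minimal illustration, not the Tenzro API: the class, field, and method names are invented, and the delegated dispatch is a stand-in for remote execution, streamed loading, or expert calls.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """Coordinator-owned session state: the decode loop, KV state,
    sampling, and output assembly never leave the coordinator."""
    session_id: str
    coordinator_id: str
    kv_state: dict = field(default_factory=dict)       # anchored to the coordinator
    output_tokens: list = field(default_factory=list)

    def decode_step(self, dispatch):
        """One decode iteration. The compute may be delegated (e.g. an
        expert call on a remote worker), but sampling and output
        assembly happen here, on the coordinator."""
        logits = dispatch()                      # delegated work, local or remote
        token = max(logits, key=logits.get)      # greedy sampling, for illustration
        self.output_tokens.append(token)
        return token

# A lambda stands in for a delegated expert-dispatch call.
session = Session("sess-123", "provider-a")
token = session.decode_step(lambda: {"hello": 0.9, "world": 0.1})
```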

Execution modes

The runtime supports seven execution modes. The scheduler selects the best available mode based on the model class, available providers, and request policy. Providers only advertise modes they can actually satisfy.

LOCAL_FULL

Local full execution

The model runs in full on the local device. Lowest latency, no network dependency in the execution path. The scheduler prefers this mode when the local provider has sufficient capacity.

REMOTE_FULL

Remote full execution

The full model runs on a remote provider. The provider must hold the complete model weights and have the hardware capacity to serve it. Standard path for most network inference requests.

LOCAL_STREAMED

Local streamed execution

Weights are loaded in stages from disk rather than kept fully resident in memory. Allows a local device to serve a model larger than its available VRAM. Carries degraded latency compared to full-resident execution and is surfaced as such in the UI and receipts.

REMOTE_STREAMED

Remote streamed execution

A remote provider serves the model via staged weight loading. Used when no full-capacity provider is available. The runtime marks this as a degraded-latency path and will not schedule it for interactive traffic unless no better option exists.

HYBRID_MOE_LOCAL

Hybrid MoE — local

The coordinator owns the session. Expert execution is dispatched to workers on the same machine or in the same tightly coupled environment. Used for MoE models where expert shards are available locally and latency constraints are tight.

HYBRID_MOE_REGIONAL

Hybrid MoE — regional

The coordinator owns the session. Expert workers are on separate providers within a low-latency regional boundary. The preferred mode for frontier MoE models on interactive requests. The scheduler enforces a hard regional latency bound — wide-area expert routing is not allowed on interactive paths.

AUTO

Auto

The runtime selects the best available mode based on the model class, available providers, artifact completeness, and request policy. Requesters can specify latency and trust constraints; the planner resolves the rest. If the model manifest requires at least one adaptive mode and none is available, the request fails with an explicit reason rather than silently falling back to an incompatible path.

Model classes

The scheduler treats different model classes differently. A small dense model and a frontier MoE model have different realistic serving paths — the runtime encodes that distinction rather than pretending every model can be served the same way.

Class | Scheduling priority | Default KV policy
SMALL_DENSE / MID_DENSE | LOCAL_FULL → REMOTE_FULL → LOCAL_STREAMED → REMOTE_STREAMED | kv_raw
LARGE_DENSE | LOCAL_FULL → REMOTE_FULL → LOCAL_STREAMED → REMOTE_STREAMED | kv_raw or kv_int8
SMALL_MOE / LARGE_MOE | LOCAL_FULL → HYBRID_MOE_LOCAL → HYBRID_MOE_REGIONAL → REMOTE_FULL → REMOTE_STREAMED | kv_int8
FRONTIER_MOE | HYBRID_MOE_REGIONAL → REMOTE_FULL (high-capacity only) → REMOTE_STREAMED | kv_int8 (enforced)

For frontier MoE models, the runtime enforces compressed KV by default and rejects providers that cannot satisfy at least one adaptive execution mode. Advertising realistic serving capacity is a hard constraint — incapable providers cannot claim support they do not have.
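The priority walk and the frontier-MoE provider check can be sketched as follows. The orderings come from the table above; the function names and the use of plain string identifiers for modes are assumptions for illustration.

```python
# Per-class scheduling priority, mirroring the table above.
PRIORITY = {
    "SMALL_DENSE":  ["LOCAL_FULL", "REMOTE_FULL", "LOCAL_STREAMED", "REMOTE_STREAMED"],
    "MID_DENSE":    ["LOCAL_FULL", "REMOTE_FULL", "LOCAL_STREAMED", "REMOTE_STREAMED"],
    "LARGE_DENSE":  ["LOCAL_FULL", "REMOTE_FULL", "LOCAL_STREAMED", "REMOTE_STREAMED"],
    "SMALL_MOE":    ["LOCAL_FULL", "HYBRID_MOE_LOCAL", "HYBRID_MOE_REGIONAL",
                     "REMOTE_FULL", "REMOTE_STREAMED"],
    "LARGE_MOE":    ["LOCAL_FULL", "HYBRID_MOE_LOCAL", "HYBRID_MOE_REGIONAL",
                     "REMOTE_FULL", "REMOTE_STREAMED"],
    "FRONTIER_MOE": ["HYBRID_MOE_REGIONAL", "REMOTE_FULL", "REMOTE_STREAMED"],
}
ADAPTIVE_MODES = {"LOCAL_STREAMED", "REMOTE_STREAMED",
                  "HYBRID_MOE_LOCAL", "HYBRID_MOE_REGIONAL"}

def select_mode(model_class, available_modes):
    """Take the first mode in the class's priority list that is actually
    available; fail loudly rather than falling back silently."""
    for mode in PRIORITY[model_class]:
        if mode in available_modes:
            return mode
    raise RuntimeError(f"no available execution mode for {model_class}")

def validate_frontier_provider(advertised_modes):
    """Frontier MoE providers must satisfy at least one adaptive mode."""
    if not set(advertised_modes) & ADAPTIVE_MODES:
        raise ValueError("provider advertises no adaptive execution mode")
```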

Artifact model

Execution mode availability is tied to artifact availability. A model is not considered network-ready for a given mode unless the required artifacts exist on healthy providers. The model manifest declares its artifact completeness level.

Artifact types

  • full_weights: Complete model weights for full-residency execution
  • streaming_shard: Weight layout optimized for staged loading
  • expert_shard: Individual expert weights for MoE dispatch
  • kv_profile: KV cache configuration for compressed state management
  • tokenizer: Tokenizer bundle
  • backend_bundle: Runtime-specific execution backend
  • quant_variant: Quantized weight variant

Artifact completeness

  • FULL_ONLY
    Only full weights exist. Supports full execution only.
  • STREAMING_READY
    Full weights plus streaming-compatible layout. Enables streamed execution paths.
  • MOE_READY
    Full weights plus expert-aware artifacts. Enables MoE execution paths.
  • FULLY_ADAPTIVE
    Full, streaming, and MoE-aware artifacts. All execution modes available.
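The mapping from completeness tier to unlocked execution modes can be written down directly. This is a restatement of the list above as a lookup table; the constant and function names are illustrative.

```python
# Execution modes unlocked by each artifact-completeness tier.
FULL_MODES   = {"LOCAL_FULL", "REMOTE_FULL"}
STREAM_MODES = {"LOCAL_STREAMED", "REMOTE_STREAMED"}
MOE_MODES    = {"HYBRID_MOE_LOCAL", "HYBRID_MOE_REGIONAL"}

COMPLETENESS_MODES = {
    "FULL_ONLY":       FULL_MODES,
    "STREAMING_READY": FULL_MODES | STREAM_MODES,
    "MOE_READY":       FULL_MODES | MOE_MODES,
    "FULLY_ADAPTIVE":  FULL_MODES | STREAM_MODES | MOE_MODES,
}

def network_ready_modes(completeness_level):
    """A model is only network-ready for the modes its artifacts unlock."""
    return COMPLETENESS_MODES[completeness_level]
```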

KV state management

Every session has an explicit KV profile. The KV State Manager handles allocation, compression, paging, and snapshotting — giving the runtime control over memory pressure and long-context behavior.

kv_raw

Uncompressed KV. Lowest overhead, highest memory footprint.

kv_int8

Quantized int8 KV cache. Default for MoE and frontier models. Supports snapshot and restore.

kv_mixed

Mixed precision KV. Balances accuracy and memory pressure.

kv_paged

Paged KV allocation. Preferred for long-context workloads. Hot/cold paging of KV blocks.
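One plausible default-selection rule, combining the profiles above with the per-class defaults from the model-class table: frontier MoE is pinned to compressed KV, long contexts prefer paged allocation, and small dense models keep raw KV. The long-context threshold below is purely an assumption; the document does not specify one.

```python
def choose_kv_profile(model_class, context_tokens, long_context_threshold=32_000):
    """Illustrative KV-profile defaults; the threshold is an assumption."""
    if model_class == "FRONTIER_MOE":
        return "kv_int8"          # enforced per the model-class table
    if context_tokens >= long_context_threshold:
        return "kv_paged"         # hot/cold paging for long-context workloads
    if "MOE" in model_class:
        return "kv_int8"          # default for MoE models
    return "kv_raw"               # lowest overhead for dense models
```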

Capability resolution and execution planning

Capability Resolver

Before a plan is created, the Capability Resolver determines what is actually available. It evaluates three layers:

  • Static support — does the model manifest declare support for this mode?
  • Dynamic availability — are the required artifacts on healthy providers right now?
  • Policy eligibility — does the request allow this mode given its latency and trust constraints?

The resolver returns supported modes, available modes, and disqualified modes with explicit reasons. Nothing is silently dropped.
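The three-layer filter can be sketched as a single pass over the manifest's modes, attaching an explicit reason to anything disqualified. Field names and reason strings are illustrative, not the Tenzro wire format.

```python
def resolve_capabilities(manifest_modes, artifact_ready_modes, policy_allowed_modes):
    """Filter manifest-supported modes through dynamic availability and
    request policy; disqualified modes keep an explicit reason so that
    nothing is silently dropped."""
    available, disqualified = [], {}
    for mode in manifest_modes:
        if mode not in artifact_ready_modes:
            disqualified[mode] = "no healthy provider holds the required artifacts"
        elif mode not in policy_allowed_modes:
            disqualified[mode] = "excluded by request latency/trust policy"
        else:
            available.append(mode)
    return {
        "supported": list(manifest_modes),
        "available": available,
        "disqualified": disqualified,
    }
```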

Execution Planner

The Execution Planner takes the capability resolution and produces a concrete execution plan:

  • Which execution mode to use
  • Which provider is the coordinator (session owner)
  • Which workers are assigned, with roles and artifact bindings
  • Which KV profile to use
  • Fallback path if a provider drops mid-session

Example execution plan — frontier MoE model

{
  "session_id": "sess-123",
  "model_id": "minimax-m2.5",
  "mode": "HYBRID_MOE_REGIONAL",
  "coordinator_provider_id": "provider-a",
  "workers": [
    { "provider_id": "provider-a", "role": "EXPERT_WORKER", "artifact_ids": ["artifact-expert-07"] },
    { "provider_id": "provider-b", "role": "EXPERT_WORKER", "artifact_ids": ["artifact-expert-19"] }
  ],
  "kv_profile_id": "kv_int8",
  "fallback": {
    "on_provider_loss": "REMOTE_STREAMED",
    "on_expert_timeout": "REMOTE_FULL"
  }
}

Execution receipts

Every session emits a receipt. Receipts carry the execution mode used, token counts, provider, expert call counts, streamed layer loads, KV profile, KV memory time, and attestation reference if the provider was TEE-attested. This makes metering mode-aware — a streamed execution session and a full-residency session are billed and verified differently.

{
  "receipt_id": "rcpt-1",
  "session_id": "sess-123",
  "model_id": "minimax-m2.5",
  "provider_id": "provider-a",
  "mode": "HYBRID_MOE_REGIONAL",
  "prompt_tokens": 812,
  "completion_tokens": 284,
  "expert_calls": 2272,
  "streamed_layer_loads": 0,
  "kv_profile_id": "kv_int8",
  "kv_memory_ms": 48842,
  "attested": true,
  "attestation_ref": "tee:amd-sev-snp:xyz",
  "output_hash": "sha256:def456..."
}

Provider roles

Providers declare which worker roles they support. A single provider can serve multiple roles. The planner assigns roles per-session based on capability declarations and artifact availability.

FULL_WORKER

Executes the complete model in full residency. Required for LOCAL_FULL and REMOTE_FULL modes.

STREAMING_WORKER

Executes via staged weight loading. Required for LOCAL_STREAMED and REMOTE_STREAMED modes.

EXPERT_WORKER

Executes specific expert shards for MoE models. Required for HYBRID_MOE_LOCAL and HYBRID_MOE_REGIONAL modes.

PREFILL_WORKER

Handles prefill-heavy jobs that can be offloaded from the coordinator to reduce session setup latency.
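The mode-to-role requirements above reduce to a small lookup the planner can consult when assigning a provider. PREFILL_WORKER is omitted because it is an offload optimization rather than a mode requirement; the names below restate the role descriptions, while the function itself is an illustrative sketch.

```python
# Role required to serve each execution mode, per the descriptions above.
MODE_REQUIRED_ROLE = {
    "LOCAL_FULL":          "FULL_WORKER",
    "REMOTE_FULL":         "FULL_WORKER",
    "LOCAL_STREAMED":      "STREAMING_WORKER",
    "REMOTE_STREAMED":     "STREAMING_WORKER",
    "HYBRID_MOE_LOCAL":    "EXPERT_WORKER",
    "HYBRID_MOE_REGIONAL": "EXPERT_WORKER",
}

def can_serve(provider_roles, mode):
    """A provider qualifies for a mode only if it declared the role
    that mode requires."""
    return MODE_REQUIRED_ROLE[mode] in provider_roles
```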

API

The runtime exposes endpoints for model capability inspection, availability resolution, execution planning, and receipt submission.

Method | Endpoint | Description
GET | /v1/models/:id/capabilities | Supported modes, artifact completeness, default KV profile
GET | /v1/models/:id/artifacts | Artifact list with health and locality
POST | /v1/models/:id/availability:resolve | Available modes given latency and trust policy
POST | /v1/providers/:id/capabilities | Register provider runtime capabilities and hardware profile
POST | /v1/execution-plans | Create an execution plan for a session
POST | /v1/receipts | Submit an execution receipt for metering and settlement
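As a sketch of calling the availability endpoint, the helper below builds the POST request with Python's standard library. The body fields `latency` and `trust` are assumptions inferred from the CLI flags; the actual request schema may differ.

```python
import json
from urllib import request

def build_resolve_request(base_url, model_id, latency, trust):
    """Build a POST to /v1/models/:id/availability:resolve. Body fields
    are assumed from the CLI's --latency and --trust flags."""
    body = json.dumps({"latency": latency, "trust": trust}).encode()
    return request.Request(
        f"{base_url}/v1/models/{model_id}/availability:resolve",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending is then a one-liner against a live node:
# with request.urlopen(build_resolve_request(...)) as resp:
#     modes = json.load(resp)
```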

CLI

The Tenzro CLI surfaces execution mode throughout the model lifecycle — from inspection to download to serving to inference.

# Inspect a model's execution capabilities
tenzro-cli model inspect --model minimax-m2.5

# Check live network availability by mode
tenzro-cli model availability --model minimax-m2.5 --latency interactive --trust attested-only

# Download with specific artifact types
tenzro-cli model download --model minimax-m2.5 --artifacts full,streaming,experts

# Serve with explicit execution modes
tenzro-cli model serve --model minimax-m2.5 --modes remote-full,remote-streamed,hybrid-moe

# Request inference with mode and policy
tenzro-cli inference request \
  --model minimax-m2.5 \
  --mode auto \
  --latency interactive \
  --trust attested-only

Build on the Tenzro runtime

Serve models, register provider capabilities, and route inference with full visibility into execution mode, artifact availability, and session state.