# Tenzro Train
Tenzro Train is a protocol for training foundation models — language, timeseries, vision, and multimodal — across a permissionless network of independently operated compute providers. It adapts Decoupled DiLoCo to a trustless setting using attested execution, Byzantine-robust gradient aggregation, and on-chain settlement to produce verifiable training receipts.
## Architecture: Rust protocol + Python reference trainer
Tenzro Train splits cleanly into two layers. The Rust protocol layer (`tenzro-training` crate plus types in `tenzro-types`) owns coordination, aggregation, settlement, and verification. The Python reference trainer (`integrations/trainer/`, PyTorch FSDP2 + Hivemind + safetensors) owns the inner training loop. Communication between the two layers is JSON-RPC plus the gossip topics defined below.
No tensor library lives in the Rust workspace. The Rust crate may compute aggregation over already-decoded tensors via `ndarray`, but it does not own training compute. This mirrors how Prime Intellect, Nous Research, and OpenDiLoCo split their stacks: PyTorch for the inner loop, a typed protocol crate for orchestration.
## Trust tiers
The sponsor selects a trust tier at task posting. The tier defines what trainer hardware must provide and how rewards scale. Training compute is TEE-optional in the Open tier; key custody and verification are TEE-mandatory in every tier.
| Tier | Trainer hardware | Trust source | Use case |
|---|---|---|---|
| Open | Any GPU or CPU | Stake bonding, Byzantine-robust aggregation, fragment redundancy | Default. Public datasets, cheapest tier. |
| Verified | TDX / SEV-SNP / Nitro / NVIDIA CC | Per-round TEE attestation binding {program, shard, model, DID} | Encrypted-at-rest datasets, higher reward weight. |
| Confidential | TEE with sealed memory | Data sealed to the enclave — host OS never sees cleartext | Private datasets that cannot leave the data owner. |
Phase 1 ships Open tier only. Verified and Confidential tiers stay on the roadmap.
## Run lifecycle
A training run progresses through five phases:
- Posting. Sponsor submits a `TrainingTaskSpec` and escrows TNZO into the reward pool.
- Syncer election. A syncer is elected by stake-weighted VRF; Phase 2 will additionally require a TEE attestation.
- Enrollment. Trainers stake, are admitted up to `trainer_count` (M), and receive shard assignments.
- Inner-outer loop. Each round, every trainer runs `inner_steps` (H) of inner SGD on its shard, then submits an `OuterGradient` per fragment. The syncer aggregates K-of-M and commits the result on-chain.
- Sealing. At `max_rounds`, a `TrainingReceipt` is emitted carrying the run root, full task spec, final model hash, and aggregation transcript. It can optionally be minted as an NFT.
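The inner-outer loop can be sketched numerically. The NumPy toy below (quadratic loss, three trainers, every submission accepted) is purely illustrative and is not the reference trainer; every function name in it is hypothetical. The outer gradient is the DiLoCo-style pseudo-gradient: the parameter displacement produced by the inner loop.

```python
import numpy as np

def inner_loop(theta, shard, h, lr=3e-4):
    """Run H inner SGD steps on a toy quadratic loss 0.5 * ||theta - shard_mean||^2."""
    target = shard.mean(axis=0)
    p = theta.copy()
    for _ in range(h):
        p -= lr * (p - target)          # gradient of the toy loss
    return p

def outer_gradient(theta_start, theta_end):
    """DiLoCo-style pseudo-gradient: the displacement produced by the inner loop."""
    return theta_start - theta_end

# One round with M = 3 toy trainers, all submissions accepted (K = M).
rng = np.random.default_rng(0)
theta = np.zeros(4)
shards = [rng.normal(loc=i, size=(16, 4)) for i in range(3)]
grads = [outer_gradient(theta, inner_loop(theta, s, h=24)) for s in shards]
agg = np.mean(grads, axis=0)            # Phase 1 aggregation rule: plain mean
theta = theta - 0.7 * agg               # outer step (momentum omitted here)
```

In the real protocol the aggregation step runs on the syncer over the K accepted submissions per fragment, and the outer step uses Nesterov momentum rather than the bare step shown here.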
## Aggregation rules
The syncer applies one Byzantine-robust rule across the K accepted outer gradients per fragment.
```
AggregationRule:
- Mean                       // Phase 1 default. Plain mean.
- TrimmedMean { alpha_bps }  // Trim top/bottom α% per parameter.
- CoordinateMedian           // Robust to f < M/2 Byzantine learners.
- Krum { f }                 // Pick gradients with lowest sum-of-distances
                             //   to nearest neighbors.
```

Phase 1 ships Mean only. Trimmed mean, coordinate median, and Krum land in Phase 2 with multi-region scale.
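The Phase 2 rules are standard Byzantine-robust estimators and can be sketched in a few lines of NumPy. This is an illustrative reference, not the syncer's implementation; it follows the textbook definitions (per-coordinate trimming and median, and Krum scoring each gradient by its summed squared distance to its M − f − 2 nearest neighbours).

```python
import numpy as np

def trimmed_mean(grads, alpha_bps):
    """TrimmedMean { alpha_bps }: drop the top and bottom alpha fraction per coordinate."""
    g = np.sort(np.stack(grads), axis=0)
    k = int(len(grads) * alpha_bps / 10_000)
    return g[k:len(grads) - k].mean(axis=0)

def coordinate_median(grads):
    """CoordinateMedian: per-parameter median, robust to f < M/2 Byzantine inputs."""
    return np.median(np.stack(grads), axis=0)

def krum(grads, f):
    """Krum { f }: pick the gradient with the lowest sum of squared distances
    to its M - f - 2 nearest neighbours."""
    g = np.stack(grads)
    d = ((g[:, None] - g[None, :]) ** 2).sum(axis=-1)   # pairwise squared distances
    m = len(grads)
    scores = [np.sort(d[i])[1:m - f - 1].sum() for i in range(m)]  # skip self (distance 0)
    return g[int(np.argmin(scores))]
```

With five honest gradients near `[1, 1]` and one Byzantine outlier at `[100, 100]`, all three rules recover a value near `[1, 1]`, while a plain mean would be dragged toward the outlier.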
## Task spec
The on-chain description of a training run. Posted by the sponsor, referenced by trainers, and committed verbatim into the final receipt.
```jsonc
{
  "task_id": "train-2026-04-25-timesfm-200m",
  "sponsor_did": "did:tenzro:human:01J...",
  "sponsor_address": "0x...",
  "architecture": {
    "family": "timesfm",
    "param_count": 200000000,
    "modality": "Timeseries",
    "fragment_count": 12,
    "dtype": "bf16",
    "metadata": { "context_len": 2048, "horizon": 128 }
  },
  "tier": "Open",
  "aggregation": "Mean",
  "trainer_count": 8,                       // M
  "quorum": 6,                              // K
  "inner_steps": 24,                        // H
  "max_rounds": 1024,
  "grace_window_ms": 30000,                 // τ — straggler tolerance
  "reward_pool": "1000000000000000000000",  // attoTNZO
  "dataset_ref": "ipfs://Qm...",
  "dataset_hash": "0x...",
  "min_throughput": null,
  "created_at": 1761350400
}
```

## JSON-RPC namespace
The node exposes seven methods under the `tenzro_training_*` namespace.
| Method | Purpose |
|---|---|
| `tenzro_training_postTask` | Sponsor posts a `TrainingTaskSpec` and escrows TNZO. |
| `tenzro_training_listRuns` | List active runs the node is tracking. |
| `tenzro_training_getRun` | Fetch one run by `task_id`. |
| `tenzro_training_getReceipt` | Fetch a sealed `TrainingReceipt` for a finished run. |
| `tenzro_training_enrollTrainer` | Trainer enrolls a DID + stake into a run. |
| `tenzro_training_submitOuterGradient` | Trainer submits one `OuterGradient` for one fragment in one round. |
| `tenzro_training_finalizeRound` | Syncer finalizes the current round once K submissions are accepted. |
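A sponsor client only needs a standard JSON-RPC 2.0 envelope around these methods. The sketch below builds requests for two of them; the positional-params shape, the truncated spec, and the local K ≤ M sanity check are assumptions for illustration, not the node's documented schema.

```python
import json

def rpc_request(method, params, req_id=1):
    """Build a JSON-RPC 2.0 envelope. Positional params are an assumption here;
    check the node's actual schema for each tenzro_training_* method."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}

# A truncated, illustrative task spec (see the Task spec section for the full shape).
spec = {"task_id": "train-2026-04-25-timesfm-200m", "trainer_count": 8, "quorum": 6}
assert spec["quorum"] <= spec["trainer_count"], "quorum K must not exceed trainer_count M"

post = rpc_request("tenzro_training_postTask", [spec])
poll = rpc_request("tenzro_training_getRun", [spec["task_id"]], req_id=2)
body = json.dumps(post)  # POST this to the node's RPC endpoint (default http://127.0.0.1:8545)
```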
## Gossip topics
Off-chain coordination flows over two libp2p gossipsub topics. On-chain commitments use the SHA-256 Merkle root of the run prefixed with `tenzro/train/run-root/v1`.
```
tenzro/training/1.0.0          // Outer gradient submissions, fragment payloads
tenzro/training/syncer/1.0.0   // Syncer status, round transitions, finality
```

## VM precompile
On-chain contracts can verify a Tenzro Train receipt via the `TRAINING_VERIFY` precompile at `0x1008`. Input: a serialized receipt plus the claimed run root; output: a single byte indicating verification status. This lets ERC-20 reward escrow contracts gate payouts on a verified receipt without trusting the sponsor.
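An off-chain verifier can recompute the run root the same way before submitting a receipt for verification. The sketch below shows a domain-prefixed SHA-256 Merkle root over per-round commitments; the leaf encoding and odd-node duplication rule are illustrative assumptions, not the canonical protocol encoding.

```python
import hashlib

PREFIX = b"tenzro/train/run-root/v1"

def run_root(leaves):
    """SHA-256 Merkle root over per-round commitments, domain-separated with the
    run-root prefix. Leaf encoding and odd-node handling here are illustrative
    choices, not the canonical protocol encoding."""
    level = [hashlib.sha256(PREFIX + leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last node on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

root = run_root([b"round-0", b"round-1", b"round-2"])
```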
## Storage
Two RocksDB column families back the training subsystem:
- `CF_TRAINING_RUNS` — active and historical run state, keyed by `task_id`.
- `CF_TRAINING_RECEIPTS` — sealed receipts, keyed by `task_id`.
## CLI usage
The `tenzro train` command group wraps the JSON-RPC namespace. All subcommands accept `--rpc` (defaults to `http://127.0.0.1:8545`).
```shell
# Sponsor flow — post a task from a JSON spec file
tenzro train post-task --spec ./timesfm-task.json

# Discovery
tenzro train list-runs
tenzro train get-run --task-id train-2026-04-25-timesfm-200m
tenzro train get-receipt --task-id train-2026-04-25-timesfm-200m

# Trainer flow
tenzro train enroll-trainer \
  --task-id train-2026-04-25-timesfm-200m \
  --did did:tenzro:machine:abc... \
  --stake 100000000000000000000

tenzro train submit-gradient \
  --task-id train-2026-04-25-timesfm-200m \
  --round 12 --fragment 3 \
  --payload ./grad-r12-f3.bin

# Syncer flow
tenzro train finalize-round --task-id train-2026-04-25-timesfm-200m
```

## Phase 1 scope
Phase 1 is timeseries-first. The Python reference trainer ships TimesFM-class (200M decoder-only patch transformer), Chronos-Bolt (T5-derived), and Granite-TTM (patch-mixer) adapters. Forecast model entries live in the ONNX catalog at `tenzro_model::get_forecast_catalog()`.
- Modality: Timeseries only.
- Tier: Open only.
- Aggregation: Mean only.
- Hyperparameters: M=8, K=6, F=12, H=24, AdamW inner (lr=3e-4), Nesterov SGD outer (lr=0.7, mom=0.9).
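The outer step with the Phase 1 defaults can be written out directly. This assumes the PyTorch-style Nesterov SGD formulation (velocity update, then a lookahead step); the reference trainer may differ in detail.

```python
import numpy as np

def nesterov_outer_step(theta, outer_grad, velocity, lr=0.7, momentum=0.9):
    """One outer optimizer step (PyTorch-style Nesterov SGD) applied to the
    aggregated outer gradient. lr=0.7 and momentum=0.9 are the Phase 1 defaults."""
    velocity = momentum * velocity + outer_grad
    theta = theta - lr * (outer_grad + momentum * velocity)
    return theta, velocity

theta = np.zeros(3)
v = np.zeros(3)
g = np.ones(3)                       # aggregated outer gradient for the round
theta, v = nesterov_outer_step(theta, g, v)
# v = 1.0, step = 1 + 0.9 * 1 = 1.9, theta = -0.7 * 1.9 = -1.33
```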
Phases 2–5 (Byzantine-robust aggregation, multi-region scale, language and vision modalities, TEE-resident data) stay on the roadmap. See TRAIN.md §7.4 for the full phasing plan.
## Next steps
Hands-on tutorials walk through both flows end-to-end:
- Post a training task — sponsor flow: write a task spec, escrow TNZO, watch enrollment, fetch the receipt.
- Run a trainer node — trainer flow: install the Python reference trainer, enroll, submit gradients.
- Tenzro Train whitepaper — full architecture, multi-modal extensions, comparison with Prime Intellect / Nous / OpenDiLoCo.