Embed Text with Qwen3-Embedding
Text Embeddings · Intermediate · 15 min
Tenzro's text-embedding runtime serves four model families: qwen3-embedding (0.6B / 4B / 8B), embeddinggemma-300m (Matryoshka 768 / 512 / 256 / 128), bge-m3, and snowflake-arctic-embed-l-v2.0. This tutorial walks through embedding text with the smallest, permissively licensed entry, qwen3-embedding-0.6b, for retrieval and semantic search.
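Before running anything, it helps to see what the embeddings are for: retrieval and semantic search reduce to comparing vectors, usually by cosine similarity. A minimal sketch in plain Python, with made-up 4-dimensional vectors standing in for real 1024-dimensional embeddings (the passages and values are illustrative only):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 4-dim vectors standing in for real 1024-dim embeddings.
query = [0.1, 0.9, 0.0, 0.2]
passages = {
    "validators": [0.1, 0.8, 0.1, 0.3],
    "providers": [0.9, 0.1, 0.2, 0.0],
    "agents": [0.0, 0.2, 0.9, 0.1],
}

# Rank passages by similarity to the query, highest first.
ranked = sorted(passages, key=lambda k: cosine(query, passages[k]), reverse=True)
print(ranked)  # → ['validators', 'agents', 'providers']
```

In practice you would produce the query and passage vectors with the runtime described below and keep this ranking logic unchanged.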
1. Download the model
# Qwen3-Embedding-0.6B is permissively licensed (Apache-2.0)
tenzro model download qwen3-embedding-0.6b
# Output:
# Resolving artifact bundle from HuggingFace Hub...
# Source: Qwen/Qwen3-Embedding-0.6B (ONNX export)
# License tier: Permissive
# Files: model.onnx, tokenizer.json, config.json
# SHA-256 verified for all files
# Saved to: ~/.tenzro/models/qwen3-embedding-0.6b/

2. Load into the runtime
# Load into the text-embedding runtime
tenzro text-embedding load qwen3-embedding-0.6b
# Output:
# Text-embedding runtime loaded:
# Model: qwen3-embedding-0.6b
# Modality: text-embedding
# Output dim: 1024
# Max sequence length: 8192
# Tokenizer: loaded from tokenizer.json

3. Embed text via the CLI
# Embed a single passage
tenzro embed-text \
--model qwen3-embedding-0.6b \
--text "Tenzro is a settlement layer purpose-built for the AI age."
# Output:
# Embedding (dim=1024):
# [0.0231, -0.0118, 0.0492, ..., -0.0067]
# tokens: 14
# latency_ms: 22

Repeat --text to batch passages in a single forward pass:
# Embed multiple passages in one call (batched)
tenzro embed-text \
--model qwen3-embedding-0.6b \
--text "Validators secure the network." \
--text "Providers serve AI models for TNZO." \
--text "Agents pay for inference per token."
# Output:
# Batch embeddings (3 x 1024):
# passage[0]: [0.0103, ...]
# passage[1]: [-0.0421, ...]
# passage[2]: [0.0298, ...]
# total_tokens: 18
# latency_ms: 47

4. Embed via JSON-RPC
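The same call can also be issued from Python with no third-party dependencies. A minimal sketch using only the standard library, mirroring the curl request below; the helper names here are this tutorial's illustration, not an official SDK:

```python
import json
import urllib.request

RPC_URL = "https://rpc.tenzro.network"

def build_embed_request(texts, model_id="qwen3-embedding-0.6b", req_id=1):
    # JSON-RPC 2.0 envelope for tenzro_textEmbed, matching the curl example.
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tenzro_textEmbed",
        "params": {"model_id": model_id, "texts": list(texts)},
    }

def embed(texts):
    # POST the request and return the parsed "result" object.
    payload = json.dumps(build_embed_request(texts)).encode("utf-8")
    req = urllib.request.Request(
        RPC_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]

# Sending for real requires network access to the testnet:
#   result = embed(["Validators secure the network."])
#   result["dim"] should match the model's output dim (1024)
```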
# Equivalent JSON-RPC call against the public testnet
curl https://rpc.tenzro.network \
-X POST \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tenzro_textEmbed",
"params": {
"model_id": "qwen3-embedding-0.6b",
"texts": [
"Tenzro is a settlement layer purpose-built for the AI age.",
"Validators secure the network.",
"Providers serve AI models for TNZO."
]
}
}' | jq

A typical response:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"model_id": "qwen3-embedding-0.6b",
"embeddings": [
[0.0231, -0.0118, 0.0492, "..."],
[0.0103, 0.0211, -0.0344, "..."],
[-0.0421, 0.0156, 0.0287, "..."]
],
"dim": 1024,
"total_tokens": 31,
"latency_ms": 47
}
}

5. Matryoshka truncation
For storage-sensitive deployments, EmbeddingGemma-300M supports Matryoshka representation learning — truncate to 768, 512, 256, or 128 dimensions and re-normalize without losing the bulk of recall:
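Mechanically, Matryoshka truncation is just slicing and re-normalizing, so you can also apply it client-side to a full-length vector. A sketch in plain Python with a toy 8-dimensional vector standing in for a real embedding (`truncate_and_renormalize` is illustrative, not part of the Tenzro CLI):

```python
import math

def truncate_and_renormalize(vec, dim):
    # Matryoshka-style truncation: keep the leading `dim` components,
    # then rescale to unit L2 norm so cosine similarity stays meaningful.
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 8-dim embedding standing in for a real 768-dim output.
full = [0.40, -0.10, 0.25, 0.05, -0.30, 0.15, 0.02, -0.08]
short = truncate_and_renormalize(full, 4)

print(len(short))                                      # → 4
print(round(math.sqrt(sum(x * x for x in short)), 6))  # → 1.0
```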
# EmbeddingGemma-300M supports Matryoshka truncation: pick 768 / 512 / 256 / 128
tenzro embed-text \
--model embeddinggemma-300m \
--text "Tenzro Hub indexes skills, tools, and templates." \
--truncate-dim 256
# Note: embeddinggemma-300m carries CommercialCustom license — pass
# --accept-license gemma at download time.

See also
- Model serving documentation — the text-embedding catalog and licensing tiers
- Inference RPC reference — full tenzro_textEmbed schema
- Embed images with DINOv3 — pair with text embeddings for cross-modal retrieval