Tutorial — Multi-modal AI

Segment images with SAM 2

The segmentation runtime exposes SAM 2 (base and large), EdgeSAM, and MobileSAM. You pass point or box prompts; the encoder caches the embedding for each image, and the decoder returns the masks along with their IoU scores.

Level: Intermediate
Time: ~15 min
Prerequisites: Tenzro CLI installed, sample image
Stack: CLI · JSON-RPC

Load SAM 2

SAM 2 is distributed under Meta's commercial-custom terms, so the node operator must have started tenzro-node with --accept-license for this model; otherwise the load is refused. SAM ships as an encoder/decoder pair, so the load command takes both paths, and --catalog-id tells the runtime which decoder ABI and input resolution to use.

tenzro segment catalog

tenzro segment load \
  --model seg \
  --encoder-path /models/sam2-base/encoder.onnx \
  --decoder-path /models/sam2-base/decoder.onnx \
  --catalog-id sam2-base

Run a point-prompted segmentation

Prompts come from a JSON file containing an array of prompt objects. Coordinates are in original-image pixels. Set is_foreground to true for a point inside the object and to false for a point outside it.

cat > prompts.json <<'EOF'
[
  {"type": "point", "x": 412, "y": 310, "is_foreground": true},
  {"type": "point", "x": 120, "y": 500, "is_foreground": false}
]
EOF

tenzro segment run \
  --model seg \
  --image image.png \
  --prompts prompts.json
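
Because coordinates must be in original-image pixels, a click captured on a resized preview has to be scaled back before it goes into the prompts file. The following is a minimal sketch, assuming the preview was scaled uniformly; to_original_px and point_prompt are hypothetical helper names, not part of the Tenzro CLI.

```shell
# Hypothetical helpers (not part of the Tenzro CLI).

# to_original_px COORD PREVIEW_SIZE ORIGINAL_SIZE
# Maps a coordinate measured on a resized preview back to original-image
# pixels along one axis, rounding to the nearest integer.
# Assumes non-negative integer inputs.
to_original_px() {
  echo $(( ($1 * $3 + $2 / 2) / $2 ))
}

# point_prompt X Y IS_FOREGROUND
# Prints one point prompt object in the same shape as the entries in
# prompts.json. IS_FOREGROUND must be the literal string true or false.
point_prompt() {
  case "$3" in
    true|false) ;;
    *) echo "is_foreground must be true or false" >&2; return 1 ;;
  esac
  printf '{"type": "point", "x": %d, "y": %d, "is_foreground": %s}\n' "$1" "$2" "$3"
}
```

For example, a click at (206, 155) on an 800×600 preview of a 1600×1200 image maps to (412, 310), which is the foreground point used in prompts.json.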

Run a box-prompted segmentation

A box prompt gives object-level masks. (x0, y0) is the top-left corner and (x1, y1) the bottom-right corner, again in original-image pixels.

cat > box.json <<'EOF'
[{"type": "box", "x0": 120, "y0": 80, "x1": 540, "y1": 420}]
EOF

tenzro segment run \
  --model seg \
  --image image.png \
  --prompts box.json
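
When a box comes from a mouse drag, the two corners can arrive in either order. The sketch below sorts them so that x0,y0 is the top-left corner and x1,y1 the bottom-right, matching the format described here. box_prompt is a hypothetical helper, and how the runtime itself treats swapped corners is not something this tutorial states.

```shell
# Hypothetical helper (not part of the Tenzro CLI).

# box_prompt XA YA XB YB
# Takes two opposite corners in any order and prints a one-element
# prompts array with x0,y0 as top-left and x1,y1 as bottom-right.
# Assumes integer pixel coordinates.
box_prompt() {
  local x0=$1 y0=$2 x1=$3 y1=$4 tmp
  # Swap so that x0 <= x1.
  if [ "$x0" -gt "$x1" ]; then tmp=$x0; x0=$x1; x1=$tmp; fi
  # Swap so that y0 <= y1.
  if [ "$y0" -gt "$y1" ]; then tmp=$y0; y0=$y1; y1=$tmp; fi
  printf '[{"type": "box", "x0": %d, "y0": %d, "x1": %d, "y1": %d}]\n' \
    "$x0" "$y0" "$x1" "$y1"
}
```

Running box_prompt 540 420 120 80 > box.json would write the same box as the file created with cat.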

Call from JSON-RPC

The RPC method returns mask bytes plus IoU (intersection over union) scores. model_id is the id passed to --model at load time (seg here), not the catalog id (sam2-base).

curl -s https://rpc.tenzro.xyz -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tenzro_segment","params":{"model_id":"seg","image_base64":"…","prompts":[{"type":"point","x":412,"y":310,"is_foreground":true}]}}'
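
In practice the image_base64 value is too long to paste by hand, so the request body is usually assembled from files. The following is a rough sketch of one way to do that; build_segment_request is a hypothetical helper, and it assumes the model id contains no characters that need JSON escaping and that the prompts file already holds valid JSON.

```shell
# Hypothetical helper (not part of the Tenzro CLI).

# build_segment_request MODEL_ID IMAGE_PATH PROMPTS_PATH
# Prints a JSON-RPC 2.0 request body for the tenzro_segment method.
build_segment_request() {
  local image_b64 prompts
  if [ ! -r "$2" ] || [ ! -r "$3" ]; then
    echo "image or prompts file is not readable" >&2
    return 1
  fi
  # Base64-encode the image and strip the line wraps that some base64
  # implementations insert.
  image_b64=$(base64 < "$2" | tr -d '\n')
  # Collapse the prompts file to one line; it is embedded unchanged.
  prompts=$(tr -d '\n' < "$3")
  printf '{"jsonrpc":"2.0","id":1,"method":"tenzro_segment","params":{"model_id":"%s","image_base64":"%s","prompts":%s}}\n' \
    "$1" "$image_b64" "$prompts"
}
```

The output can be piped to curl, which reads the body from standard input when given -d @-: build_segment_request seg image.png prompts.json | curl -s https://rpc.tenzro.xyz -H 'content-type: application/json' -d @-. Piping avoids putting a large base64 string on the command line.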
