Segment Images with SAM 2

Segmentation · Intermediate · 20 min

Tenzro's segmentation runtime serves the SAM family (SAM 3, SAM 3.1, SAM 2 base, and SAM 2 large) as well as EdgeSAM and MobileSAM. For each image it runs the encoder once and caches the embedding, then serves every decoder call, with point or box prompts, against that cached embedding. This tutorial walks through SAM 2 base, the smallest production-grade SAM 2 checkpoint, via the CLI and JSON-RPC.
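The encode-once, decode-many pattern described above can be sketched in a few lines. This is only an illustration of the caching idea, not Tenzro's implementation: `EmbeddingCache`, `fake_encode`, and `fake_decode` are hypothetical stand-ins, and keying the cache by a SHA-256 of the image bytes is an assumption made for the example.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Prompt = Tuple[int, int, str]  # (x, y, "fg" or "bg")


class EmbeddingCache:
    """Runs a (hypothetical) encoder once per distinct image and reuses the result."""

    def __init__(self, encode: Callable[[bytes], List[float]]) -> None:
        self._encode = encode
        self._cache: Dict[str, List[float]] = {}
        self.encoder_calls = 0

    def embedding_for(self, image_bytes: bytes) -> List[float]:
        # Assumption: identical bytes mean the same image, so hash them for the key.
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._cache:
            self.encoder_calls += 1
            self._cache[key] = self._encode(image_bytes)
        return self._cache[key]


def fake_encode(image_bytes: bytes) -> List[float]:
    # Stand-in for the expensive image encoder; not a real model.
    return [float(len(image_bytes))]


def fake_decode(embedding: List[float], prompts: List[Prompt]) -> dict:
    # Stand-in for the lightweight mask decoder.
    return {"embedding_size": len(embedding), "prompt_count": len(prompts)}


cache = EmbeddingCache(fake_encode)
image = b"not-a-real-jpeg"

# Two decoder calls with different prompts against the same image.
first = fake_decode(cache.embedding_for(image), [(320, 240, "fg")])
second = fake_decode(cache.embedding_for(image), [(320, 240, "fg"), (410, 260, "fg")])

print(cache.encoder_calls)  # the encoder ran only once
```

The point of the design is that refining a mask with extra clicks only pays the decoder cost again, which is why the latencies later in this tutorial are reported separately for the encoder and the decoder.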

License gate. The SAM model family carries the CommercialCustom license tier and requires --accept-license sam at download time.

1. Download with license acceptance

# SAM 2 base ships under the SAM CommercialCustom license; explicit acceptance is required
tenzro model download sam2-base --accept-license sam

# Output:
# License gate: sam (CommercialCustom)
#   Terms: https://github.com/facebookresearch/segment-anything-2/blob/main/LICENSE
#   Acknowledgment: recorded in CF_MODELS
# Resolving artifact bundle from HuggingFace Hub...
#   Source: facebook/sam2-hiera-base-plus (ONNX export)
#   Files: encoder.onnx, decoder.onnx, config.json
#   SHA-256 verified for all files
#   Saved to: ~/.tenzro/models/sam2-base/
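The download step reports SHA-256 verification for every file. If you want to compute the digests yourself, for example to compare them against values from a source you trust, a minimal Python sketch could look like the following. The helper names `sha256_of` and `print_digests` are hypothetical, and the file names and directory are taken from the sample output above, so adjust them if your layout differs.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large ONNX files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def print_digests(model_dir: Path) -> None:
    # File names as listed in the sample download output.
    for name in ("encoder.onnx", "decoder.onnx", "config.json"):
        file_path = model_dir / name
        if file_path.exists():
            print(f"{name}: {sha256_of(file_path)}")
        else:
            print(f"{name}: missing")


if __name__ == "__main__":
    print_digests(Path.home() / ".tenzro" / "models" / "sam2-base")
```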

2. Load into the segmentation runtime

# Load into the segmentation runtime
tenzro segmentation load sam2-base

# Output:
# Segmentation runtime loaded:
#   Model: sam2-base
#   Modality: segmentation
#   Encoder cached: true (per-image embedding reuse)
#   Prompt types: points, boxes

3. Segment with point prompts

Each --point takes x,y,label where the label is fg (foreground) or bg (background). Mix and match to refine the mask boundary.

# Segment with two foreground point prompts (one click each)
tenzro segment \
  --model sam2-base \
  --image ./photo.jpg \
  --point 320,240,fg \
  --point 410,260,fg

# Output:
# Segmented 1 mask:
#   mask[0]: 1024x1024 binary, score=0.943
#   pixels: 87,412 (8.3% of image)
# encoder_latency_ms: 134
# decoder_latency_ms: 21
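If you generate prompts from a script, it can help to validate the `x,y,label` strings before handing them to the CLI. The sketch below is a hypothetical helper, not the CLI's own parser; it assumes integer pixel coordinates and returns a dictionary shaped like the prompt objects in the JSON-RPC request in step 5.

```python
from typing import Dict, Union

PointPrompt = Dict[str, Union[str, int]]


def parse_point(spec: str) -> PointPrompt:
    """Turn 'x,y,label' into a prompt dict, rejecting malformed input early."""
    parts = [part.strip() for part in spec.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected x,y,label but got {spec!r}")
    x_text, y_text, label = parts
    if label not in ("fg", "bg"):
        raise ValueError(f"label must be 'fg' or 'bg', got {label!r}")
    # int() raises ValueError on non-numeric coordinates.
    return {"type": "point", "x": int(x_text), "y": int(y_text), "label": label}


prompts = [parse_point("320,240,fg"), parse_point("410,260,fg")]
print(prompts)
```

A background click such as `parse_point("500,300,bg")` goes through the same path and yields a prompt with `"label": "bg"`.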

4. Segment with a bounding box

# Or segment with a bounding-box prompt
tenzro segment \
  --model sam2-base \
  --image ./photo.jpg \
  --box 280,200,460,320

# Output:
# Segmented 1 mask:
#   mask[0]: 1024x1024 binary, score=0.961
#   pixels: 91,003 (8.7% of image)
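The percentage on the `pixels` line is the mask's pixel count divided by the mask area. As a quick check of the two sample outputs, using only the numbers printed above (the function name `coverage_percent` is just for illustration):

```python
def coverage_percent(pixel_count: int, width: int, height: int) -> float:
    """Share of the mask area covered by foreground pixels, in percent."""
    return 100.0 * pixel_count / (width * height)


point_mask = coverage_percent(87_412, 1024, 1024)  # point-prompt result from step 3
box_mask = coverage_percent(91_003, 1024, 1024)    # box-prompt result from step 4

print(f"{point_mask:.1f}% {box_mask:.1f}%")  # prints "8.3% 8.7%"
print(91_003 - 87_412)  # prints 3591, the extra pixels in the box-prompt mask
```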

5. Segment via JSON-RPC

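# JSON-RPC equivalent of the point-prompt call in step 3. The image is sent as base64-encoded file bytes.
# Strip newlines so the string stays valid inside the JSON body (GNU base64 wraps output at 76 columns).
IMAGE_B64=$(base64 < ./photo.jpg | tr -d '\n')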

curl https://rpc.tenzro.network \
  -X POST \
  -H "Content-Type: application/json" \
  -d "{
    \"jsonrpc\": \"2.0\",
    \"id\": 1,
    \"method\": \"tenzro_segment\",
    \"params\": {
      \"model_id\": \"sam2-base\",
      \"image_base64\": \"$IMAGE_B64\",
      \"prompts\": [
        { \"type\": \"point\", \"x\": 320, \"y\": 240, \"label\": \"fg\" },
        { \"type\": \"point\", \"x\": 410, \"y\": 260, \"label\": \"fg\" }
      ]
    }
  }" | jq
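Hand-escaping JSON inside a shell string gets fragile as the prompt list grows. As an alternative, the same request body can be assembled in Python with the standard library. This is a sketch under the assumption that the endpoint accepts exactly the fields shown in the curl example; `build_segment_request` and `send` are hypothetical helper names, and `send` is defined but not called here.

```python
import base64
import json
import urllib.request
from typing import Dict, List


def build_segment_request(image_bytes: bytes, prompts: List[Dict], request_id: int = 1) -> bytes:
    """Serialize a tenzro_segment call with the same fields as the curl example."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tenzro_segment",
        "params": {
            "model_id": "sam2-base",
            # b64encode emits a single line, so no newline stripping is needed.
            "image_base64": base64.b64encode(image_bytes).decode("ascii"),
            "prompts": prompts,
        },
    }
    return json.dumps(payload).encode("utf-8")


def send(body: bytes, url: str = "https://rpc.tenzro.network") -> dict:
    """POST the body and parse the JSON reply."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Placeholder bytes; in practice read them with open("photo.jpg", "rb").read().
body = build_segment_request(
    b"placeholder image bytes",
    [
        {"type": "point", "x": 320, "y": 240, "label": "fg"},
        {"type": "point", "x": 410, "y": 260, "label": "fg"},
    ],
)
print(len(body))
```

Calling `send(body)` would issue the same POST as the curl command.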

A typical response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "model_id": "sam2-base",
    "masks": [
      {
        "width": 1024,
        "height": 1024,
        "rle": "...",
        "score": 0.943,
        "pixel_count": 87412
      }
    ],
    "encoder_latency_ms": 134,
    "decoder_latency_ms": 21
  }
}
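On the client side, a response like this can be handled with a few lines of Python. The sketch below checks for the standard JSON-RPC 2.0 `error` member, then picks the highest-scoring mask and reports its coverage from `pixel_count`. It deliberately leaves the `rle` field untouched, since its encoding is not specified here; `best_mask` is a hypothetical helper name.

```python
import json
from typing import Optional

# The sample response from above, with the rle payload elided as in the original.
SAMPLE = """
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "model_id": "sam2-base",
    "masks": [
      {"width": 1024, "height": 1024, "rle": "...", "score": 0.943, "pixel_count": 87412}
    ],
    "encoder_latency_ms": 134,
    "decoder_latency_ms": 21
  }
}
"""


def best_mask(response: dict) -> Optional[dict]:
    """Return the highest-scoring mask, or None if the result holds no masks."""
    if "error" in response:
        # JSON-RPC 2.0 reports failures in an "error" member instead of "result".
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    masks = response["result"]["masks"]
    return max(masks, key=lambda mask: mask["score"]) if masks else None


response = json.loads(SAMPLE)
mask = best_mask(response)
if mask is not None:
    coverage = 100.0 * mask["pixel_count"] / (mask["width"] * mask["height"])
    print(f"score={mask['score']} coverage={coverage:.1f}%")
```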
