API Reference
Select automatically routes your requests to the best available frontier model, based on workload type, capability tier, and live utilisation: one endpoint across the active Select model catalog.
Overview
Select exposes inference endpoints under https://api.select.ax/v1. Requests are routed in real time to the best available model. You can influence routing via request headers — or let Select decide automatically.
Two routing modes are available:
- Auto mode (default) — Select classifies your request and picks the optimal model by tier, live load, and workload type.
- Direct mode — You specify exactly which model to use via
x-model.
Some models are TEE-enabled. The model catalog shows which models support TEE.
Authentication
All requests require a Bearer token in the Authorization header. API keys are issued after purchase at select.ax/pricing and follow the format sk-sel-….
Authorization: Bearer sk-sel-<your-api-key>
Quick start
Send your first request in under a minute:
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "user", "content": "Summarise the key steps of a RAG pipeline." }
],
"max_tokens": 1024
}'
Select will automatically classify the request and route it to the best available model for the task.
You can also finish any source file from the terminal with Select Finish: select-finish path/to/file.py
POST /v1/messages
The primary inference endpoint. Accepts a messages payload and returns a completion.
Routes to the best available model. Model selection is controlled by routing headers (see below).
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | Required | Array of message objects with role and content fields. |
| max_tokens | integer | Optional | Maximum tokens to generate. Defaults to 4096. |
| stream | boolean | Optional | Set to true to stream the response as SSE. |
| temperature | number | Optional | Sampling temperature. Passed through to the model. |
| top_p | number | Optional | Nucleus sampling threshold. |
| tools | array | Optional | Tool definitions. Presence influences routing toward tool-capable models. |
| tool_choice | object | Optional | Tool choice control. Passed through to the model. |
| response_format | object | Optional | Structured output format. Presence scores toward structured-output models. |
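A hypothetical request body exercising the optional parameters above. Only `messages` is required; the `response_format` shape shown is an assumption, not confirmed by this reference:

```python
payload = {
    "messages": [{"role": "user", "content": "Extract the total as JSON."}],
    "max_tokens": 512,                            # defaults to 4096 if omitted
    "temperature": 0.2,                           # passed through to the model
    "response_format": {"type": "json_object"},   # assumed shape; biases routing
    "stream": False,
}
```

Note that the mere presence of tools or response_format is itself a routing signal, so include them only when you need them.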
POST /v1/chat/completions
An OpenAI-compatible alias that delegates to /v1/messages. Use this as a drop-in replacement if your SDK or framework targets the OpenAI API format.
Identical behaviour to /v1/messages. All routing headers apply.
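As a sketch, any OpenAI-format client can be pointed at this alias. The helper below builds a request dict for Python's requests library; the helper name and the routing dict are illustrative, not part of the API:

```python
# Build an OpenAI-format request against the alias endpoint. Because the
# alias delegates to /v1/messages, routing headers (x-latency, x-mode,
# and so on) work here exactly as on the primary endpoint.
def openai_request(prompt, api_key, routing=None):
    return {
        "url": "https://api.select.ax/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            **(routing or {}),                 # e.g. {"x-latency": "low"}
        },
        "json": {
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 1024,
        },
    }
```

Pass the result straight through, e.g. requests.post(**openai_request("Hello", "sk-sel-<your-key>")).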
Select Finish
Get Claude to finish one source file from your terminal. The script shows the summary, suggested tests, cost, and remaining balance, then asks before applying the finished file.
Install
curl -o select-finish https://select.ax/scripts/select-finish.sh
chmod +x select-finish
export SELECTAX_API_KEY=sk-sel-...
Run
./select-finish path/to/file.py
./select-finish path/to/file.py --test "python3 path/to/file.py"
./select-finish path/to/file.py --yes
The approval prompt is skipped when --yes is used. Single-file limit: approximately 150k input tokens.
What it does
| Step | Behaviour |
|---|---|
| Send | Uploads one source file to Select Finish. |
| Finish | Claude returns a finished version of the file plus a plain-English summary and suggested tests. |
| Approve | The script shows the cost and asks before changing anything. |
| Apply | The original file is backed up first, then replaced with the finished version. |
| Check | Python and JavaScript files get a basic syntax check automatically. Use --test for your own command. |
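The Apply step above can be sketched in a few lines, assuming a parsed result object with the fixed_code field described in the API note; apply_finish is an illustrative helper, not part of the script:

```python
import shutil
from pathlib import Path

def apply_finish(result, path):
    # Back up the original first, then replace it with the finished file.
    src = Path(path)
    backup = Path(str(src) + ".bak")
    shutil.copy2(src, backup)
    src.write_text(result["fixed_code"])
    return backup
```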
API note
The script uses select-finish-run, which returns Select Finish JSON containing summary, tests, and fixed_code. It is intended for the script workflow, not as a normal chat model in OpenAI-compatible clients.
{
"object": "select.finish.result",
"actual_cost": 0.248028,
"balance_remaining": 3.714298,
"summary": "Fixed required production-readiness issues.",
"tests": ["python3 -m py_compile path/to/file.py"],
"fixed_code": "..."
}
Select Review
Use Select Review when you want a read-only audit instead of an applied fix. It returns findings, risks, patch guidance, and a test plan, but it does not rewrite files.
Install
curl -o select-review https://select.ax/scripts/select-review.sh
chmod +x select-review
export SELECTAX_API_KEY=sk-sel-...
Run
./select-review path/to/file.py
Check cost first
curl -o select-review-estimate https://select.ax/scripts/select-review-estimate.sh
chmod +x select-review-estimate
./select-review-estimate path/to/file.py
Tool endpoints
| Tool | What it does |
|---|---|
| select-review | Cost estimate only, no charge. |
| select-review-run | Claude review, bills immediately. |
API reference
Select Review is review-only: it returns findings, risks, and recommendations. It does not generate new code, rewrite files, or complete implementation tasks.
Select Review is a tool workflow, not a selectable model in the model catalog. Review requests bypass Smart Select routing and are billed from your existing dollar balance only when you run a confirmed review.
Use the script above, or call the tool endpoint with the Select Review model field shown below.
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer sk-sel-... |
| Content-Type | Yes | application/json |
| x-confirm-review | No | Set to true with select-review to run the paid review after checking the estimate. |
select-review returns an estimate first. Add x-confirm-review: true to run the paid review.
Estimate request
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "select-review",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Estimate response
{
"object": "select.review.estimate",
"review_model": "sonnet",
"estimated_input_tokens": 193,
"estimated_output_tokens": 600,
"estimated_cost": 0.012772,
"balance_current": 9.87,
"balance_after": 9.857228,
"message": "Select Review - Claude. Estimated cost: $0.013 from your balance ($9.87 remaining). Add x-confirm-review: true to proceed.",
"proceed": false
}
Confirmed review request
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-confirm-review: true" \
-d '{
"model": "select-review",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Confirmed review response
{
"object": "select.review.result",
"review_model": "sonnet",
"actual_input_tokens": 193,
"actual_output_tokens": 600,
"actual_cost": 0.012772,
"balance_remaining": 9.857228,
"review": "Markdown review output..."
}
Review response headers
| Header | Description |
|---|---|
| x-model-used | The Select Review tool route used for the request. |
| x-request-id | Request identifier for dashboard and support lookup. |
| x-cost-usd | Dollar cost charged for the confirmed review. |
| x-balance-remaining-usd | Remaining dollar balance after the confirmed review. |
Immediate review request
Use select-review-run when a client cannot send custom headers.
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "select-review-run",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Billing rates
| Tool route | Billing rate |
|---|---|
select-review | Estimate only, no charge until confirmed. $0.004 / 1K input · $0.020 / 1K output when confirmed. |
select-review-run | $0.004 / 1K input · $0.020 / 1K output |
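The estimate in the example above follows directly from these rates:

```python
# $0.004 per 1K input tokens, $0.020 per 1K output tokens.
RATE_IN, RATE_OUT = 0.004, 0.020

def review_cost(input_tokens, output_tokens):
    return input_tokens / 1000 * RATE_IN + output_tokens / 1000 * RATE_OUT

# 193 input + 600 output tokens => ~0.012772, matching estimated_cost above.
print(round(review_cost(193, 600), 6))
```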
Routing headers
These request headers control how Select routes your request. All are optional — omitting them triggers fully automatic routing.
| Header | Values | Description |
|---|---|---|
| x-mode | auto · direct | Routing mode. auto (default) lets Select decide. direct routes to a specific model via x-model. |
| x-model | model slug | Target model slug for direct mode, e.g. kimi-k2.6-tee. Ignored in auto mode. |
| x-latency | low · normal | Latency preference. low biases routing toward faster Tier 1–2 models. |
| x-privacy | tee | Set to tee to restrict routing to TEE-only models. |
| x-tier-max | 1 · 2 · 3 | Cap the maximum model tier used. Useful for cost control. |
Auto mode
In auto mode, Select uses a two-stage routing pipeline:
- LLM classifier — A lightweight Claude call analyses your messages for signals: agentic intent, tool use, code, reasoning depth, latency sensitivity. Completes in under 2 seconds.
- Availability-weighted scorer — Combines capability tier score with live availability data to pick the model least likely to be rate-limited right now.
If the classifier times out, routing falls back to the heuristic scorer alone — no request is dropped.
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-latency: low" \
-H "x-privacy: tee" \
-d '{
"messages": [{ "role": "user", "content": "Fix this bug: ..." }],
"tools": [{ "name": "bash", "description": "Run shell commands" }],
"max_tokens": 2048
}'
Direct mode
Use x-mode: direct with x-model to target a specific model. The availability scorer is bypassed — your request goes to exactly the model you specify.
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-mode: direct" \
-H "x-model: kimi-k2.6-tee" \
-d '{
"messages": [{ "role": "user", "content": "..." }],
"max_tokens": 4096
}'
Response headers
Every response includes metadata headers describing what happened:
| Header | Description |
|---|---|
| x-model-used | Slug of the model that handled the request, e.g. kimi-k2.6-tee. |
| x-tier | Tier of the selected model (1–4). |
| x-score | Routing score that won (0–100). |
| x-cost-micro | Cost of this request in µ$ (micro-dollars). $1 = 1,000,000 µ$. |
| x-cost-usd | Cost in USD as a decimal string, e.g. 0.000312. |
| x-balance-remaining-micro | Remaining balance in µ$ after this request. |
| x-balance-remaining-usd | Remaining balance in USD, e.g. 18.4231. |
| x-classifier-source | llm — LLM classifier ran. heuristic — fell back to scorer. direct — direct mode. |
| x-classifier-confidence | Classifier confidence score (0–1), if available. |
| x-request-id | UUID for this request — include in support queries. |
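Converting between the micro-dollar and USD headers is a fixed division; a minimal sketch:

```python
# $1 = 1,000,000 micro-dollars, so x-cost-micro: 312 is $0.000312.
def micro_to_usd(micro):
    return f"{micro / 1_000_000:.6f}"

print(micro_to_usd(312))  # 0.000312
```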
Streaming
Set "stream": true in your request body to receive a Server-Sent Events stream. Each event follows the standard SSE format with a data: prefix. The stream terminates with data: [DONE].
import requests, json

resp = requests.post(
    "https://api.select.ax/v1/messages",
    headers={
        "Authorization": "Bearer sk-sel-<your-key>",
        "Content-Type": "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "Write a sorting algorithm."}],
        "max_tokens": 2048,
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if line and line.startswith(b"data: "):
        data = line[6:]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        print(chunk, flush=True)
Errors
All errors return JSON with an error field describing the issue.
| Status | Meaning |
|---|---|
| 400 | Bad request — invalid JSON body or missing messages array. |
| 401 | Unauthorised — missing, malformed, or inactive API key. |
| 402 | Payment required — insufficient balance. Top up at select.ax/pricing. |
| 403 | Forbidden — restricted models require an enterprise key. |
| 500 | Internal server error — routing or provider failure. |
| 502/503 | Provider error — upstream model returned an error. Retry after a short delay. |
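The 502/503 retry guidance can be sketched as a small wrapper; with_retry and its callback shape are illustrative, not part of any SDK:

```python
import time

RETRYABLE = {502, 503}   # upstream provider errors worth retrying

def with_retry(send, attempts=3, delay=1.0):
    # send() performs one request and returns (status_code, body).
    for attempt in range(attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(delay * (attempt + 1))   # back off a little longer each time
    return status, body
```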
Model catalog
The active model catalog is exposed through GET /v1/models. Provider names are shown as Select Network in customer-facing surfaces.
| Model | Slug | Tier | Context | Input /1M | Output /1M | TEE |
|---|---|---|---|---|---|---|
| MiniMax M2.5 TEE | minimax-m2.5-tee | T1 | 256K | $0.171 | $1.368 | ✓ |
| GLM 5 Turbo | glm-5-turbo | T1 | 256K | $0.5575 | $2.2304 | ✓ |
| Qwen3 Coder Next TEE | qwen3-coder-next-tee | T1 | 256K | $0.1368 | $0.855 | ✓ |
| DeepSeek V4 Flash | deepseek-v4-flash | T1 | 1M | $0.1596 | $0.3192 | — |
| Kimi K2.5 TEE | kimi-k2.5-tee | T2 | 256K | $0.5016 | $2.28 | ✓ |
| Qwen3.5 397B A17B TEE | qwen3.5-397b-tee | T2 | 256K | $0.4446 | $2.6676 | ✓ |
| Qwen3.6 27B TEE | qwen3.6-27b-tee | T2 | 262K | $0.57 | $2.28 | ✓ |
| DeepSeek V4 Pro | deepseek-v4-pro | T2 | 1M | $0.522 | $1.044 | — |
| Kimi K2.6 TEE | kimi-k2.6-tee | T3 | 256K | $1.083 | $4.56 | ✓ |
| GLM 5.1 TEE | glm-5.1-tee | T3 | 256K | $1.197 | $3.99 | ✓ |
| Kimi K2.6 Official | kimi-k2.6-official | T3 | 256K | $1.083 | $4.56 | — |
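As an illustration of reading the catalog, here is a filter over a few rows from the table above. The actual GET /v1/models response shape is not specified in this document, so the dicts below are hand-encoded:

```python
catalog = [
    {"slug": "minimax-m2.5-tee",     "tier": 1, "input_per_1m": 0.171,  "tee": True},
    {"slug": "qwen3-coder-next-tee", "tier": 1, "input_per_1m": 0.1368, "tee": True},
    {"slug": "deepseek-v4-flash",    "tier": 1, "input_per_1m": 0.1596, "tee": False},
]

# Cheapest TEE-capable Tier 1 model by input price.
tee_t1 = [m for m in catalog if m["tee"] and m["tier"] == 1]
cheapest = min(tee_t1, key=lambda m: m["input_per_1m"])
print(cheapest["slug"])  # qwen3-coder-next-tee
```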
Tier system
Models are grouped into four capability tiers. Auto routing uses tiers as a primary signal alongside live utilisation.
| Tier | Characteristics | Best for |
|---|---|---|
| T1 | Balanced — fast, cost-efficient | Summarisation, classification, high-throughput pipelines |
| T2 | Frontier agentic — multi-step, tool-capable | Agentic workflows, tool use, reasoning chains |
| T3 | Cutting edge — SWE-bench leaders | Complex coding, deep reasoning, long-context tasks |
| T4 | Enterprise — restricted premium models | Enterprise accounts only |