API Reference
Select automatically routes your requests to the best available frontier model, based on workload type, capability tier, and live utilisation: one endpoint across the active Select model catalog.
Overview
Select exposes inference endpoints under https://api.select.ax/v1. Requests are routed in real time to the best available model. You can influence routing via request headers — or let Select decide automatically.
Two routing modes are available:
- Auto mode (default) — Select classifies your request and picks the optimal model by tier, live load, and workload type.
- Direct mode — You specify exactly which model to use via
x-model.
Some models are TEE-enabled. The model catalog shows which models support TEE.
Authentication
All requests require a Bearer token in the Authorization header. API keys are issued after purchase at select.ax/pricing and follow the format sk-sel-….
Authorization: Bearer sk-sel-<your-api-key>
Quick start
Send your first request in under a minute:
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "user", "content": "Summarise the key steps of a RAG pipeline." }
],
"max_tokens": 1024
}'
Select will automatically classify the request and route it to the best available model for the task.
You can also finish any source file from the terminal with Select Finish: select-finish path/to/file.py
POST /v1/messages
The primary inference endpoint. Accepts a messages payload and returns a completion.
Routes to the best available model. Model selection is controlled by routing headers (see below).
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | Required | Array of message objects with role and content fields. |
| max_tokens | integer | Optional | Maximum tokens to generate. Defaults to 4096. |
| stream | boolean | Optional | Set to true to stream the response as SSE. |
| temperature | number | Optional | Sampling temperature. Passed through to the model. |
| top_p | number | Optional | Nucleus sampling threshold. |
| tools | array | Optional | Tool definitions. Presence influences routing toward tool-capable models. |
| tool_choice | object | Optional | Tool choice control. Passed through to the model. |
| response_format | object | Optional | Structured output format. Presence scores toward structured-output models. |
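A hypothetical request body exercising the optional parameters above. Only `messages` is required; the `response_format` shape shown is an assumption, not confirmed by this reference:

```python
payload = {
    "messages": [{"role": "user", "content": "Extract the total as JSON."}],
    "max_tokens": 512,                            # defaults to 4096 if omitted
    "temperature": 0.2,                           # passed through to the model
    "response_format": {"type": "json_object"},   # assumed shape; biases routing
    "stream": False,
}
```

Note that the mere presence of tools or response_format is itself a routing signal, so include them only when you need them.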
POST /v1/chat/completions
An OpenAI-compatible alias that delegates to /v1/messages. Use this as a drop-in replacement if your SDK or framework targets the OpenAI API format.
Identical behaviour to /v1/messages. All routing headers apply.
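As a sketch, any OpenAI-format client can be pointed at this alias. The helper below builds a request dict for Python's requests library; the helper name and the routing dict are illustrative, not part of the API:

```python
# Build an OpenAI-format request against the alias endpoint. Because the
# alias delegates to /v1/messages, routing headers (x-latency, x-mode,
# and so on) work here exactly as on the primary endpoint.
def openai_request(prompt, api_key, routing=None):
    return {
        "url": "https://api.select.ax/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            **(routing or {}),                 # e.g. {"x-latency": "low"}
        },
        "json": {
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 1024,
        },
    }
```

Pass the result straight through, e.g. requests.post(**openai_request("Hello", "sk-sel-<your-key>")).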
Select Finish
Get Claude to finish one source file from your terminal. The script shows the summary, suggested tests, cost, and remaining balance, then asks before applying the finished file.
Install
curl -o select-finish https://select.ax/scripts/select-finish.sh
chmod +x select-finish
export SELECTAX_API_KEY=sk-sel-...
Run
./select-finish path/to/file.py
./select-finish path/to/file.py --test "python3 path/to/file.py"
./select-finish path/to/file.py --yes
The approval prompt is skipped when --yes is used. Single-file limit: approximately 150k input tokens.
What it does
| Step | Behaviour |
|---|---|
| Send | Uploads one source file to Select Finish. |
| Finish | Claude returns a finished version of the file plus a plain-English summary and suggested tests. |
| Approve | The script shows the cost and asks before changing anything. |
| Apply | The original file is backed up first, then replaced with the finished version. |
| Check | Python and JavaScript files get a basic syntax check automatically. Use --test for your own command. |
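The Apply step above can be sketched in a few lines, assuming a parsed result object with the fixed_code field described in the API note; apply_finish is an illustrative helper, not part of the script:

```python
import shutil
from pathlib import Path

def apply_finish(result, path):
    # Back up the original first, then replace it with the finished file.
    src = Path(path)
    backup = Path(str(src) + ".bak")
    shutil.copy2(src, backup)
    src.write_text(result["fixed_code"])
    return backup
```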
API note
The script uses select-finish-run, which returns Select Finish JSON containing summary, tests, and fixed_code. It is intended for the script workflow, not as a normal chat model in OpenAI-compatible clients.
{
"object": "select.finish.result",
"actual_cost": 0.248028,
"balance_remaining": 3.714298,
"summary": "Fixed required production-readiness issues.",
"tests": ["python3 -m py_compile path/to/file.py"],
"fixed_code": "..."
}
Select Review
Use Select Review when you want a read-only audit instead of an applied fix. It returns findings, risks, patch guidance, and a test plan, but it does not rewrite files.
Install
curl -o select-review https://select.ax/scripts/select-review.sh
chmod +x select-review
export SELECTAX_API_KEY=sk-sel-...
Run
./select-review path/to/file.py
Check cost first
curl -o select-review-estimate https://select.ax/scripts/select-review-estimate.sh
chmod +x select-review-estimate
./select-review-estimate path/to/file.py
Tool endpoints
| Tool | What it does |
|---|---|
| select-review | Cost estimate only, no charge. |
| select-review-run | Claude review, bills immediately. |
API reference
Select Review is review-only: it returns findings, risks, and recommendations. It does not generate new code, rewrite files, or complete implementation tasks.
Select Review is a tool workflow, not a selectable model in the model catalog. Review requests bypass Smart Select routing and are billed from your existing dollar balance only when you run a confirmed review.
Use the script above, or call the tool endpoint with the Select Review model field shown below.
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer sk-sel-... |
| Content-Type | Yes | application/json |
| x-confirm-review | No | Set to true with select-review to run the paid review after checking the estimate. |
select-review returns an estimate first. Add x-confirm-review: true to run the paid review.
Estimate request
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "select-review",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Estimate response
{
"object": "select.review.estimate",
"review_model": "sonnet",
"estimated_input_tokens": 193,
"estimated_output_tokens": 600,
"estimated_cost": 0.012772,
"balance_current": 9.87,
"balance_after": 9.857228,
"message": "Select Review - Claude. Estimated cost: $0.013 from your balance ($9.87 remaining). Add x-confirm-review: true to proceed.",
"proceed": false
}
Confirmed review request
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-confirm-review: true" \
-d '{
"model": "select-review",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Confirmed review response
{
"object": "select.review.result",
"review_model": "sonnet",
"actual_input_tokens": 193,
"actual_output_tokens": 600,
"actual_cost": 0.012772,
"balance_remaining": 9.857228,
"review": "Markdown review output..."
}
Review response headers
| Header | Description |
|---|---|
| x-model-used | The Select Review tool route used for the request. |
| x-request-id | Request identifier for dashboard and support lookup. |
| x-cost-usd | Dollar cost charged for the confirmed review. |
| x-balance-remaining-usd | Remaining dollar balance after the confirmed review. |
Immediate review request
Use select-review-run when a client cannot send custom headers.
curl -X POST https://api.select.ax/v1/chat/completions \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-d '{
"model": "select-review-run",
"messages": [
{
"role": "user",
"content": "function total(items) { let sum = 0; for (const i in items) { if (items[i].price) sum += items[i].price; } return sum.toFixed(2); }"
}
]
}'
Billing rates
| Tool route | Billing rate |
|---|---|
select-review | Estimate only, no charge until confirmed. $0.004 / 1K input · $0.020 / 1K output when confirmed. |
select-review-run | $0.004 / 1K input · $0.020 / 1K output |
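The estimate in the example above follows directly from these rates:

```python
# $0.004 per 1K input tokens, $0.020 per 1K output tokens.
RATE_IN, RATE_OUT = 0.004, 0.020

def review_cost(input_tokens, output_tokens):
    return input_tokens / 1000 * RATE_IN + output_tokens / 1000 * RATE_OUT

# 193 input + 600 output tokens => ~0.012772, matching estimated_cost above.
print(round(review_cost(193, 600), 6))
```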
Routing headers
These request headers control how Select routes your request. All are optional — omitting them triggers fully automatic routing.
| Header | Values | Description |
|---|---|---|
| x-mode | auto · direct | Routing mode. auto (default) lets Select decide. direct routes to a specific model via x-model. |
| x-model | model slug | Target model slug for direct mode, e.g. kimi-k2.6-tee. Ignored in auto mode. |
| x-latency | low · normal | Latency preference. low biases routing toward faster Tier 1–2 models. |
| x-privacy | tee | Set to tee to restrict routing to TEE-only models. |
| x-tier-max | 1 · 2 · 3 | Cap the maximum model tier used. Useful for cost control. |
Auto mode
In auto mode, Select uses a two-stage routing pipeline:
- LLM classifier — A lightweight Claude call analyses your messages for signals: agentic intent, tool use, code, reasoning depth, latency sensitivity. Completes in under 2 seconds.
- Availability-weighted scorer — Combines capability tier score with live availability data to pick the model least likely to be rate-limited right now.
If the classifier times out, routing falls back to the heuristic scorer alone — no request is dropped.
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-latency: low" \
-H "x-privacy: tee" \
-d '{
"messages": [{ "role": "user", "content": "Fix this bug: ..." }],
"tools": [{ "name": "bash", "description": "Run shell commands" }],
"max_tokens": 2048
}'
Direct mode
Use x-mode: direct with x-model to target a specific model. The availability scorer is bypassed — your request goes to exactly the model you specify.
curl -X POST https://api.select.ax/v1/messages \
-H "Authorization: Bearer sk-sel-<your-key>" \
-H "Content-Type: application/json" \
-H "x-mode: direct" \
-H "x-model: kimi-k2.6-tee" \
-d '{
"messages": [{ "role": "user", "content": "..." }],
"max_tokens": 4096
}'
Response headers
Every response includes metadata headers describing what happened:
| Header | Description |
|---|---|
| x-model-used | Slug of the model that handled the request, e.g. kimi-k2.6-tee. |
| x-tier | Tier of the selected model (1–4). |
| x-score | Routing score that won (0–100). |
| x-cost-micro | Cost of this request in µ$ (micro-dollars). $1 = 1,000,000 µ$. |
| x-cost-usd | Cost in USD as a decimal string, e.g. 0.000312. |
| x-balance-remaining-micro | Remaining balance in µ$ after this request. |
| x-balance-remaining-usd | Remaining balance in USD, e.g. 18.4231. |
| x-classifier-source | llm — LLM classifier ran. heuristic — fell back to scorer. direct — direct mode. |
| x-classifier-confidence | Classifier confidence score (0–1), if available. |
| x-request-id | UUID for this request — include in support queries. |
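Converting between the micro-dollar and USD headers is a fixed division; a minimal sketch:

```python
# $1 = 1,000,000 micro-dollars, so x-cost-micro: 312 is $0.000312.
def micro_to_usd(micro):
    return f"{micro / 1_000_000:.6f}"

print(micro_to_usd(312))  # 0.000312
```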
Streaming
Set "stream": true in your request body to receive a Server-Sent Events stream. Each event follows the standard SSE format with a data: prefix. The stream terminates with data: [DONE].
import requests, json

resp = requests.post(
    "https://api.select.ax/v1/messages",
    headers={
        "Authorization": "Bearer sk-sel-<your-key>",
        "Content-Type": "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "Write a sorting algorithm."}],
        "max_tokens": 2048,
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if line and line.startswith(b"data: "):
        data = line[6:]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        print(chunk, flush=True)
Errors
All errors return JSON with an error field describing the issue.
| Status | Meaning |
|---|---|
| 400 | Bad request — invalid JSON body or missing messages array. |
| 401 | Unauthorised — missing, malformed, or inactive API key. |
| 402 | Payment required — insufficient balance. Top up at select.ax/pricing. |
| 403 | Forbidden — restricted models require an enterprise key. |
| 500 | Internal server error — routing or provider failure. |
| 502/503 | Provider error — upstream model returned an error. Retry after a short delay. |
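The 502/503 retry guidance can be sketched as a small wrapper; with_retry and its callback shape are illustrative, not part of any SDK:

```python
import time

RETRYABLE = {502, 503}   # upstream provider errors worth retrying

def with_retry(send, attempts=3, delay=1.0):
    # send() performs one request and returns (status_code, body).
    for attempt in range(attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(delay * (attempt + 1))   # back off a little longer each time
    return status, body
```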
Model catalog
The active model catalog is exposed through GET /v1/models. Provider names are shown as Select Network in customer-facing surfaces.
| Model | Slug | Tier | Context | Input /1M | Output /1M | TEE |
|---|---|---|---|---|---|---|
| MiniMax M2.5 TEE | minimax-m2.5-tee | T1 | 256K | $0.171 | $1.368 | ✓ |
| GLM 5 Turbo | glm-5-turbo | T1 | 256K | $0.5575 | $2.2304 | ✓ |
| Qwen3 Coder Next TEE | qwen3-coder-next-tee | T1 | 256K | $0.1368 | $0.855 | ✓ |
| DeepSeek V4 Flash | deepseek-v4-flash | T1 | 1M | $0.1596 | $0.3192 | — |
| Kimi K2.5 TEE | kimi-k2.5-tee | T2 | 256K | $0.5016 | $2.28 | ✓ |
| Qwen3.5 397B A17B TEE | qwen3.5-397b-tee | T2 | 256K | $0.4446 | $2.6676 | ✓ |
| Qwen3.6 27B TEE | qwen3.6-27b-tee | T2 | 262K | $0.57 | $2.28 | ✓ |
| DeepSeek V4 Pro | deepseek-v4-pro | T2 | 1M | $0.522 | $1.044 | — |
| Kimi K2.6 TEE | kimi-k2.6-tee | T3 | 256K | $1.083 | $4.56 | ✓ |
| GLM 5.1 TEE | glm-5.1-tee | T3 | 256K | $1.197 | $3.99 | ✓ |
| Kimi K2.6 Official | kimi-k2.6-official | T3 | 256K | $1.083 | $4.56 | — |
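As an illustration of reading the catalog, here is a filter over a few rows from the table above. The actual GET /v1/models response shape is not specified in this document, so the dicts below are hand-encoded:

```python
catalog = [
    {"slug": "minimax-m2.5-tee",     "tier": 1, "input_per_1m": 0.171,  "tee": True},
    {"slug": "qwen3-coder-next-tee", "tier": 1, "input_per_1m": 0.1368, "tee": True},
    {"slug": "deepseek-v4-flash",    "tier": 1, "input_per_1m": 0.1596, "tee": False},
]

# Cheapest TEE-capable Tier 1 model by input price.
tee_t1 = [m for m in catalog if m["tee"] and m["tier"] == 1]
cheapest = min(tee_t1, key=lambda m: m["input_per_1m"])
print(cheapest["slug"])  # qwen3-coder-next-tee
```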
Tier system
Models are grouped into four capability tiers. Auto routing uses tiers as a primary signal alongside live utilisation.
| Tier | Characteristics | Best for |
|---|---|---|
| T1 | Balanced — fast, cost-efficient | Summarisation, classification, high-throughput pipelines |
| T2 | Frontier agentic — multi-step, tool-capable | Agentic workflows, tool use, reasoning chains |
| T3 | Cutting edge — SWE-bench leaders | Complex coding, deep reasoning, long-context tasks |
| T4 | Enterprise — restricted premium models | Enterprise accounts only |