List all models across all providers
GET /v1/llm-models
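A minimal request sketch for this endpoint. The base URL and the bearer-token `Authorization` header are assumptions for illustration; substitute your deployment's host and credentials.

```python
import urllib.request
import json

# BASE_URL is an assumption; replace with your deployment's host.
BASE_URL = "https://api.example.com"

# Build a GET request for the list-models endpoint.
req = urllib.request.Request(
    f"{BASE_URL}/v1/llm-models",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    method="GET",
)

# resp = urllib.request.urlopen(req)          # uncomment to send
# models = json.loads(resp.read())["data"]    # list endpoints wrap items in "data"
```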
Parameters
Query Parameters
Responses
List of all models
Response wrapper for list endpoints.
All list endpoints return responses wrapped in a data field.
object
Array of items returned by the list operation.
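A sketch of the wrapped list response described above. Only the `data` wrapper is specified by this page; the item fields shown are illustrative stand-ins for the model schema below.

```python
# Every list endpoint wraps its items in a top-level "data" field.
response = {
    "data": [
        {
            "id": "model_01933b5a00007000800000000000001",
            "display_name": "GPT-4o",
            "model_family": "gpt-4o",
        }
    ]
}

models = response["data"]  # unwrap to get the array of items
```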
LLM Model with provider info
object
Example
model_01933b5a00007000800000000000001
Whether this model is installed (available in UI model pickers)
Readonly profile with model capabilities (not persisted to database)
object
Whether the model supports file/image attachments
Cost per million tokens
object
Cached read cost per million tokens (USD), if supported
Tiered pricing that applies above certain context thresholds. When present, the base cost fields apply up to the tier threshold, and each tier’s costs apply for tokens beyond that threshold.
A pricing tier that activates above a context token threshold. For example, OpenAI charges higher rates for prompts exceeding 200K tokens.
object
Context token threshold above which this tier applies
Cached read cost per million tokens (USD) for this tier, if supported
Input cost per million tokens (USD) for this tier
Output cost per million tokens (USD) for this tier
Input cost per million tokens (USD)
Output cost per million tokens (USD)
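The tiered pricing rule above (base rates up to the tier threshold, tier rates for tokens beyond it) can be sketched as follows. Field names like `context_threshold` and `input_cost` are illustrative, not the actual schema.

```python
def prompt_cost_usd(input_tokens, base_input_cost, tiers):
    """Input cost in USD from cost-per-million-token rates.

    The base rate applies up to the lowest tier threshold; each
    tier's rate applies to tokens beyond its threshold.
    """
    tiers = sorted(tiers, key=lambda t: t["context_threshold"])
    cost = 0.0
    prev_threshold = 0
    prev_rate = base_input_cost
    for tier in tiers:
        # Tokens billed at the rate in effect below this tier's threshold.
        span = min(input_tokens, tier["context_threshold"]) - prev_threshold
        if span <= 0:
            break
        cost += span * prev_rate / 1_000_000
        prev_threshold = tier["context_threshold"]
        prev_rate = tier["input_cost"]
    # Remaining tokens billed at the highest applicable tier rate.
    cost += max(0, input_tokens - prev_threshold) * prev_rate / 1_000_000
    return cost
```

For example, with a base input rate of $2.50/M tokens and a single tier charging $5.00/M above 200K tokens, a 300K-token prompt costs 200K tokens at the base rate plus 100K tokens at the tier rate.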
Short human-readable description of the model’s strengths and intended use
Model family (e.g., “gpt-4o”, “claude-3-5-sonnet”)
Knowledge cutoff date (YYYY-MM-DD format)
Last updated date (YYYY-MM-DD format)
Token limits
object
Maximum context window size in tokens
Maximum input tokens (if different from context window minus maximum output tokens)
Maximum images or PDF pages per request
Maximum output tokens
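The input budget implied by these limits can be sketched as below: use the explicit input limit when present, otherwise derive it from the context window and output limit. Field names are assumptions for illustration.

```python
def max_input_budget(limits):
    """Effective input-token budget from a token-limits object.

    Prefers an explicit max_input_tokens; otherwise falls back to
    context window minus the output reservation.
    """
    if limits.get("max_input_tokens") is not None:
        return limits["max_input_tokens"]
    return limits["max_context_tokens"] - limits["max_output_tokens"]
```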
Display name of the model
Whether the model has open weights
Whether the model has reasoning/chain-of-thought capabilities
Reasoning effort configuration (for reasoning models)
object
Default reasoning effort for this model
Available reasoning effort values for this model
Named reasoning effort value for UI display
object
Display name (e.g., “Low”, “Medium”)
The API value (e.g., “low”, “medium”)
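A sketch of the reasoning-effort configuration described above: a default value plus named options pairing a UI display name with the API value. Field names here are assumptions, not the actual schema.

```python
# Illustrative reasoning-effort config for a reasoning model.
reasoning_config = {
    "default_effort": "medium",
    "options": [
        {"display_name": "Low", "value": "low"},
        {"display_name": "Medium", "value": "medium"},
    ],
}

# Map API values to display names for a UI picker.
labels = {o["value"]: o["display_name"] for o in reasoning_config["options"]}
```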
Release date (YYYY-MM-DD format)
Whether the model supports structured output (JSON mode)
Whether the model supports native execution phases (“commentary” / “final_answer”).
When true, the driver sends the phase field on assistant messages in the wire format.
Currently supported by GPT-5.4+ via OpenAI Responses API.
Whether temperature control is supported
Whether the model supports tool/function calling
Whether the model supports tool_search (deferred tool loading). When true, the driver can use namespaces and defer_loading to reduce token usage for large tool sets. Currently supported by GPT-5.4+.
Example
provider_01933b5a00007000800000000000001
LLM provider type
How the model was added to the system
LLM model status