Tool Search
| ID | tool_search |
| Category | Optimization |
| Features | None |
| Dependencies | None |
Enables deferred tool loading for agents with many tools, on any model. Instead of sending full parameter schemas for every tool upfront, only tool names and descriptions are sent initially. The model loads full schemas on demand by calling the tool_search tool.
Unlike OpenAI Tool Search, which relies on OpenAI’s hosted tool_search feature, this capability implements tool search entirely client-side. It therefore works with Anthropic, Gemini, OpenAI Completions, and any other provider — not just GPT-5.4+.
tool_search: search the available tools by keyword and load their full parameter schemas.
How It Works
- Threshold check: deferral only activates when the total tool count meets or exceeds the threshold (default: 15). Below it, full schemas are sent unchanged.
- Schema stripping: a tool-definition hook replaces the parameter schema of every deferrable tool with a small stub (the name and description survive). This runs when the runtime agent is built, so the model never receives the full schemas upfront.
- tool_search tool: a real tool is added to the agent. When the model calls it with a query, the tool inspects its sibling tools and returns the full JSON parameter schemas of the matches.
- System-prompt guidance: a short note instructs the model to call tool_search before using a tool whose parameters it has not yet loaded.
- Transparent execution: the underlying tools stay registered and executable. Tool calls and results work identically; only how schemas reach the model changes.
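The threshold check and schema-stripping steps above can be sketched as follows. This is a minimal illustration, not the actual hook: the `strip_schemas` function, the `STUB_SCHEMA` shape, and the dictionary layout of a tool definition are all assumptions made for the example, and the sketch ignores per-tool deferrable policies.

```python
from copy import deepcopy

DEFAULT_TOOL_SEARCH_THRESHOLD = 15  # default named in this document

# Hypothetical stub; the real stub's shape is not specified in this document.
STUB_SCHEMA = {"type": "object", "properties": {}}


def strip_schemas(tools, threshold=DEFAULT_TOOL_SEARCH_THRESHOLD):
    """Return tool definitions as they would be sent to the model.

    Below the threshold the definitions are returned unchanged. At or above
    it, each tool keeps its name and description but gets the stub schema.
    """
    if len(tools) < threshold:
        return deepcopy(tools)
    return [
        {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": deepcopy(STUB_SCHEMA),
        }
        for tool in tools
    ]


def make_tool(i):
    """Build a hypothetical tool definition with a small parameter schema."""
    return {
        "name": f"tool_{i}",
        "description": f"Example tool number {i}",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }


small = [make_tool(i) for i in range(14)]  # below the threshold
large = [make_tool(i) for i in range(15)]  # meets the threshold

sent_small = strip_schemas(small)  # full schemas, unchanged
sent_large = strip_schemas(large)  # names and descriptions with stub schemas
```

The sketch returns new definitions and leaves the input list untouched, which mirrors the point that the underlying tools stay registered and executable with their full schemas.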
DeferrablePolicy
Each tool has a deferrable policy that controls whether its schema can be deferred:
| Policy | Behavior |
|---|---|
| never | Full schema always sent (use for high-frequency tools like write_todos) |
| automatic | Deferred when tool_search is active and the tool count is at or above the threshold (default) |
| always | Always deferred when tool_search is active |
The tool_search tool itself is never deferred.
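One way to read the policy table is as a small decision function. The sketch below is an interpretation, not the library's code: the `should_defer` function and its parameters are hypothetical, and it assumes that "active" means the capability is enabled, so that `always` defers regardless of the threshold.

```python
from enum import Enum


class DeferrablePolicy(Enum):
    NEVER = "never"
    AUTOMATIC = "automatic"
    ALWAYS = "always"


def should_defer(name, policy, search_active, tool_count, threshold=15):
    """Decide whether a tool's schema is deferred (hypothetical helper)."""
    if name == "tool_search":
        return False  # the search tool itself is never deferred
    if not search_active or policy is DeferrablePolicy.NEVER:
        return False
    if policy is DeferrablePolicy.ALWAYS:
        return True  # assumption: deferred even below the threshold
    # AUTOMATIC: defer only once the tool count meets or exceeds the threshold
    return tool_count >= threshold
```

Under this reading, a frequently used tool such as write_todos marked `never` keeps its full schema even in a 50-tool agent, while an `automatic` tool in the same agent is deferred.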
Model Support
None required. Because deferral and search are implemented client-side, every model works the same way. For GPT-5.4+ you may prefer OpenAI Tool Search, which uses the provider’s hosted index; use this capability for all other models.
Configuration
{ "capabilities": ["tool_search"] }

The activation threshold defaults to 15 tools (DEFAULT_TOOL_SEARCH_THRESHOLD).
Limitations
- Server-executed tools: the search reads schemas from the worker-side tool registry. This includes built-in tools and MCP server tools (MCP tools are registered as first-class registry tools). Client-side tools that are not registered worker-side are not returned by tool_search (their stripped definition is still sent so the model knows they exist).
- Extra round-trip: loading a schema costs one tool_search call before the first use of a deferred tool. The token savings outweigh this for agents with many tools.
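The first limitation can be illustrated with a toy search over a registry. Everything here is assumed for the example: the registry layout, the tool names, and the matching rule (a case-insensitive substring match on name and description). The real matching algorithm is not described in this document.

```python
def tool_search(query, registry):
    """Return full parameter schemas for registry tools matching the query.

    Hypothetical matching rule: a tool matches if any query term appears in
    its name or description (case-insensitive).
    """
    terms = query.lower().split()
    matches = {}
    for name, tool in registry.items():
        haystack = f"{name} {tool['description']}".lower()
        if any(term in haystack for term in terms):
            matches[name] = tool["parameters"]
    return matches


# Hypothetical worker-side registry (built-in and MCP tools would live here).
registry = {
    "read_file": {
        "description": "Read a file from disk",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    "send_email": {
        "description": "Send an email message",
        "parameters": {
            "type": "object",
            "properties": {"to": {"type": "string"}},
            "required": ["to"],
        },
    },
}

# A client-side tool the model knows about but that is absent from the registry.
client_side_tools = {"open_browser_tab"}

found = tool_search("file", registry)       # schema for read_file only
missing = tool_search("browser", registry)  # empty: not registered worker-side
```

Because the search only sees `registry`, the query for the client-side tool returns nothing, even though its stripped definition would still have been sent to the model.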
See Also
- OpenAI Tool Search: hosted deferred loading for GPT-5.4+
- Capabilities Overview