Parallel Tool Calls
| ID | parallel_tool_calls |
| Category | Optimization |
| Features | None |
| Dependencies | None |
| Risk | Low |
Controls the agent’s request-level preference for parallel (multiple per turn) tool calls, and whether the local tool scheduler runs a batch concurrently.
Most providers emit several tool calls in a single turn by default, and Everruns runs independent tool calls concurrently. This capability makes that behavior explicit and configurable: turn it up to actively request batching of independent reads and searches, or turn it down to force strictly one tool call at a time.
None — this capability only configures the outbound LLM request and the local tool scheduler.
How It Works
The capability resolves a mode into a request-level preference that threads
through two places:
- The LLM request. On providers that expose a wire control, the preference
  is sent on the request:
  - OpenAI (Chat Completions and Responses), and the OpenAI-compatible
    MAI and Fireworks providers: the top-level `parallel_tool_calls` boolean.
  - OpenRouter: forwarded on the Responses body; ignored by routed providers that do not support it.
  - Anthropic: `tool_choice.disable_parallel_tool_use` (sent only when the request carries tools).
  - Gemini and Bedrock have no equivalent request control, so nothing is sent. The local scheduler (below) still honors the preference.
- The local tool scheduler. `avoid` forces the scheduler to run the turn’s tool calls strictly sequentially. This applies to every provider, so `avoid` is honored even where there is no wire control.
Whether the preference is sent on the wire is gated per provider/model: a driver that cannot express it omits the field rather than risking an API error.
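The per-provider mapping above can be sketched as a small function. This is an illustration only, not Everruns code: the function name `apply_parallel_preference`, the provider label strings, and the choice to default `tool_choice` to `{"type": "auto"}` are assumptions made for the example. Only the field names `parallel_tool_calls` and `tool_choice.disable_parallel_tool_use` come from the text, and OpenRouter is assumed here to use the same field name as OpenAI.

```python
from typing import Any, Optional

# Hypothetical provider labels for families that take a top-level boolean.
# OpenRouter is assumed to use the same field name as OpenAI.
OPENAI_STYLE = {"openai", "mai", "fireworks", "openrouter"}


def apply_parallel_preference(
    provider: str,
    body: dict[str, Any],
    prefer_parallel: Optional[bool],
) -> dict[str, Any]:
    """Return a copy of `body` with the preference applied where expressible.

    prefer_parallel: True = request parallel calls, False = one call per
    turn, None = omit the field and keep the provider default.
    """
    out = dict(body)  # shallow copy so the caller's dict is not mutated
    if prefer_parallel is None:
        return out
    if provider in OPENAI_STYLE:
        out["parallel_tool_calls"] = prefer_parallel
    elif provider == "anthropic":
        # Per the text, only sent when the request carries tools.
        if out.get("tools"):
            choice = dict(out.get("tool_choice") or {"type": "auto"})
            choice["disable_parallel_tool_use"] = not prefer_parallel
            out["tool_choice"] = choice
    # Any other provider (e.g. Gemini, Bedrock): no wire control, send nothing.
    return out
```

For a provider without a wire control the body comes back unchanged, which mirrors the "omit the field rather than risk an API error" behavior described above.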
Config
```json
{
  "capabilities": [
    {
      "ref": "parallel_tool_calls",
      "config": { "mode": "prefer" }
    }
  ]
}
```

| Mode | Provider request | Local scheduler |
|---|---|---|
| `prefer` (default) | Request parallel tool calls where supported | Concurrent (class-aware, the default) |
| `avoid` | Ask for one tool call per turn where supported | Serialized |
| `none` | Omit (provider default) | Concurrent (class-aware, the default) |
When the capability is enabled without an explicit mode, the default is
`prefer`. `none` is equivalent to not enabling the capability; it is useful to
neutralize a preference inherited from a parent harness.
The Generic harness and the built-in coding harnesses enable this
capability with `mode: "prefer"` by default.
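As a rough illustration of the mode table, the sketch below resolves a mode into a wire preference plus a scheduling decision, then runs a batch of async tool calls either one at a time or concurrently. All names here (`Resolved`, `resolve_mode`, `run_batch`) are hypothetical, and the concurrent branch is a plain `asyncio.gather`; it does not model the class-aware scheduling the real runtime uses.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional, Sequence


@dataclass(frozen=True)
class Resolved:
    # True = request parallel calls, False = one per turn, None = omit.
    wire_preference: Optional[bool]
    # True = the local scheduler runs the batch strictly sequentially.
    serialize_locally: bool


# One entry per row of the mode table.
MODES = {
    "prefer": Resolved(wire_preference=True, serialize_locally=False),
    "avoid": Resolved(wire_preference=False, serialize_locally=True),
    "none": Resolved(wire_preference=None, serialize_locally=False),
}


def resolve_mode(mode: Optional[str] = None) -> Resolved:
    """Enabled without an explicit mode falls back to "prefer"."""
    return MODES[mode or "prefer"]


async def run_batch(
    calls: Sequence[Callable[[], Awaitable[Any]]],
    serialize: bool,
) -> list[Any]:
    """Run tool calls in order, or concurrently; results keep input order."""
    if serialize:
        return [await call() for call in calls]
    return list(await asyncio.gather(*(call() for call in calls)))
```

A caller would pass `resolve_mode(mode).serialize_locally` as the `serialize` argument, so `avoid` serializes the batch regardless of which provider produced it.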
Precedence
An explicit `parallel_tool_calls` field set directly on a harness, agent, or
session is a lower-level escape hatch and takes precedence over this capability.
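The precedence rule can be pictured as a two-step lookup. The sketch below is hypothetical: it assumes the explicit `parallel_tool_calls` field is an optional boolean, uses an invented function name, and does not model how explicit fields on a harness, agent, and session rank against each other, since the text does not say.

```python
from typing import Optional

# Capability mode -> wire preference (None means "omit, provider default").
_MODE_TO_PREFERENCE = {"prefer": True, "avoid": False, "none": None}


def effective_preference(
    explicit_field: Optional[bool],
    capability_mode: Optional[str],
) -> Optional[bool]:
    """Explicit field wins; otherwise fall back to the capability's mode.

    capability_mode=None stands for "capability not enabled" in this sketch.
    """
    if explicit_field is not None:
        return explicit_field  # lower-level escape hatch takes precedence
    if capability_mode is None:
        return None
    return _MODE_TO_PREFERENCE[capability_mode]
```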
When To Use
- `prefer`: workloads that issue many independent reads or searches per turn benefit from batching (faster turns, fewer round-trips).
- `avoid`: when tool calls must be observed and applied one at a time, or when a model produces lower-quality parallel batches for your workload.
Limitations
- Provider gating. `prefer` only changes the wire request on providers with a control for it (OpenAI/Anthropic families). Elsewhere, providers already parallelize by default, so `prefer` is a no-op on the wire.
- Durable mode. A harness/agent-level `mode` other than the default (`prefer`) is applied with full fidelity in the in-process runtime; in durable worker mode, harness/agent capability config falls back to the default. To override durably, set the mode at the session level or use the explicit `parallel_tool_calls` field. (This matches other config-bearing capabilities.)
See Also
- Capabilities: the extension model this capability plugs into
- Agentic Loop: how the runtime schedules a batch of tool calls