Skip to content

Fireworks AI

Everruns runs agents on Fireworks AI through its OpenAI-compatible Chat Completions API. Fireworks serves frontier open models — Llama, Qwen, DeepSeek, Kimi, GLM, gpt-oss, and more — on a fast, cost-efficient inference platform, so the same Everruns agent, prompt, and capabilities run unchanged on open weights.

  • One key, many open models — a single provider exposing Fireworks’ serverless model catalog.
  • Automatic model discovery — Fireworks’ /models endpoint advertises rich metadata (chat, tool calling, image input, context window), which Everruns parses into capability profiles on sync, so tool and vision support surface correctly per model.
  • Full chat capabilities — streaming, tool/function calling, vision, and structured output, through the same uniform driver as every other provider.
  • Host-gated discovery — model sync runs only against Fireworks’ own host, so a custom proxy base URL is never probed.
  1. Go to SettingsProviders and click Add provider.
  2. Choose Fireworks AI.
  3. Paste your Fireworks API key. Create one from the Fireworks API keys page.
  4. Save. Everruns discovers available models and their capability profiles automatically.

You can optionally set a base URL to route through a proxy; leave it blank to use Fireworks’ hosted API (https://api.fireworks.ai/inference/v1).

Fireworks model ids are namespaced, for example accounts/fireworks/models/llama-v3p1-70b-instruct. After a sync, models appear in the agent and session model pickers with their discovered capabilities. Only chat models are imported — image and other non-chat endpoints are filtered out.