Environment Variables
DEV_MODE
Enable development mode with in-memory storage. No PostgreSQL required.
| Property | Value |
|---|---|
| Required | No |
| Default | false |
Example:

    # Start in dev mode (no database required)
    DEV_MODE=true ./target/debug/everruns-server

    # Or with 1
    DEV_MODE=1 ./target/debug/everruns-server

Notes:
- When enabled, uses in-memory storage instead of PostgreSQL
- All data is lost when the server stops
- gRPC server and worker communication are disabled
- Stale task reclamation is disabled
- Useful for quick local development and testing
- Not suitable for production or multi-instance deployments
Limitations in dev mode:
- No persistence (data is lost on restart)
- No worker support (all execution happens in-process)
- No distributed tracing of worker activities
- Single-instance only
DEPLOYMENT_GRADE
Deployment environment grade. Controls which features and capabilities are available.
| Property | Value |
|---|---|
| Required | No |
| Default | prod (or dev if DEV_MODE=true) |
Valid values:
| Grade | Description |
|---|---|
| dev | Development - all experimental features enabled |
| poc | Proof of concept / demo environment |
| preview | Preview/staging environment |
| prod | Production - only stable features |
Example:

    # Run in development mode with experimental features
    DEPLOYMENT_GRADE=dev ./target/debug/everruns-server

    # Production mode (default)
    DEPLOYMENT_GRADE=prod ./target/debug/everruns-server

Notes:
- If not set, falls back to DEV_MODE: if DEV_MODE=true, uses dev; otherwise uses prod
- Experimental capabilities (e.g., Docker Container) are only available in dev grade
- Experimental seed agents (e.g., Python Coder) are only created in dev grade
- Use dev for local development and testing experimental features
- Use prod for production deployments
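The fallback between DEPLOYMENT_GRADE and DEV_MODE can be sketched as a small shell function. This is only an illustration of the documented rule, not the server's actual code: the function name resolve_deployment_grade is hypothetical, and treating DEV_MODE=1 the same as DEV_MODE=true is an assumption based on the DEV_MODE examples.

```shell
# Hypothetical helper that mirrors the documented fallback rule.
resolve_deployment_grade() {
  # An explicitly set DEPLOYMENT_GRADE always wins.
  if [ -n "${DEPLOYMENT_GRADE:-}" ]; then
    printf '%s\n' "$DEPLOYMENT_GRADE"
    return 0
  fi
  # Otherwise fall back to DEV_MODE (assumption: "1" counts like "true").
  case "${DEV_MODE:-false}" in
    true|1) printf 'dev\n' ;;
    *)      printf 'prod\n' ;;
  esac
}
```

With neither variable set, the function prints prod; with only DEV_MODE=true it prints dev.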
API_PREFIX
Path prefix for REST API routes.
| Property | Value |
|---|---|
| Required | No |
| Default | /api |
Example:

    # Routes at /api/v1/agents
    API_PREFIX=/api

Notes:
- /health, /api-doc/openapi.json, /mcp, /.well-known/*, /oauth/*, and /cli/login-success stay at the server root
- REST API routes including auth (/v1/auth/*) are mounted under this prefix
- OAuth callback URLs use AUTH_BASE_URL, which defaults to PUBLIC_APP_URL plus API_PREFIX when unset
- Override only if you need a non-/api REST prefix behind a reverse proxy or gateway
PUBLIC_APP_URL
Public browser origin for the Everruns app. In single-origin deployments, set this once and the server derives FRONTEND_URL and AUTH_BASE_URL from it.
| Property | Value |
|---|---|
| Required | No |
| Default | http://localhost:9300 |
Example:

    PUBLIC_APP_URL=https://everruns.example.com

Notes:
- FRONTEND_URL defaults to PUBLIC_APP_URL
- AUTH_BASE_URL defaults to PUBLIC_APP_URL plus API_PREFIX (for example, https://everruns.example.com/api)
- Set FRONTEND_URL only when browser redirects must land on a different origin
- Set AUTH_BASE_URL only when OAuth callbacks use a different public API base
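The derivation described in these notes can be sketched with shell parameter defaults. The function name derive_urls is hypothetical and the server computes these values internally; the sketch only restates the documented defaults (http://localhost:9300 and /api) and assumes PUBLIC_APP_URL has no trailing slash.

```shell
# Hypothetical illustration of the documented defaults; not Everruns code.
derive_urls() {
  local public_app_url="${PUBLIC_APP_URL:-http://localhost:9300}"
  local api_prefix="${API_PREFIX:-/api}"
  # Explicit overrides win; otherwise both values derive from PUBLIC_APP_URL.
  local frontend_url="${FRONTEND_URL:-$public_app_url}"
  local auth_base_url="${AUTH_BASE_URL:-${public_app_url}${api_prefix}}"
  printf 'FRONTEND_URL=%s\nAUTH_BASE_URL=%s\n' "$frontend_url" "$auth_base_url"
}
```

Setting only PUBLIC_APP_URL=https://everruns.example.com yields that origin for FRONTEND_URL and https://everruns.example.com/api for AUTH_BASE_URL.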
CORS_ALLOWED_ORIGINS
Comma-separated list of allowed origins for cross-origin requests. Only needed when the UI is served from a different domain than the API.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (CORS disabled) |
Example:

    # Allow requests from a different frontend origin
    CORS_ALLOWED_ORIGINS=https://app.example.com

    # Multiple origins
    CORS_ALLOWED_ORIGINS=https://app.example.com,https://admin.example.com

Notes:
- Not needed for local development (Caddy reverse proxy keeps UI and backend on one origin)
- Not needed in production if using a reverse proxy on the same domain
- If set, credentials are allowed (Access-Control-Allow-Credentials: true)
- Wildcard (*) is not supported when using credentials
HTTP_ADDR
Bind address for the server HTTP API.
| Property | Value |
|---|---|
| Required | No |
| Default | 0.0.0.0:9000 |
Example:

    HTTP_ADDR=0.0.0.0:9000

Notes:
- ADDR is supported as a legacy alias
- Container images already default to 0.0.0.0:9000; most deployments do not need to set this
VALKEY_URL
Connection URL for Valkey (Redis-compatible) used for distributed rate limiting across control-plane instances.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (uses per-instance in-memory rate limiting) |
Example:

    # Local Valkey
    VALKEY_URL=redis://localhost:6379

    # With authentication

    # TLS (managed cloud service)

Notes:
- When not set, rate limiting falls back to in-memory governor (per-instance, no coordination)
- With N instances behind a load balancer, per-instance rate limiting allows N× the intended budget per IP; set VALKEY_URL for coordinated limits
- Accepts redis://, rediss:// (TLS), valkey://, valkeys:// (TLS) schemes
- Fail-open: if Valkey is unreachable, requests are allowed (availability over strictness)
- Only used by control-plane (server); workers don’t need this variable
- Uses sliding-window counters via Lua scripts for atomic rate limit checks
DATABASE_UNPOOLED_URL
Direct PostgreSQL connection URL used only for session-scoped LISTEN/NOTIFY listeners.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (listeners reuse DATABASE_URL if it is a direct connection) |
Example:

    # Query traffic through a pooler, listeners through a direct endpoint

Notes:
- Use this when DATABASE_URL points at Neon-pooler, PgBouncer, or another proxy that does not preserve session-scoped LISTEN/NOTIFY semantics.
- Listener paths include PostgreSQL-backed event wakeups, notification SSE, and PG task notification fallback when NATS is unavailable.
- If DATABASE_URL or DATABASE_UNPOOLED_URL appears to point at a pooled/proxied endpoint, startup now fails fast with guidance to set a direct listener URL.
- Ordinary query traffic still uses DATABASE_URL.
NATS_URL
Connection URL for NATS with JetStream, used for push-based event delivery and task notifications.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (uses PG NOTIFY for task notifications, in-memory broadcast for SSE event delivery) |
Example:

    # Local NATS
    NATS_URL=nats://localhost:4222

    # Cluster
    NATS_URL=nats://nats1:4222,nats://nats2:4222,nats://nats3:4222

Notes:
- When not set, the system behaves exactly as before: all events persist to PG, SSE polls PG, task notifications use PG NOTIFY. Zero behavioral change.
- When set, enables two features:
  - Ephemeral event delivery: delta events (output.message.delta, reason.thinking.delta, tool.output.delta, llm.generation) skip PostgreSQL and flow only through NATS JetStream. SSE streams subscribe to NATS instead of polling PG.
  - Task notifications: task.available.{activity_type} subjects replace PG NOTIFY for push-based worker notification. Lower latency (~1ms vs ~30ms), supports multi-instance deployments.
- When NATS event delivery is active, the server skips the legacy PostgreSQL event listener used only for SSE wakeups.
- NATS JetStream must be enabled on the server (--jetstream flag)
- Fail-graceful: if NATS connection fails at startup, falls back to PG NOTIFY + in-memory delivery with a warning
- Only used by control-plane (server); workers communicate via gRPC and don’t need NATS access
- Default port: 4222 (or PORT_PREFIX22 with PORT_PREFIX)
- just start-all automatically starts NATS and exports NATS_URL if nats-server is installed
LLM Provider API Keys
LLM provider API keys (OpenAI, Anthropic, Gemini) are primarily stored encrypted in the database and managed via the Settings > Providers UI.
| Property | Value |
|---|---|
| Storage | Database (encrypted with AES-256-GCM) |
| Configuration | Settings > Providers UI or /v1/llm-providers API |
| Supported Providers | OpenAI, Anthropic, Google Gemini |
Required for encryption:
The SECRETS_ENCRYPTION_KEY environment variable must be set for the control-plane API to encrypt/decrypt API keys. Workers receive decrypted API keys via gRPC and do not need this variable.

    # Generate a new key
    python3 -c "import os, base64; print('kek-v1:' + base64.b64encode(os.urandom(32)).decode())"

    # Set in environment (control-plane only)
    SECRETS_ENCRYPTION_KEY=kek-v1:your-generated-key-here

Default API Keys (Development Convenience)
For development, you can set default API keys via environment variables on the control-plane only. These are used as fallbacks when providers don’t have keys configured in the database.
| Variable | Description |
|---|---|
| DEFAULT_OPENAI_API_KEY | Fallback API key for OpenAI providers |
| DEFAULT_ANTHROPIC_API_KEY | Fallback API key for Anthropic providers |
| DEFAULT_GEMINI_API_KEY | Fallback API key for Google Gemini providers |
Example:

    # Set in .env or environment (control-plane only)
    DEFAULT_OPENAI_API_KEY=sk-...
    DEFAULT_ANTHROPIC_API_KEY=sk-ant-...
    DEFAULT_GEMINI_API_KEY=AIza...

Notes:
- These variables are only used by the control-plane, not workers
- Workers receive API keys via gRPC from the control-plane
- Database-stored keys always take priority over environment variables
- These are intended for development convenience, not production use
- The just start-all command automatically sets these from OPENAI_API_KEY, ANTHROPIC_API_KEY, and GEMINI_API_KEY if present
- If no API key is configured for a provider, LLM calls will fail and users will see an error message in the chat: “I encountered an error while processing your request. Please try again later.”
System Email Delivery
System email delivery is an internal service used by product and operational flows. It is not an agent capability, public API, or UI setting.
| Variable | Required | Default | Description |
|---|---|---|---|
| EMAIL_PROVIDER | Yes, when sending email in production | unset / disabled | Email provider. Supported values: disabled, resend |
| RESEND_API_KEY | Yes, when EMAIL_PROVIDER=resend | unset | Resend API key |
| RESEND_API_BASE_URL | No | https://api.resend.com | Resend API base URL override for tests or controlled deployments |
Example:

    EMAIL_PROVIDER=resend
    RESEND_API_KEY=re_...

Notes:
- Set these on the control-plane process that performs system email sends.
- The sender is fixed in code as Everruns <[email protected]>.
- The Resend account must have everruns.com verified and enabled for sending.
UI API Proxy Architecture
The UI makes all REST API requests (including SSE) to /api/* paths. The backend serves those routes under /api directly. Root-level backend routes like /oauth/*, /mcp, and /.well-known/* bypass the UI and are proxied straight to the backend.
Local Development:
- Caddy on :9300 routes /api/*, /oauth/*, /mcp, and /.well-known/* to backend at :9301
- Example: /api/v1/agents → http://localhost:9301/api/v1/agents
- Example: /oauth/authorize?... → http://localhost:9301/oauth/authorize?...
- Example: /mcp → http://localhost:9301/mcp
- Example: /.well-known/oauth-authorization-server → http://localhost:9301/.well-known/oauth-authorization-server
- SSE streaming works via flush_interval -1 in Caddy config
- No CORS needed (same-origin through Caddy)
Production:
- Configure your reverse proxy (nginx, Caddy, etc.) to route /api/*, /oauth/*, /mcp, and /.well-known/* to the API server
- Disable response buffering for SSE endpoints
- Example Caddy config: see local/Caddyfile
SSE Streaming Configuration
| Variable | Default | Description |
|---|---|---|
| SSE_REALTIME_CYCLE_SECS | 300 | Connection cycle interval for session event streams (seconds) |
| SSE_MONITORING_CYCLE_SECS | 600 | Connection cycle interval for durable monitoring streams (seconds) |
| SSE_HEARTBEAT_INTERVAL_SECS | 30 | Interval between heartbeat comments on all SSE streams (seconds) |
| SSE_GLOBAL_MAX | 10000 | Maximum total SSE connections across all users |
| SSE_PER_SESSION_MAX | 5 | Maximum SSE connections per session |
| SSE_PER_ORG_MAX | 1000 | Maximum SSE connections per organization |
Notes:
- Heartbeat comments (: heartbeat\n\n) are sent on all SSE streams to detect stale connections
- The heartbeat interval must be less than the SDK read timeout (default: 60s) with safety margin
- Connection cycling prevents stale connections through proxies and load balancers
- When running behind HTTP/1.1 proxies, increase SSE_REALTIME_CYCLE_SECS to reduce reconnection frequency
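The heartbeat constraint above can be checked before deployment with a small shell function. This is a sketch only: heartbeat_ok is a hypothetical helper, and the 10-second margin is an assumed value, since the notes do not state how large the safety margin should be.

```shell
# Hypothetical pre-flight check; not part of Everruns.
# Usage: heartbeat_ok [read_timeout_secs] [margin_secs]
heartbeat_ok() {
  local heartbeat="${SSE_HEARTBEAT_INTERVAL_SECS:-30}"  # documented default
  local read_timeout="${1:-60}"                          # documented SDK default
  local margin="${2:-10}"                                # assumed safety margin
  # The heartbeat must arrive at least `margin` seconds before the client gives up.
  if [ $((heartbeat + margin)) -le "$read_timeout" ]; then
    return 0
  fi
  echo "heartbeat ${heartbeat}s leaves less than ${margin}s before the ${read_timeout}s read timeout" >&2
  return 1
}
```

With the documented defaults (30s heartbeat, 60s read timeout) the check passes; a 55s heartbeat fails it.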
Worker gRPC Configuration
SERVER_GRPC_ADDRESS
Address of the server gRPC endpoint for worker communication.
| Property | Value |
|---|---|
| Required | No (worker only) |
| Default | 127.0.0.1:9001 |
Example:

    SERVER_GRPC_ADDRESS=127.0.0.1:9001

Notes:
- Workers communicate with the server via gRPC for all database operations
- WORKER_GRPC_ADDRESS is supported as a legacy alias
- The server exposes both HTTP (default 9000) and gRPC (default 9001) interfaces
- Workers are stateless and do not connect directly to the database
WORKER_GRPC_AUTH_TOKEN
Bearer token for authenticating worker gRPC connections to the control-plane.
| Property | Value |
|---|---|
| Required | Yes (production); No (dev mode) |
| Default | Unset (auth disabled) |
Example:

    WORKER_GRPC_AUTH_TOKEN=your-secret-token

Notes:
- Must be set on both the server and all workers (same value)
- When unset, gRPC auth is disabled (acceptable for local development only)
- Server panics on startup if unset when not in dev mode
SERVER_GRPC_BIND_ADDR
Bind address for the server-side gRPC listener.
| Property | Value |
|---|---|
| Required | No (server only) |
| Default | 0.0.0.0:9001 |
Example:

    SERVER_GRPC_BIND_ADDR=0.0.0.0:9001

Notes:
- WORKER_GRPC_ADDR is supported as a legacy alias
WORKER_GRPC_CONNECT_TIMEOUT
Timeout in seconds for worker initial connection to control-plane gRPC.
| Property | Value |
|---|---|
| Required | No (worker only) |
| Default | 30 |
Example:

    WORKER_GRPC_CONNECT_TIMEOUT=60

WORKER_GRPC_TLS_CERT
Path to PEM-encoded certificate file. On the server, this is the gRPC server certificate. On the worker, this is the client certificate presented during mTLS handshake.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (TLS disabled) |
Example:

    WORKER_GRPC_TLS_CERT=/etc/everruns/grpc-cert.pem

Notes:
- Must be set together with WORKER_GRPC_TLS_KEY
- Server: enables TLS on the gRPC listener when both cert and key are set
- Worker: presents client certificate to the server when both cert and key are set (requires WORKER_GRPC_TLS_CA_CERT)
WORKER_GRPC_TLS_KEY
Path to PEM-encoded private key file corresponding to WORKER_GRPC_TLS_CERT.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set |
Example:

    WORKER_GRPC_TLS_KEY=/etc/everruns/grpc-key.pem

WORKER_GRPC_TLS_CA_CERT
Path to PEM-encoded CA certificate bundle for verifying the remote peer.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set |
Example:

    WORKER_GRPC_TLS_CA_CERT=/etc/everruns/grpc-ca.pem

Notes:
- Server: when set, requires workers to present valid client certificates signed by this CA (mutual TLS)
- Worker: when set, verifies the server’s certificate against this CA and switches to https:// transport
- For full mTLS, set on both server and worker alongside their respective cert/key pairs
WORKER_GRPC_TLS_DOMAIN
Override the expected server domain name for TLS certificate verification (worker only).
| Property | Value |
|---|---|
| Required | No |
| Default | Derived from SERVER_GRPC_ADDRESS hostname |
Example:

    WORKER_GRPC_TLS_DOMAIN=control-plane.internal

Notes:
- Useful when the server certificate CN/SAN differs from the connection hostname (e.g., connecting via IP but cert has a DNS name)
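Taken together, the gRPC TLS variables in this section might be combined as follows for full mTLS. This is a sketch under stated assumptions: the server-*.pem and worker-*.pem file names and the control-plane.internal hostname are placeholders chosen for illustration, and the two halves belong to two separate processes, not one environment.

```shell
# --- Server (control-plane) environment ---
# Placeholder paths; substitute your own PEM files.
SERVER_GRPC_BIND_ADDR=0.0.0.0:9001
# Server certificate and matching private key enable TLS on the listener.
WORKER_GRPC_TLS_CERT=/etc/everruns/server-cert.pem
WORKER_GRPC_TLS_KEY=/etc/everruns/server-key.pem
# CA that signed the worker client certificates (turns on mutual TLS).
WORKER_GRPC_TLS_CA_CERT=/etc/everruns/grpc-ca.pem
WORKER_GRPC_AUTH_TOKEN=your-secret-token

# --- Worker environment (a separate process) ---
SERVER_GRPC_ADDRESS=control-plane.internal:9001
# Client certificate and key presented during the mTLS handshake.
WORKER_GRPC_TLS_CERT=/etc/everruns/worker-cert.pem
WORKER_GRPC_TLS_KEY=/etc/everruns/worker-key.pem
# CA used to verify the server certificate.
WORKER_GRPC_TLS_CA_CERT=/etc/everruns/grpc-ca.pem
# Same token value as on the server.
WORKER_GRPC_AUTH_TOKEN=your-secret-token
```

If workers connect by IP address while the server certificate carries a DNS name, WORKER_GRPC_TLS_DOMAIN would also be set on the worker.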
OpenTelemetry Configuration
Everruns supports distributed tracing via OpenTelemetry with OTLP export. Traces follow the Gen-AI semantic conventions for LLM operations.
OTEL_EXPORTER_OTLP_ENDPOINT
OTLP endpoint for trace export (e.g., Grafana Tempo, Datadog, or any OTLP-compatible backend).
| Property | Value |
|---|---|
| Required | No |
| Default | Not set (tracing disabled) |
Example:

    # For local OTLP collector
    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

    # For production Tempo
    OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo.monitoring:4317

Notes:
- When set, traces are exported via OTLP/gRPC
- Connect to any OTLP-compatible backend for trace visualization
- Without this variable, only console logging is enabled
OTEL_SERVICE_NAME
Service name for traces.
| Property | Value |
|---|---|
| Required | No |
| Default | everruns-server (API), everruns-worker (Worker) |
Example:

    OTEL_SERVICE_NAME=everruns-prod-api

OTEL_SERVICE_VERSION
Service version for traces.
| Property | Value |
|---|---|
| Required | No |
| Default | Cargo package version |
OTEL_ENVIRONMENT
Deployment environment label.
| Property | Value |
|---|---|
| Required | No |
| Default | Not set |
Example:

    OTEL_ENVIRONMENT=production

OTEL_RECORD_CONTENT
Enable recording of LLM input/output content in traces. Warning: May contain sensitive data.
| Property | Value |
|---|---|
| Required | No |
| Default | false |
Example:

    # Standard OTel env var (preferred)
    OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true

    # Legacy alias (also works)
    OTEL_RECORD_CONTENT=true

Notes:
- When enabled, gen_ai.input.messages, gen_ai.output.messages, gen_ai.tool.call.arguments, gen_ai.tool.call.result, and thinking content are recorded
- Disabled by default for privacy and data size concerns
- Only enable in development or when debugging specific issues
Local Development with OpenTelemetry
To visualize traces locally, point OTEL_EXPORTER_OTLP_ENDPOINT at any OTLP-compatible collector:

    # Set OTLP endpoint for API and Worker
    export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317

    # Start services
    just start-all

Gen-AI Trace Structure
Traces follow the agentic execution lifecycle with 13 event types:

    invoke_agent {turn_id} (root span)
    ├── reason (LLM reasoning phase)
    │   ├── thinking (extended thinking, if enabled)
    │   └── chat {model} (LLM API call)
    ├── act (tool execution phase)
    │   ├── execute_tool {name}
    │   └── execute_tool {name}
    ├── reason (iteration 2)
    │   └── chat {model}
    └── ...

Gen-AI Trace Attributes
All spans include OpenTelemetry attributes following the Gen-AI semantic conventions:
| Attribute | Span Types | Description |
|---|---|---|
| gen_ai.operation.name | All | Operation type (invoke_agent, chat, execute_tool, reason, act, thinking) |
| gen_ai.system | chat | Provider (openai, anthropic, gemini) |
| gen_ai.request.model | chat, thinking | Requested model name |
| gen_ai.response.model | chat | Model actually used |
| gen_ai.response.id | chat | Response identifier |
| gen_ai.response.finish_reasons | chat | Why generation stopped |
| gen_ai.usage.input_tokens | chat, reason, invoke_agent | Prompt tokens used |
| gen_ai.usage.output_tokens | chat, reason, invoke_agent | Completion tokens used |
| gen_ai.usage.cache_read_tokens | chat | Tokens read from prompt cache |
| gen_ai.usage.cache_creation_tokens | chat | Tokens written to prompt cache |
| gen_ai.output.type | chat | text or tool_calls |
| gen_ai.conversation.id | All | Session identifier |
| gen_ai.tool.name | execute_tool | Tool name |
| gen_ai.tool.call.id | execute_tool | Tool call identifier |
| tool.success | execute_tool | Whether tool succeeded |
| turn.id | invoke_agent | Turn identifier |
| turn.iterations | invoke_agent | Number of reason/act iterations |
| error.type | invoke_agent, chat, execute_tool | Error description (on failure) |
| otel.status_code | invoke_agent | ERROR on failure/cancellation |
| duration_ms | All | Span duration in milliseconds |
| time_to_first_token_ms | chat | Streaming latency |
Braintrust Integration
Everruns supports sending turn, reasoning, tool, and session lifecycle events to Braintrust for observability, evaluation, and logging.
For setup instructions and configuration details, see the Braintrust Integration Guide.
| Variable | Required | Default | Description |
|---|---|---|---|
| BRAINTRUST_ENABLED | No | enabled when API key is present | Explicit Braintrust on/off switch |
| BRAINTRUST_API_KEY | Yes | - | API key from Braintrust settings |
| BRAINTRUST_PROJECT_NAME | No | My Project | Project name for organizing traces |
| BRAINTRUST_PROJECT_ID | No | - | Direct project UUID (skips name lookup) |
| BRAINTRUST_API_URL | No | https://api.braintrust.dev | API base URL |
| BRAINTRUST_QUEUE_CAPACITY | No | 1024 | Buffered event capacity before new exports are dropped |
| BRAINTRUST_MAX_BATCH_SIZE | No | 50 | Max events per Braintrust insert call |
| BRAINTRUST_FLUSH_INTERVAL_MS | No | 500 | Max delay before flushing a partial batch |
| BRAINTRUST_REQUEST_TIMEOUT_MS | No | 10000 | Per-request timeout for Braintrust insert calls |
| BRAINTRUST_MAX_RETRIES | No | 3 | Retries for 429, 5xx, and timeout/connect failures |
| BRAINTRUST_RETRY_BASE_DELAY_MS | No | 250 | Initial retry backoff |
| BRAINTRUST_RETRY_MAX_DELAY_MS | No | 5000 | Retry backoff cap |
| BRAINTRUST_RECORD_CONTENT | No | false | Export raw turn and LLM text content |
| BRAINTRUST_RECORD_THINKING | No | none | Extended thinking export mode: none, summary, full |
| BRAINTRUST_TOOL_ARGS_MODE | No | redacted | Tool argument export mode: full, redacted, none |
| BRAINTRUST_TOOL_RESULTS_MODE | No | summary | Tool result export mode: full, summary, redacted, none |
| BRAINTRUST_DEBUG_PAYLOADS | No | false | Print full outbound Braintrust payload JSON to local debug logs |
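To get a feel for how the three retry settings interact, the sketch below prints a possible wait schedule. It assumes the delay doubles on each retry and is clamped at the cap, with no jitter; the table only documents the initial delay and the cap, so the growth factor is an assumption, and backoff_schedule is a hypothetical helper rather than Everruns code.

```shell
# Hypothetical illustration: doubling backoff clamped at the documented cap.
backoff_schedule() {
  local retries="${BRAINTRUST_MAX_RETRIES:-3}"
  local delay="${BRAINTRUST_RETRY_BASE_DELAY_MS:-250}"
  local cap="${BRAINTRUST_RETRY_MAX_DELAY_MS:-5000}"
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # Never wait longer than the configured cap.
    if [ "$delay" -gt "$cap" ]; then
      delay="$cap"
    fi
    printf 'retry %d: wait %d ms\n' "$attempt" "$delay"
    delay=$((delay * 2))   # assumed growth factor of 2
    attempt=$((attempt + 1))
  done
}
```

Under these assumptions the defaults give waits of 250, 500, and 1000 ms, so the 5000 ms cap only comes into play once BRAINTRUST_MAX_RETRIES is raised to 6 or more.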