Get system health

GET /v1/durable/health

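const url = 'https://app.everruns.com/api/v1/durable/health';
const options = {method: 'GET'};

try {
  const response = await fetch(url, options);
  // Both the 200 body and the 500 error body are JSON.
  const data = await response.json();
  if (!response.ok) {
    // Error bodies carry at least `title` and `status`.
    throw new Error(`HTTP ${response.status}: ${data.title}`);
  }
  console.log(data);
} catch (error) {
  console.error(error);
}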

curl --request GET \
  --url https://app.everruns.com/api/v1/durable/health

Responses

200

System health

application/json

System health response

object

active_workers

required

Number of workers in the running state, ready to claim tasks.

integer

claimed_tasks

required

Tasks currently claimed by a worker (gauge).

integer

completed_tasks

required

Cumulative count of tasks that completed successfully (monotonic counter).

integer

completed_workflows

required

Cumulative count of workflows that completed successfully (monotonic counter).

integer

current_load

required

Total tasks currently in flight across all workers.

integer

dlq_size

required

Size of the dead-letter queue (gauge). High values indicate stuck activities.

integer

event_delivery

Event-delivery backend in use: nats for distributed deployments, in_memory for single-instance deployments. null or absent when the response comes from an older server that does not report this field.

string | null

failed_tasks

required

Cumulative count of tasks that failed terminally or were sent to the DLQ (monotonic counter).

integer

failed_workflows

required

Cumulative count of workflows that ended in failure (monotonic counter).

integer

load_percentage

required

current_load / total_capacity * 100. 0.0 when no workers are registered.

number format: double

pending_tasks

required

Tasks waiting to be claimed (gauge).

integer

pending_workflows

required

Workflows waiting to be claimed (gauge).

integer

running_workflows

required

Workflows currently executing (gauge).

integer

started_tasks

required

Cumulative count of tasks claimed at least once (monotonic counter).

integer

started_workflows

required

Cumulative count of workflows that started (monotonic counter).

integer

status

required

Aggregate system status: healthy, degraded, or unhealthy. Derived from worker availability, load, and queue depths.

string

total_capacity

required

Sum of max_concurrency across all workers (the upper bound on concurrent task execution).

integer

total_workers

required

Total number of workers registered (heartbeating in the last window).

integer

workers_accepting

required

Number of workers currently accepting new task assignments. This is a subset of active_workers: workers that are draining or under backpressure are excluded.

integer
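The load_percentage formula above can be reproduced on the client, for instance to sanity-check a response or to derive spare capacity. The following is a minimal sketch, not part of the API: the function names loadPercentage and spareCapacity are hypothetical, and it assumes that "no workers registered" shows up as a total_capacity of 0.

```javascript
// Recompute load_percentage from the documented formula:
//   current_load / total_capacity * 100
function loadPercentage(health) {
  const { current_load: currentLoad, total_capacity: totalCapacity } = health;
  // Documented behaviour: 0.0 when no workers are registered.
  // Assumption: that case is visible here as total_capacity === 0.
  if (totalCapacity === 0) return 0;
  return (currentLoad / totalCapacity) * 100;
}

// Hypothetical helper: task slots still free across all workers.
function spareCapacity(health) {
  return Math.max(0, health.total_capacity - health.current_load);
}

const sample = { current_load: 7, total_capacity: 32 };
console.log(loadPercentage(sample)); // 21.875
console.log(spareCapacity(sample));  // 25
```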

Example

{
  "active_workers": 4,
  "claimed_tasks": 7,
  "completed_tasks": 12041,
  "completed_workflows": 4128,
  "current_load": 7,
  "dlq_size": 0,
  "event_delivery": "nats",
  "failed_tasks": 34,
  "failed_workflows": 12,
  "load_percentage": 21.875,
  "pending_tasks": 2,
  "pending_workflows": 1,
  "running_workflows": 3,
  "started_tasks": 12082,
  "started_workflows": 4144,
  "status": "healthy",
  "total_capacity": 32,
  "total_workers": 4,
  "workers_accepting": 4
}
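Because the *_tasks and *_workflows totals are monotonic counters, a single response says little about current throughput; the difference between two polls is usually more informative. The sketch below shows one way to compute those differences. The helper names counterDeltas and taskFailureRatio are hypothetical, and the "previous" snapshot is made-up illustration data. This reference does not say whether counters reset when the server restarts, so the sketch returns null for any counter that went backwards instead of guessing.

```javascript
// Fields this reference documents as monotonic counters.
const COUNTER_FIELDS = [
  'completed_tasks', 'failed_tasks', 'started_tasks',
  'completed_workflows', 'failed_workflows', 'started_workflows',
];

// Difference between two health snapshots, per counter.
function counterDeltas(previous, current) {
  const deltas = {};
  for (const field of COUNTER_FIELDS) {
    const diff = current[field] - previous[field];
    // A negative diff is unexpected for a monotonic counter; report null.
    deltas[field] = diff >= 0 ? diff : null;
  }
  return deltas;
}

// Share of tasks that failed among tasks that finished in the interval.
function taskFailureRatio(deltas) {
  const { completed_tasks: completed, failed_tasks: failed } = deltas;
  if (completed === null || failed === null) return null;
  const finished = completed + failed;
  return finished === 0 ? null : failed / finished;
}

// Hypothetical earlier poll, followed by the counters from the example above.
const previous = {
  completed_tasks: 12000, failed_tasks: 30, started_tasks: 12040,
  completed_workflows: 4120, failed_workflows: 12, started_workflows: 4130,
};
const current = {
  completed_tasks: 12041, failed_tasks: 34, started_tasks: 12082,
  completed_workflows: 4128, failed_workflows: 12, started_workflows: 4144,
};

const deltas = counterDeltas(previous, current);
console.log(deltas.completed_tasks); // 41
console.log(deltas.failed_tasks);    // 4
console.log(taskFailureRatio(deltas));
```

Dividing each delta by the polling interval gives a rate per second; the interval itself is up to the caller.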

500

Internal server error

application/json

Standard error response.

Wire shape is RFC 9457 Problem Details: every error response includes title and status, and may include detail, code, allowed_actions, retry_after_seconds, instance, and type. The content type is rewritten to application/problem+json by problem_json_content_type.

object

allowed_actions

Recovery actions the caller can take next.

Array<object>

Agent-actionable link describing a follow-up the caller can take. Used in two contexts:

Error recovery: ErrorResponse.allowed_actions carries rels like retry, retry-later, unarchive, get-existing so the agent knows the right next call after a 4xx/429.
Entity hypermedia: WithUrls<T>.allowed_actions carries state-aware rels like cancel, events, self, update on the entity itself so the agent can follow links instead of reconstructing routes from prose.

The shape is intentionally identical across both contexts; the closed rel vocabulary documented in specs/api-conventions.md distinguishes them.

object

hint

Short, agent-readable hint (e.g. “Shorten ‘name’ to <= 200 chars.”, “Cancel the active turn for this session.”).

string | null

href

Absolute (preferred) or relative URL the caller may invoke directly. Always present on entity hypermedia actions (WithUrls<T>.allowed_actions); optional on error-recovery actions (ErrorResponse.allowed_actions) where the matching operation_id is enough and the URI is implicit from the failed call.

string | null

method

HTTP method to use against href. Required for entity hypermedia actions; usually omitted on error-recovery actions where the same operation is retried with its original method.

string | null

operation_id

OpenAPI operationId the caller should invoke. Lets an MCP client resolve the call without parsing href.

string | null

rel

required

Link relation describing the action. Closed vocabulary documented in specs/api-conventions.md. Examples: self, cancel, pause, resume, events, retry, retry-later, unarchive, get-existing, delete, update.

string

schema_ref

OpenAPI $ref to the request-body schema, when the action takes one (e.g. #/components/schemas/UpdateSessionRequest). Lets a tool-calling agent fetch the input shape without scanning the whole spec.

string | null

code

Stable, machine-readable error code (snake_case).

string | null

detail

Human-readable explanation specific to this occurrence.

string | null

instance

Request URI for this occurrence.

string | null

retry_after_seconds

Seconds the caller should wait before retrying (429 / transient 503).

integer | null format: int32

status

required

HTTP status code; mirrors the response status line.

integer format: int32

title

required

Short, human-readable summary of the problem (e.g. “Not Found”).

string

type

RFC 9457 problem type URI. Optional; identifies the problem class.

string | null
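The allowed_actions fields above suggest a simple client-side resolution order: pick an action by rel, then fall back to the failed request's URL and method when href or method is omitted, as the field descriptions allow for error-recovery actions. The sketch below illustrates that reading. pickAction and resolveAction are hypothetical helper names, and the sample problem object is invented for illustration rather than taken from a real response.

```javascript
// Return the first action whose rel matches one of the preferred rels, in order.
function pickAction(problem, preferredRels) {
  const actions = Array.isArray(problem.allowed_actions) ? problem.allowed_actions : [];
  for (const rel of preferredRels) {
    const match = actions.find((action) => action.rel === rel);
    if (match) return match;
  }
  return null;
}

// Fill in what an error-recovery action may leave out.
// failedRequest is { url, method } of the call that produced the error.
function resolveAction(action, failedRequest) {
  return {
    rel: action.rel,
    operationId: action.operation_id ?? null,
    // href is optional on error-recovery actions: the URI is implicit from the failed call.
    href: action.href ?? failedRequest.url,
    // method is usually omitted when the same operation is retried.
    method: action.method ?? failedRequest.method,
    hint: action.hint ?? null,
  };
}

// Invented example payload, for illustration only.
const problem = {
  title: 'Too Many Requests',
  status: 429,
  retry_after_seconds: 30,
  allowed_actions: [{ rel: 'retry-later', hint: 'Retry after the indicated delay.' }],
};
const failedRequest = {
  url: 'https://app.everruns.com/api/v1/durable/health',
  method: 'GET',
};

const action = pickAction(problem, ['retry', 'retry-later']);
if (action) {
  console.log(resolveAction(action, failedRequest));
}
```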

Example

{
  "allowed_actions": [
    {
      "method": "POST"
    }
  ],
  "code": "session_not_found",
  "detail": "Session session_01933b5a000070008000000000000001 not found in org org_01933b5a000070008000000000000001.",
  "instance": "/v1/sessions/session_01933b5a000070008000000000000001",
  "retry_after_seconds": 30,
  "status": 404,
  "title": "Session not found",
  "type": "https://docs.everruns.com/errors/session_not_found"
}
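A client can tell the success body from the error body by checking the HTTP status and then the two required error fields, title and status. The following sketch shows one possible shape for that handling. The names isProblemDetails, retryDelayMs, and getHealth are hypothetical, the 1000 ms default delay is an arbitrary assumption, and the code assumes the error body parses as JSON (response.json() will throw otherwise).

```javascript
// True when the body has the two fields every error response is documented to include.
function isProblemDetails(body) {
  return body !== null && typeof body === 'object'
    && typeof body.title === 'string' && Number.isInteger(body.status);
}

// Convert retry_after_seconds to milliseconds.
// defaultMs is an assumption of this sketch, not something the API specifies.
function retryDelayMs(problem, defaultMs = 1000) {
  const seconds = problem.retry_after_seconds;
  return Number.isInteger(seconds) && seconds >= 0 ? seconds * 1000 : defaultMs;
}

// Fetch system health; throw an Error carrying the parsed problem on failure.
async function getHealth(baseUrl = 'https://app.everruns.com/api') {
  const response = await fetch(`${baseUrl}/v1/durable/health`, { method: 'GET' });
  const body = await response.json();
  if (response.ok) return body;
  if (isProblemDetails(body)) {
    const error = new Error(body.detail ?? body.title);
    error.problem = body; // keep code, allowed_actions, etc. for the caller
    throw error;
  }
  throw new Error(`Unexpected error response (HTTP ${response.status})`);
}

// Checking the helpers against fields from the example above.
const exampleProblem = {
  code: 'session_not_found',
  retry_after_seconds: 30,
  status: 404,
  title: 'Session not found',
};
console.log(isProblemDetails(exampleProblem)); // true
console.log(retryDelayMs(exampleProblem));     // 30000
```

A caller would typically wrap getHealth() in try/catch and inspect error.problem.code or error.problem.allowed_actions before deciding whether to retry.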