Get system health
const url = 'https://app.everruns.com/api/v1/durable/health';
const options = { method: 'GET' };

try {
  const response = await fetch(url, options);
  const data = await response.json();
  console.log(data);
} catch (error) {
  console.error(error);
}

curl --request GET \
  --url https://app.everruns.com/api/v1/durable/health

Responses

System health
System health response
object
active_workers: Number of workers in the running state, ready to claim tasks.
claimed_tasks: Tasks currently claimed by a worker (gauge).
completed_tasks: Cumulative count of tasks that completed successfully (monotonic counter).
completed_workflows: Cumulative count of workflows that completed successfully (monotonic counter).
current_load: Total tasks currently in flight across all workers.
dlq_size: Size of the dead-letter queue (gauge). High values indicate stuck activities.
event_delivery: Event-delivery backend in use: nats for distributed deployments, in_memory for single-instance. None if the field was omitted by an older server.
failed_tasks: Cumulative count of tasks that failed terminally or were sent to the DLQ (monotonic counter).
failed_workflows: Cumulative count of workflows that ended in failure (monotonic counter).
load_percentage: current_load / total_capacity * 100. 0.0 when no workers are registered.
pending_tasks: Tasks waiting to be claimed (gauge).
pending_workflows: Workflows waiting to be claimed (gauge).
running_workflows: Workflows currently executing (gauge).
started_tasks: Cumulative count of tasks claimed at least once (monotonic counter).
started_workflows: Cumulative count of workflows that started (monotonic counter).
status: Aggregate system status: healthy, degraded, or unhealthy. Derived from worker availability, load, and queue depths.
total_capacity: Sum of max_concurrency across all workers (the upper bound on concurrent task execution).
total_workers: Total number of workers registered (heartbeating in the last window).
workers_accepting: Number of workers currently accepting new task assignments (subset of active_workers; workers that are draining or under backpressure are excluded).
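The load_percentage formula and the workers_accepting relationship described above can be checked on the client side. The following is a minimal sketch, not the server's implementation: loadPercentage and workersNotAccepting are hypothetical helper names, and guarding on a zero total_capacity is an assumption about how the "no workers registered" case appears in the payload.

```javascript
// Recompute load_percentage from a health payload using the documented formula:
// current_load / total_capacity * 100.
// Assumption: total_capacity is 0 when no workers are registered, so guard on it.
function loadPercentage(health) {
  const { current_load: load, total_capacity: capacity } = health;
  if (!capacity) return 0.0; // documented result when no workers are registered
  return (load / capacity) * 100;
}

// Hypothetical helper: running workers that are not accepting new tasks
// (draining or under backpressure, per the workers_accepting description).
function workersNotAccepting(health) {
  return health.active_workers - health.workers_accepting;
}

// Illustrative values only; they are not taken from a real server.
const sample = {
  current_load: 3,
  total_capacity: 12,
  active_workers: 4,
  workers_accepting: 3,
};
console.log(loadPercentage(sample));      // 25
console.log(workersNotAccepting(sample)); // 1
```

Comparing the recomputed value with the reported load_percentage is one way to sanity-check a dashboard that consumes this endpoint.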
Example generated
{
  "active_workers": 1,
  "claimed_tasks": 1,
  "completed_tasks": 1,
  "completed_workflows": 1,
  "current_load": 1,
  "dlq_size": 1,
  "event_delivery": "example",
  "failed_tasks": 1,
  "failed_workflows": 1,
  "load_percentage": 1,
  "pending_tasks": 1,
  "pending_workflows": 1,
  "running_workflows": 1,
  "started_tasks": 1,
  "started_workflows": 1,
  "status": "example",
  "total_capacity": 1,
  "total_workers": 1,
  "workers_accepting": 1
}

Internal server error
Standard error response.
The wire shape is RFC 9457 Problem Details: every error response includes title and status, and may include detail, code, allowed_actions, retry_after_seconds, instance, and type. The content type is rewritten to application/problem+json by problem_json_content_type.
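A client can use the content type and the two required members to recognize an error body. The sketch below assumes the caller has already read the Content-Type header and parsed the JSON body; parseProblem is a hypothetical helper and is not part of the API.

```javascript
// Sketch: return the body as a problem object if the response is
// application/problem+json, or null if it is some other media type.
// parseProblem is a hypothetical name used for illustration.
function parseProblem(contentType, body) {
  // The header may carry parameters such as "; charset=utf-8"; compare the media type only.
  const mediaType = (contentType || '').split(';')[0].trim().toLowerCase();
  if (mediaType !== 'application/problem+json') return null;

  // title and status are the members every error response includes.
  if (typeof body?.title !== 'string' || typeof body?.status !== 'number') {
    throw new TypeError('problem response is missing title or status');
  }
  return body;
}

// With fetch, the arguments would typically come from
// response.headers.get('content-type') and await response.json().
const problem = parseProblem('application/problem+json; charset=utf-8', {
  title: 'Internal Server Error',
  status: 500,
});
console.log(problem.title, problem.status);
```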
object
allowed_actions: Recovery actions the caller can take next. Each item is an object: an agent-actionable recovery hint attached to an error response.
  hint: Short, agent-readable hint (e.g. “Shorten ‘name’ to <= 200 chars.”).
  href: Optional absolute or relative URL the caller may invoke directly.
  operation_id: OpenAPI operationId the caller should invoke to recover.
  rel: Link relation describing the action (e.g. retry, get-existing, unarchive, retry-later).
code: Stable, machine-readable error code (snake_case).
detail: Human-readable explanation specific to this occurrence.
instance: Request URI for this occurrence.
retry_after_seconds: Seconds the caller should wait before retrying (429 / transient 503).
status: HTTP status code; mirrors the response status line.
title: Short, human-readable summary of the problem (e.g. “Not Found”).
type: RFC 9457 problem type URI. Optional; identifies the problem class.
Example generated
{
  "allowed_actions": [
    {
      "hint": "example",
      "href": "example",
      "operation_id": "example",
      "rel": "example"
    }
  ],
  "code": "example",
  "detail": "example",
  "instance": "example",
  "retry_after_seconds": 1,
  "status": 1,
  "title": "example",
  "type": "example"
}
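The retry_after_seconds and allowed_actions members lend themselves to simple client-side recovery logic. The following sketch shows one possible approach. retryDelayMs and findAction are hypothetical helpers, the 1000 ms fallback is an arbitrary choice, and treating 429 and 503 as retryable when no hint is present is an assumption drawn only from the field description.

```javascript
// Hypothetical helper: how long to wait before retrying, in milliseconds.
// Returns null when the problem gives no reason to retry automatically.
function retryDelayMs(problem, fallbackMs = 1000) {
  const seconds = problem.retry_after_seconds;
  if (typeof seconds === 'number' && seconds >= 0) return seconds * 1000;
  // Assumption: 429 and 503 are worth retrying even without an explicit hint.
  const retryable = problem.status === 429 || problem.status === 503;
  return retryable ? fallbackMs : null;
}

// Hypothetical helper: find a recovery action by its link relation (e.g. "retry").
function findAction(problem, rel) {
  return (problem.allowed_actions ?? []).find((action) => action.rel === rel) ?? null;
}

// Illustrative problem object; the values are made up for this example.
const problem = {
  title: 'Too Many Requests',
  status: 429,
  retry_after_seconds: 2,
  allowed_actions: [{ rel: 'retry-later', hint: 'Wait, then repeat the request.' }],
};
console.log(retryDelayMs(problem));                   // 2000
console.log(findAction(problem, 'retry-later').hint); // Wait, then repeat the request.
```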