Endpoint Reference
Prompt
POST /ask: single prompt/response
Chat
POST /chat: create/continue conversation using message (+ optional conversation_id)
GET /chat/{conversation_id}: conversation metadata
DELETE /chat/{conversation_id}: remove in-memory conversation
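
As a rough illustration of the POST /chat entry above, the sketch below builds (but does not send) a request that either starts a conversation or continues one. The field names message and conversation_id come from this reference; the JSON encoding of the body, the BASE_URL value, and the helper name build_chat_request are assumptions made for the example, so check the OpenAPI spec for the exact schema.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical address; replace with wherever the service actually runs.
BASE_URL = "http://localhost:3000"


def build_chat_request(
    message: str, conversation_id: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST /chat request without sending it.

    Assumes the endpoint accepts a JSON body; that is not confirmed here.
    """
    body = {"message": message}
    if conversation_id is not None:
        # Supplying conversation_id continues an existing conversation;
        # omitting it asks the service to create a new one.
        body["conversation_id"] = conversation_id
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. The response shape is not described in this section, so the sketch stops short of parsing it.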
Health and Ops
GET /health: health + readiness + runtime observability
GET /health/history: rolling readiness history (?since= optional)
POST /health/recheck: force readiness re-check (API key required)
GET /metrics: Prometheus metrics
GET /version: service version metadata
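
The two parameterized entries above (the optional since query on /health/history and the API key on /health/recheck) could be exercised along the following lines. This is only a sketch: BASE_URL, the X-API-Key header name, and both helper names are hypothetical, and the accepted format of the since value is not stated in this section.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlencode

# Hypothetical address; replace with wherever the service actually runs.
BASE_URL = "http://localhost:3000"
# Hypothetical header name; this reference does not say how the key is passed.
API_KEY_HEADER = "X-API-Key"


def health_history_url(since: Optional[str] = None) -> str:
    """Return the /health/history URL, adding ?since= only when given."""
    url = f"{BASE_URL}/health/history"
    if since is not None:
        # urlencode escapes characters such as ':' in the value.
        url += "?" + urlencode({"since": since})
    return url


def build_recheck_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /health/recheck request."""
    return urllib.request.Request(
        f"{BASE_URL}/health/recheck",
        headers={API_KEY_HEADER: api_key},
        method="POST",
    )
```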
Discovery and Docs
GET /models: locally available Claude models
GET /openapi.yaml: OpenAPI spec
GET /docs: runtime Swagger docs UI
Common Error Codes
validation_error (400): request shape/value issue
unauthorized (401): missing/invalid API key
payload_too_large (413): body exceeds configured limit
rate_limited (429): per-IP rate cap reached
concurrency_limited (429): Claude queue is full
shutting_down (503): drain mode enabled
queue_timeout (504): queue wait exceeded timeout
timeout (504): Claude execution timeout
cli_error / spawn_error / internal_error (500): runtime/internal failures
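
A client might use the codes listed above to decide whether a failed call is worth retrying. The sketch below copies the code-to-status mapping from this section; which codes count as retryable, the backoff parameters, and the helper names should_retry and backoff_delay are assumptions for illustration, not documented service behavior. How the code string is carried in the error response is also not specified here, so the helpers take it as a plain argument.

```python
# Error codes and HTTP statuses as listed in this reference.
ERROR_STATUS = {
    "validation_error": 400,
    "unauthorized": 401,
    "payload_too_large": 413,
    "rate_limited": 429,
    "concurrency_limited": 429,
    "shutting_down": 503,
    "queue_timeout": 504,
    "timeout": 504,
    "cli_error": 500,
    "spawn_error": 500,
    "internal_error": 500,
}

# Assumption: load- and drain-related failures are transient, while
# request-shape, auth, and size errors will fail again unchanged.
RETRYABLE = {"rate_limited", "concurrency_limited", "shutting_down", "queue_timeout"}


def should_retry(code: str) -> bool:
    """Return True if the error code looks transient under the assumption above."""
    return code in RETRYABLE


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff in seconds for a zero-based attempt, capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

Whether timeout (504) should also be retried depends on how expensive a repeated Claude execution is for the caller, so it is left out of the retryable set here.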
For exact schemas and all statuses, use the OpenAPI Explorer.