Tutorial: Production Hardening
This guide covers baseline settings for container or VM deployment.
Recommended baseline
bash
CLAUDE_API_KEY=change-me \
CLAUDE_API_TIMEOUT=120 \
CLAUDE_API_MAX_CONCURRENT=4 \
CLAUDE_API_MAX_QUEUE=100 \
CLAUDE_API_QUEUE_TIMEOUT_MS=15000 \
CLAUDE_API_DRAIN_TIMEOUT_SECONDS=30 \
CLAUDE_API_RATE_LIMIT_WINDOW_SECONDS=60 \
CLAUDE_API_RATE_LIMIT_MAX_REQUESTS=30 \
CLAUDE_API_STRICT_HEALTH=true \
npx clauditoriumGraceful shutdown behavior
On SIGTERM/SIGINT:
- Service enters drain mode.
- New
POST /askandPOST /chatrequests return503 shutting_down. - In-flight/queued Claude work is allowed to finish (up to drain timeout).
- Process exits.
Health and readiness checks
- Liveness/readiness endpoint:
GET /health - Historical readiness data:
GET /health/history - Manual readiness refresh:
POST /health/recheck(requires API key)
Observability
- Structured JSON logs for request outcomes and shutdown events
- Prometheus metrics:
GET /metrics
Conversation memory safety
Use TTL and max conversation settings to bound memory:
CLAUDE_API_CONVERSATION_TTL_SECONDSCLAUDE_API_MAX_CONVERSATIONS