Errors and Retries
Error handling should classify failures by retryability and isolate user-visible behavior from transient provider instability.
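Treat 4xx contract and auth errors as terminal: retrying the identical request will fail again until the user changes the input, credentials, or account state. The exception is 429, which is retryable along with most 5xx categories, using bounded exponential backoff with jitter.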
Error Code Matrix
| Code | HTTP | Retryable | Meaning |
|---|---|---|---|
| invalid_request | 400 | no | Schema, parameter, or request-shape violation. |
| authentication_error | 401 | no | Invalid API key or malformed bearer header. |
| permission_denied | 403 | no | Blocked provider, route, or policy decision. |
| not_found | 404 | no | Unknown route or unknown model slug. |
| insufficient_credits | 402 | no | Organization credit balance is depleted. |
| rate_limited | 429 | yes | Per-key or tenant request quota exceeded. |
| upstream_error | 502 | yes | Provider returned transient gateway failure. |
| timeout | 504 | yes | Provider response exceeded timeout budget. |
| internal_error | 500 | yes | Unexpected gateway internal failure. |
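The matrix above can be encoded as a small lookup so that retry decisions live in one place. The following is a minimal sketch, not a definitive implementation: the function name `is_retryable` and the fallback rule for codes missing from the matrix (retry on 429 and any 5xx, otherwise treat as terminal) are assumptions made for illustration.

```python
from typing import Optional

# Error codes copied from the matrix above.
RETRYABLE_CODES = {"rate_limited", "upstream_error", "timeout", "internal_error"}
TERMINAL_CODES = {
    "invalid_request",
    "authentication_error",
    "insufficient_credits",
    "permission_denied",
    "not_found",
}


def is_retryable(error_code: str, http_status: Optional[int] = None) -> bool:
    """Return True if a failed request may be retried.

    Known codes follow the matrix. For unknown codes, the fallback below is
    an assumption: retry on 429 and 5xx, and treat everything else as terminal.
    """
    if error_code in RETRYABLE_CODES:
        return True
    if error_code in TERMINAL_CODES:
        return False
    if http_status is None:
        return False
    return http_status == 429 or 500 <= http_status <= 599
```

Defaulting unknown codes to terminal when no status is available keeps a new, unrecognized error from triggering unbounded retries.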
Error Envelope
```json
{
  "error": {
    "type": "rate_limited",
    "message": "rate limit exceeded for this API key"
  }
}
```
Retry Guidance
Honor the Retry-After header whenever it is present. Keep the retry count finite, surface a useful status to users, and include request IDs in logs. Together, these practices avoid silent failures while protecting upstream providers from retry storms.
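The guidance above could be sketched as a bounded retry loop. This is an illustration under stated assumptions rather than a reference client: `send` is a hypothetical callable that returns `(status, headers, body)`, the `x-request-id` header name is hypothetical, the base delay, cap, and attempt count are arbitrary defaults, and only the delta-seconds form of Retry-After is parsed (an HTTP-date value falls back to computed backoff).

```python
import logging
import random
import time
from typing import Callable, Dict, Optional, Tuple

log = logging.getLogger("gateway.retry")

# Statuses marked retryable in the error code matrix.
RETRYABLE_STATUSES = {429, 500, 502, 504}

Response = Tuple[int, Dict[str, str], dict]


def parse_retry_after(headers: Dict[str, str]) -> Optional[float]:
    """Read Retry-After as seconds; return None if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return None  # e.g. an HTTP-date, not handled in this sketch
    return None


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt))."""
    return random.random() * min(cap, base * 2 ** attempt)


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() up to max_attempts times, retrying only retryable statuses."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, headers, body
        # "x-request-id" is a hypothetical header name used for log correlation.
        request_id = headers.get("x-request-id", "unknown")
        log.warning("attempt %d failed: status=%d request_id=%s",
                    attempt + 1, status, request_id)
        if attempt == max_attempts - 1:
            break  # retries exhausted; return the last failure to the caller
        delay = parse_retry_after(headers)
        if delay is None:
            delay = backoff_delay(attempt)
        sleep(delay)
    return status, headers, body
```

Returning the last failed response instead of raising lets the caller decide how to present the final status to the user. Full jitter is one common choice here; it spreads retries from many clients across the whole backoff window so they do not arrive in synchronized bursts.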