Errors
All errors use the OpenAI error format: a JSON body containing a single error object. The code field in that object always matches the HTTP status code of the response.
Error shape
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": 401
  }
}

Error codes
| Status | Type | Meaning |
|---|---|---|
| 400 | invalid_request_error | Malformed request body — missing required fields or unparseable JSON. |
| 401 | authentication_error | Missing, expired, or invalid API key or session token. |
| 402 | payment_required | Account balance exhausted or subscription inactive. |
| 422 | invalid_request_error | Valid JSON but invalid parameter values, e.g. unknown model ID. |
| 429 | rate_limit_error | Exceeded the per-key or fleet-wide requests-per-minute (RPM) limit. Check the Retry-After header. |
| 500 | server_error | Internal error on the inference backend. Transient — safe to retry. |
| 503 | service_unavailable | No healthy inference workers available. Retry with exponential backoff. |
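As a rough illustration of how a client might consume this table, the sketch below parses an error body into a small object and flags the statuses described above as transient or rate-limited. The names `ApiError`, `parse_error`, and `RETRYABLE_STATUSES`, and the `unknown_error` fallback type, are hypothetical and not part of the API; treating exactly 429, 500, and 503 as retryable is an assumption drawn from the table.

```python
import json
from dataclasses import dataclass

# Assumption: only the statuses the table calls transient or rate-limited
# are worth retrying. 400/401/402/422 need a change to the request or account.
RETRYABLE_STATUSES = {429, 500, 503}


@dataclass
class ApiError:
    status: int    # HTTP status code (the body's "code" field carries the same value)
    type: str      # e.g. "authentication_error"
    message: str   # human-readable description

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and the raw response body.

    Falls back to a hypothetical "unknown_error" type when the body is not
    the documented JSON shape (for example, an HTML page from a proxy).
    """
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None

    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}

    return ApiError(
        status=status,
        type=str(err.get("type", "unknown_error")),
        message=str(err.get("message", "")),
    )
```

A caller would pass the status and body from whatever HTTP client it uses, then branch on `retryable` before deciding whether to resend the request.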
What a 401 looks like
curl -X POST https://api.pinstripes.io/v1/chat/completions \
  -H "Authorization: Bearer bad-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"ps/deepseek-v4-flash","messages":[{"role":"user","content":"hi"}]}'
# HTTP 401
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": 401
  }
}

Retries
A 503 means no workers are available: retry with exponential backoff, starting at 1 second. A 429 includes a Retry-After header giving the number of seconds until the rate limit window resets, so wait at least that long before retrying. A 500 is safe to retry once immediately, then with backoff.
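The retry rules above could be sketched as a single function that returns how long to wait before the next attempt. This is a minimal sketch, not a reference client: the names `retry_delay` and `backoff` are hypothetical, and the attempt limit, the 30-second cap, the doubling factor, and the fallback to backoff when Retry-After is missing are all assumptions. Only the 1-second starting delay, the use of Retry-After for 429, and the immediate first retry for 500 come from the text.

```python
from typing import Optional

BASE_DELAY = 1.0    # from the text: backoff starts at 1 second
MAX_DELAY = 30.0    # assumption: cap on any single backoff wait
MAX_ATTEMPTS = 5    # assumption: stop retrying after this many attempts


def backoff(n: int) -> float:
    """Exponential backoff for the n-th wait (1-based): 1s, 2s, 4s, ... capped."""
    return min(MAX_DELAY, BASE_DELAY * 2 ** (n - 1))


def retry_delay(status: int, attempt: int,
                retry_after: Optional[str] = None) -> Optional[float]:
    """Return seconds to wait before retry number `attempt` (1-based).

    Returns None when the request should not be retried.
    """
    if attempt > MAX_ATTEMPTS:
        return None

    if status == 429:
        # Prefer the Retry-After header (seconds); fall back to backoff
        # if it is missing or unparseable (the fallback is an assumption).
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            return backoff(attempt)

    if status == 500:
        # First retry is immediate, later ones back off from 1 second.
        return 0.0 if attempt == 1 else backoff(attempt - 1)

    if status == 503:
        return backoff(attempt)

    # Assumption: every other status needs a fix to the request or account.
    return None
```

In a request loop, the caller would sleep for the returned number of seconds and resend, stopping when the function returns None. Adding a small random jitter to each backoff wait is a common refinement when many clients retry at once.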