Errors

All errors use the OpenAI error format. The HTTP status code and the code field in the body carry the same value.

Error shape

{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": 401
  }
}
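A client can turn this body into a typed value before deciding how to react. The following is a minimal Python sketch rather than an official client: `ApiError` and `parse_error` are hypothetical names, and the code assumes every error body has exactly the shape shown above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiError:
    """Hypothetical container for the three fields of the error object."""
    message: str
    type: str
    code: int


def parse_error(body: str) -> ApiError:
    """Parse a JSON error body of the shape shown above.

    Raises ValueError if the body is not JSON or lacks the expected fields.
    """
    try:
        payload = json.loads(body)
        err = payload["error"]
        return ApiError(
            message=err["message"],
            type=err["type"],
            code=int(err["code"]),
        )
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise ValueError(f"unexpected error body: {body!r}") from exc
```

Because the `code` field mirrors the HTTP status, a caller could also compare `ApiError.code` against the response status as a sanity check.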

Error codes

Status	Type	Meaning
400	invalid_request_error	Malformed request body — missing required fields or unparseable JSON.
401	authentication_error	Missing, expired, or invalid API key or session token.
402	payment_required	Account balance exhausted or subscription inactive.
422	invalid_request_error	Valid JSON but invalid parameter values, e.g. unknown model ID.
429	rate_limit_error	Exceeded the per-key or fleet-wide requests-per-minute (RPM) limit. Check the Retry-After header.
500	server_error	Internal error on the inference backend. Transient — safe to retry.
503	service_unavailable	No healthy inference workers available. Retry with exponential backoff.

What a 401 looks like

curl -X POST https://api.pinstripes.io/v1/chat/completions \
  -H "Authorization: Bearer bad-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"ps/deepseek-v4-flash","messages":[{"role":"user","content":"hi"}]}'

# HTTP 401
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": 401
  }
}

Retries

A 503 means no workers are available: retry with exponential backoff, starting at 1 second. A 429 includes a Retry-After header giving the number of seconds until the rate limit window resets, so wait at least that long before retrying. A 500 is safe to retry once immediately, then with backoff.
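The retry guidance can be sketched as a single function that maps a status code and attempt number to a wait time. This is an illustrative Python sketch under stated assumptions, not a prescribed policy: `retry_delay`, `BASE_DELAY_SECONDS`, and `MAX_ATTEMPTS` are hypothetical names, the doubling factor and the cap of 5 attempts are not given in the text, and treating every status other than 429, 500, and 503 as non-retryable is an assumption drawn from the fact that only those three are described as retryable.

```python
from typing import Optional

BASE_DELAY_SECONDS = 1.0  # the text gives 1 second as the backoff starting point
MAX_ATTEMPTS = 5          # assumption: the text does not state a retry cap


def retry_delay(
    status: int, attempt: int, retry_after: Optional[str] = None
) -> Optional[float]:
    """Return seconds to wait before retry number `attempt` (1-based).

    Returns None when the request should not be retried.
    """
    if attempt > MAX_ATTEMPTS:
        return None
    if status == 429:
        # Prefer the server's Retry-After value (seconds) when it parses.
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            # Assumption: fall back to backoff if the header is missing or bad.
            return BASE_DELAY_SECONDS * 2 ** (attempt - 1)
    if status == 503:
        # Exponential backoff: 1, 2, 4, 8, ... seconds.
        return BASE_DELAY_SECONDS * 2 ** (attempt - 1)
    if status == 500:
        # First retry immediately, then back off: 0, 1, 2, 4, ... seconds.
        return 0.0 if attempt == 1 else BASE_DELAY_SECONDS * 2 ** (attempt - 2)
    # Assumption: 400, 401, 402, and 422 will not succeed on retry.
    return None
```

A caller would sleep for the returned number of seconds (for example with `time.sleep`) and give up when the function returns `None`. Adding random jitter to the backoff is a common refinement, though nothing here requires it.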