Responses
POST /api/v1/responses
OpenAI Responses format. Works with every model in the catalog via any-to-any translation.
Request body
Body — application/json (required)
model: string, required
input: string | (message | function_call | function_call_output | reasoning)[], required
  Variant 1: string
  Variant 2: (message | function_call | function_call_output | reasoning)[], each item one of:
    message: object
      type: "message"
      role: string, required. Possible values: system, developer, user, assistant
      content: string | (input_text | output_text | input_image | input_file)[], required
        Variant 1: string
        Variant 2: (input_text | output_text | input_image | input_file)[], each part one of:
          input_text: object
            type: "input_text", required
            text: string, required
          output_text: object
            type: "output_text", required
            text: string, required
          input_image: object
            type: "input_image", required
            image_url: string, required
            detail: string
          input_file: object
            type: "input_file", required
            filename: string
            file_data: string
            file_url: string
    function_call: object
      type: "function_call", required
      call_id: string, required
      name: string, required
      arguments: string, required
    function_call_output: object
      type: "function_call_output", required
      call_id: string, required
      output: string, required
    reasoning: object
      type: "reasoning", required
      summary: object[]. Default: []
        type: "summary_text", required
        text: string, required
instructions: string | null
tools: (function | object)[], each entry one of:
  function: object
    type: "function", required
    name: string, required
    description: string | null
    parameters: object | null
    strict: boolean | null
  object: object
    type: string, required. Possible values: web_search, web_fetch, pdf, datetime, image
    events: boolean
tool_choice: string | function
  string: string. Possible values: auto, none, required
  function: object
    type: "function", required
    name: string, required
max_output_tokens: integer | null
temperature: number | null
top_p: number | null
stream: boolean
text: object
  format: object
    type: string, required. Possible values: text, json_object, json_schema
    name: string
    schema: object
    strict: boolean
reasoning: object | null
  effort: string | null. Possible values: low, medium, high
previous_response_id: string | null
store: boolean
hi_tool_events: boolean
Generated at build time from the API's OpenAPI document (the same schemas that validate requests), so this section cannot drift from the API.
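As a rough illustration of the schema above, the following Python sketch assembles a request body with a typed message item, one function tool, and one built-in tool type, then wraps it in a `urllib.request.Request` without sending it. The host is taken from the curl example on this page; the `get_weather` tool and its parameters are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Host as shown in the curl example on this page.
API_URL = "https://api.hyperinfer.ai/api/v1/responses"


def build_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Responses endpoint."""
    body = {
        "model": "anthropic/claude-sonnet-4.5",
        "instructions": "You are a concise assistant.",
        # `input` as a list of items instead of a plain string.
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "What's the weather in Oslo?"}
                ],
            }
        ],
        "tools": [
            {
                # Hypothetical function tool, for illustration only.
                "type": "function",
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                "strict": True,
            },
            # Built-in tool type from the schema's list of possible values.
            {"type": "web_search"},
        ],
        "tool_choice": "auto",
        "max_output_tokens": 256,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected offline.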
input accepts plain text or a list of input items. stream: true switches to the
semantic SSE event protocol — see Streaming. This endpoint is
stateless: previous_response_id is rejected — send the full input. Server tools
activate as built-in tool types in tools — see server tools.
Stop sequences are not expressible in this format and are rejected as
invalid_request — see the capability gaps.
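Because the endpoint is stateless, a tool round trip means resending the whole conversation. The Python sketch below shows one way to assemble that full input. It assumes that `function_call` and `message` output items carry the same fields as the corresponding input items in the request schema, which this page does not spell out; reasoning items are left out of the replay for brevity, and whether they need to be replayed is not stated here.

```python
def followup_input(prior_input, response_output, tool_results):
    """Return the full `input` list for the next request.

    prior_input: the `input` already sent (plain string or list of items)
    response_output: the `output` array from the previous response
    tool_results: dict mapping call_id -> tool output string
    """
    if isinstance(prior_input, str):
        # Normalize plain-text input into a single user message item.
        items = [{"type": "message", "role": "user", "content": prior_input}]
    else:
        items = list(prior_input)

    for item in response_output:
        if item["type"] == "message":
            # Replay the assistant message (assumed shape, see lead-in).
            items.append({
                "type": "message",
                "role": "assistant",
                "content": item["content"],
            })
        elif item["type"] == "function_call":
            # Replay the model's call, then attach our result for it.
            items.append({
                "type": "function_call",
                "call_id": item["call_id"],
                "name": item["name"],
                "arguments": item["arguments"],
            })
            items.append({
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": tool_results[item["call_id"]],
            })
    return items
```

The returned list goes into `input` of the next request, with no `previous_response_id`.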
Response
{
  "id": "resp_def456",
  "object": "response",
  "created_at": 1767312000,
  "status": "completed",
  "model": "anthropic/claude-sonnet-4.5",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "Speculative decoding drafts tokens…" }]
    }
  ],
  "usage": {
    "input_tokens": 21,
    "output_tokens": 34,
    "total_tokens": 55,
    "input_tokens_details": { "cached_tokens": 0 },
    "output_tokens_details": { "reasoning_tokens": 0 }
  }
}
Reasoning models emit reasoning output items ahead of the message. The id is
resp_<request id> — pass the request ID (also returned in the X-Request-Id
header) to GET /generation.
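A minimal Python sketch of reading such a response: it skips any reasoning items, joins the `output_text` parts of message items, and strips the `resp_` prefix to recover the request ID. The helper names are made up for this example, and `str.removeprefix` needs Python 3.9 or later.

```python
def output_text(response: dict) -> str:
    """Concatenate all output_text parts from message items, in order."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # e.g. reasoning or function_call items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)


def request_id(response: dict) -> str:
    """Recover the request ID from an id of the form resp_<request id>."""
    return response["id"].removeprefix("resp_")


sample = {
    "id": "resp_def456",
    "output": [
        {"type": "reasoning", "summary": [{"type": "summary_text", "text": "Plan."}]},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello."}],
        },
    ],
}
print(output_text(sample))  # Hello.
print(request_id(sample))   # def456
```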
Response schema
200 Response
The response object. With `stream: true`, an SSE stream instead.
application/json
  id: string, required. resp_<request id>.
  object: "response", required
  created_at: integer, required. Unix seconds.
  status: string, required. Possible values: completed, incomplete
  incomplete_details: object | null
    reason: string. Possible values: max_output_tokens, content_filter
  model: string, required
  output: object[], required. Output items in order: optional `reasoning` (summary_text), `message` (output_text content), and one `function_call` per tool call.
  usage: object, required
    input_tokens: integer
    input_tokens_details: object
      cached_tokens: integer
    output_tokens: integer
    output_tokens_details: object
      reasoning_tokens: integer
    total_tokens: integer
text/event-stream: SSE stream
  Semantic SSE event protocol (`event:` + `data:` frames): `response.created` / `response.in_progress`, then per-item events (`response.output_item.added`, `response.output_text.delta`, `response.function_call_arguments.delta`, matching `*.done` events), ending with `response.completed` carrying the full response incl. usage (002-R8). Keep-alive comments every 15 s.
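The stream framing described above can be consumed with a small parser. The Python sketch below reads `event:` and `data:` lines, ignores keep-alive comment lines (which start with `:` in SSE), and accumulates text deltas. Two details are assumptions carried over from the OpenAI Responses format rather than stated on this page: that a `response.output_text.delta` payload holds its text under `delta`, and that `response.completed` holds the full response under `response`.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event name, decoded JSON payload) for each complete SSE frame."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, e.g. a keep-alive
        if line == "":
            # A blank line terminates the current frame.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())


def collect(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Return the accumulated text and the final response object, if seen."""
    text, final = [], None
    for event, payload in iter_sse(lines):
        if event == "response.output_text.delta":
            text.append(payload["delta"])  # assumed field name
        elif event == "response.completed":
            final = payload["response"]  # assumed field name
    return "".join(text), final
```

`collect` accepts any iterable of decoded lines, so it can be fed from an HTTP response body read line by line or from a recorded stream in a test.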
error: Error envelope (any non-2xx status)
Error in this ingress format's native envelope, with the stable taxonomy `code` (002-R7). See the ErrorCode schema for the code → HTTP status mapping.
  error: object, required
    message: string, required
    type: string, required. OpenAI-compatible error class (e.g. invalid_request_error).
    code: string, required. Stable error taxonomy (002-R7), identical across ingress formats.
      HTTP status per code:
        provider_auth=502
        provider_rate_limit=429
        provider_overloaded=529
        context_length_exceeded=400
        content_filter=400
        provider_timeout=504
        provider_unavailable=502
        insufficient_credits=402
        key_limit_exceeded=402
        model_not_allowed=403
        invalid_api_key=401
        workspace_locked=403
        rate_limit_exceeded=429
        invalid_request=400
        payload_too_large=413
        internal_error=500
      Possible values: provider_auth, provider_rate_limit, provider_overloaded, context_length_exceeded, content_filter, provider_timeout, provider_unavailable, insufficient_credits, key_limit_exceeded, model_not_allowed, invalid_api_key, workspace_locked, rate_limit_exceeded, invalid_request, payload_too_large, internal_error
    param: string | null
    request_id: string
Generated at build time from the API's OpenAPI document (the same schemas that validate requests), so this section cannot drift from the API.
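A client can branch on the stable `code` rather than on the message text. The Python sketch below copies the code-to-status table from the schema above and decodes an error envelope into an exception; the `ApiError` class and `parse_error` helper are hypothetical names for this example, not part of any SDK.

```python
import json

# Code -> HTTP status, copied from the error schema on this page.
STATUS_BY_CODE = {
    "provider_auth": 502,
    "provider_rate_limit": 429,
    "provider_overloaded": 529,
    "context_length_exceeded": 400,
    "content_filter": 400,
    "provider_timeout": 504,
    "provider_unavailable": 502,
    "insufficient_credits": 402,
    "key_limit_exceeded": 402,
    "model_not_allowed": 403,
    "invalid_api_key": 401,
    "workspace_locked": 403,
    "rate_limit_exceeded": 429,
    "invalid_request": 400,
    "payload_too_large": 413,
    "internal_error": 500,
}


class ApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, code, message, status, param=None, request_id=None):
        super().__init__(f"{code} ({status}): {message}")
        self.code = code
        self.message = message
        self.status = status
        self.param = param
        self.request_id = request_id


def parse_error(http_status: int, body: str) -> ApiError:
    """Decode a non-2xx response body into an ApiError."""
    err = json.loads(body)["error"]
    return ApiError(
        code=err["code"],
        message=err["message"],
        status=http_status,
        param=err.get("param"),
        request_id=err.get("request_id"),
    )
```

For instance, a request that sends stop sequences would be expected to come back as `invalid_request`, which the table maps to HTTP 400.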
Examples
curl
curl https://api.hyperinfer.ai/api/v1/responses \
  -H "Authorization: Bearer $HYPERINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "instructions": "You are a concise assistant.",
    "input": "In one sentence: what is speculative decoding?",
    "max_output_tokens": 256
  }'
Errors
OpenAI error object with the stable taxonomy.