Responses

POST /api/v1/responses

OpenAI Responses format. Works with every model in the catalog via any-to-any translation.

Request body

Body — application/json (required)

model: string, required
input: string | (message | function_call | function_call_output | reasoning)[], required
  Variant 1: string
  Variant 2: (message | function_call | function_call_output | reasoning)[]
    message: object
      type: "message"
      role: string, required
        Possible values: system, developer, user, assistant
      content: string | (input_text | output_text | input_image | input_file)[], required
        Variant 1: string
        Variant 2: (input_text | output_text | input_image | input_file)[]
          input_text: object
            type: "input_text", required
            text: string, required
          output_text: object
            type: "output_text", required
            text: string, required
          input_image: object
            type: "input_image", required
            image_url: string, required
            detail: string
          input_file: object
            type: "input_file", required
            filename: string
            file_data: string
            file_url: string
    function_call: object
      type: "function_call", required
      call_id: string, required
      name: string, required
      arguments: string, required
    function_call_output: object
      type: "function_call_output", required
      call_id: string, required
      output: string, required
    reasoning: object
      type: "reasoning", required
      summary: object[]
        Default: []
        type: "summary_text", required
        text: string, required
instructions: string | null
tools: (function | object)[]
  Variant 1: function (object)
    type: "function", required
    name: string, required
    description: string | null
    parameters: object | null
    strict: boolean | null
  Variant 2: object
    type: string, required
      Possible values: web_search, web_fetch, pdf, datetime, image
    events: boolean
tool_choice: string | function
  Variant 1: string
    Possible values: auto, none, required
  Variant 2: function (object)
    type: "function", required
    name: string, required
max_output_tokens: integer | null
temperature: number | null
top_p: number | null
stream: boolean
text: object
  format: object
    type: string, required
      Possible values: text, json_object, json_schema
    name: string
    schema: object
    strict: boolean
reasoning: object | null
  effort: string | null
    Possible values: low, medium, high
previous_response_id: string | null
store: boolean
hi_tool_events: boolean

Generated at build time from the API's OpenAPI document — the same schemas that validate requests, so this section cannot drift from the API.

input accepts plain text or a list of input items. stream: true switches to the semantic SSE event protocol — see Streaming. This endpoint is stateless: previous_response_id is rejected — send the full input. Server tools activate as built-in tool types in tools — see server tools.
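To illustrate the list form of input, here is a minimal Python sketch that builds a stateless follow-up request by resending a tool call together with its output. The helper name build_followup_body, the get_weather tool, and the call ID are hypothetical; only the field names come from the schema above. Treating arguments as a JSON-encoded string is an assumption carried over from the OpenAI Responses format.

```python
import json


def build_followup_body(model, question, call_id, tool_name, arguments, tool_output):
    """Build a full-history request body (hypothetical helper).

    The endpoint is stateless, so the earlier function_call and its
    function_call_output are resent instead of using previous_response_id.
    """
    return {
        "model": model,
        "input": [
            {"type": "message", "role": "user", "content": question},
            {
                "type": "function_call",
                "call_id": call_id,
                "name": tool_name,
                # Assumption: arguments is a JSON-encoded string, not an object.
                "arguments": json.dumps(arguments),
            },
            {
                "type": "function_call_output",
                "call_id": call_id,  # must match the function_call above
                "output": tool_output,
            },
        ],
        "tools": [
            {
                "type": "function",
                "name": tool_name,
                "description": "Hypothetical weather lookup.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ],
        "tool_choice": "auto",
    }


body = build_followup_body(
    "anthropic/claude-sonnet-4.5",
    "What is the weather in Oslo?",
    "call_1",
    "get_weather",
    {"city": "Oslo"},
    "4 C, light rain",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the request body of POST /api/v1/responses.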

Stop sequences are not expressible in this format and are rejected as invalid_request — see the capability gaps.
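A client can catch both rejections before sending. The sketch below is a hypothetical pre-flight check, not part of the API: the field name stop is an assumption borrowed from other request formats, and treating a null previous_response_id as acceptable is also an assumption.

```python
# Fields the text above describes as rejected by this endpoint.
# "stop" is an assumed field name; this format defines no such field.
REJECTED_FIELDS = ("previous_response_id", "stop")


def rejected_fields(body):
    """Return the names of rejected fields that carry a non-null value."""
    return [name for name in REJECTED_FIELDS if body.get(name) is not None]


candidate = {"model": "anthropic/claude-sonnet-4.5", "input": "Hi", "stop": ["\n"]}
print(rejected_fields(candidate))
```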

Response

{
  "id": "resp_def456",
  "object": "response",
  "created_at": 1767312000,
  "status": "completed",
  "model": "anthropic/claude-sonnet-4.5",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "Speculative decoding drafts tokens…" }]
    }
  ],
  "usage": {
    "input_tokens": 21,
    "output_tokens": 34,
    "total_tokens": 55,
    "input_tokens_details": { "cached_tokens": 0 },
    "output_tokens_details": { "reasoning_tokens": 0 }
  }
}

Reasoning models emit reasoning output items ahead of the message. The id is resp_<request id> — pass the request ID (also returned in the X-Request-Id header) to GET /generation.
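As a sketch of how a client might read this object, the Python below collects the assistant text while skipping reasoning items, and recovers the request ID from the id field. The function names are hypothetical, and the sample is a trimmed version of the response shown above with an added reasoning item.

```python
def output_text(response):
    """Concatenate output_text parts from all message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning and function_call items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)


def request_id(response):
    """Strip the resp_ prefix from the response id to get the request ID."""
    prefix = "resp_"
    rid = response["id"]
    return rid[len(prefix):] if rid.startswith(prefix) else rid


sample = {
    "id": "resp_def456",
    "status": "completed",
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Speculative decoding drafts tokens…"}
            ],
        },
    ],
}

print(output_text(sample))
print(request_id(sample))
```

The value returned by request_id is the one to pass to GET /generation.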

Response schema

200: Response

The response object. With `stream: true`, an SSE stream instead.

application/json

id: string, required
  resp_<request id>.
object: "response", required
created_at: integer, required
  Unix seconds.
status: string, required
  Possible values: completed, incomplete
incomplete_details: object | null
  reason: string
    Possible values: max_output_tokens, content_filter
model: string, required
output: object[], required
  Output items in order: optional `reasoning` (summary_text), `message` (output_text content), and one `function_call` per tool call.
usage: object, required
  input_tokens: integer
  input_tokens_details: object
    cached_tokens: integer
  output_tokens: integer
  output_tokens_details: object
    reasoning_tokens: integer
  total_tokens: integer

text/event-stream: SSE stream

Semantic SSE event protocol (`event:` + `data:` frames): `response.created` / `response.in_progress`, then per-item events (`response.output_item.added`, `response.output_text.delta`, `response.function_call_arguments.delta`, and their matching `*.done` events), ending with `response.completed`, which carries the full response including usage (002-R8). Keep-alive comments are sent every 15 s.
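The framing described above can be consumed with a small parser. The following Python sketch reads already-decoded lines, ignores keep-alive comments, and accumulates text deltas. The payload keys delta and response are assumptions based on the OpenAI Responses streaming format, and the sample frames are illustrative rather than captured output.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current frame.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment line, e.g. a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def collect_text(events):
    """Join text deltas and return them with the final response object."""
    text, final = [], None
    for name, payload in events:
        if name == "response.output_text.delta":
            text.append(payload.get("delta", ""))  # assumed key
        elif name == "response.completed":
            final = payload.get("response")  # assumed key
    return "".join(text), final


sample_stream = [
    "event: response.created\n", 'data: {"type": "response.created"}\n', "\n",
    ": keep-alive\n", "\n",
    "event: response.output_text.delta\n", 'data: {"delta": "Hel"}\n', "\n",
    "event: response.output_text.delta\n", 'data: {"delta": "lo"}\n', "\n",
    "event: response.completed\n",
    'data: {"response": {"status": "completed", "usage": {"total_tokens": 55}}}\n',
    "\n",
]

text, final = collect_text(parse_sse(sample_stream))
print(text, final["usage"]["total_tokens"])
```

A frame is emitted only when its terminating blank line arrives, so a truncated final frame is dropped rather than parsed.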

error: Error envelope (any non-2xx status)

Error in this ingress format's native envelope, with the stable taxonomy `code` (002-R7). See the ErrorCode schema for the code → HTTP status mapping.

error: object, required
  message: string, required
  type: string, required
    OpenAI-compatible error class (e.g. invalid_request_error).
  code: string, required
    Stable error taxonomy (002-R7), identical across ingress formats. HTTP status per code: provider_auth=502, provider_rate_limit=429, provider_overloaded=529, context_length_exceeded=400, content_filter=400, provider_timeout=504, provider_unavailable=502, insufficient_credits=402, key_limit_exceeded=402, model_not_allowed=403, invalid_api_key=401, workspace_locked=403, rate_limit_exceeded=429, invalid_request=400, payload_too_large=413, internal_error=500.
    Possible values: provider_auth, provider_rate_limit, provider_overloaded, context_length_exceeded, content_filter, provider_timeout, provider_unavailable, insufficient_credits, key_limit_exceeded, model_not_allowed, invalid_api_key, workspace_locked, rate_limit_exceeded, invalid_request, payload_too_large, internal_error
  param: string | null
  request_id: string

Examples

curl https://api.hyperinfer.ai/api/v1/responses \
  -H "Authorization: Bearer $HYPERINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "anthropic/claude-sonnet-4.5",
  "instructions": "You are a concise assistant.",
  "input": "In one sentence: what is speculative decoding?",
  "max_output_tokens": 256
}'
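The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names build_request and send are hypothetical, the fallback key is a placeholder, and send is defined but not called, so running the snippet performs no network access.

```python
import json
import os
import urllib.request

API_URL = "https://api.hyperinfer.ai/api/v1/responses"


def build_request(body, api_key):
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


body = {
    "model": "anthropic/claude-sonnet-4.5",
    "instructions": "You are a concise assistant.",
    "input": "In one sentence: what is speculative decoding?",
    "max_output_tokens": 256,
}
# Placeholder key when the environment variable is not set.
request = build_request(body, os.environ.get("HYPERINFER_API_KEY", "placeholder-key"))
print(request.get_method(), request.full_url)
```

Calling send(request) with a valid key returns the response object as a dict. On a non-2xx status, urlopen raises urllib.error.HTTPError, whose body holds the error envelope.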

Errors

OpenAI error object with the stable taxonomy.
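For client-side handling, the code-to-status mapping from the error schema can be kept as a lookup table. In the Python sketch below the mapping is copied from the schema, while the RETRYABLE set and the describe_error helper are assumptions for illustration; the API does not state which codes are safe to retry.

```python
# HTTP status per taxonomy code, as listed in the error schema.
STATUS_BY_CODE = {
    "provider_auth": 502,
    "provider_rate_limit": 429,
    "provider_overloaded": 529,
    "context_length_exceeded": 400,
    "content_filter": 400,
    "provider_timeout": 504,
    "provider_unavailable": 502,
    "insufficient_credits": 402,
    "key_limit_exceeded": 402,
    "model_not_allowed": 403,
    "invalid_api_key": 401,
    "workspace_locked": 403,
    "rate_limit_exceeded": 429,
    "invalid_request": 400,
    "payload_too_large": 413,
    "internal_error": 500,
}

# Assumption: transient conditions that a client may choose to retry.
RETRYABLE = {
    "provider_rate_limit",
    "provider_overloaded",
    "provider_timeout",
    "provider_unavailable",
    "rate_limit_exceeded",
}


def describe_error(envelope):
    """Summarize an error envelope for logging or retry decisions."""
    err = envelope["error"]
    code = err["code"]
    return {
        "code": code,
        "status": STATUS_BY_CODE.get(code),
        "retryable": code in RETRYABLE,
        "request_id": err.get("request_id"),
    }


example = {
    "error": {
        "message": "Illustrative message.",
        "type": "invalid_request_error",
        "code": "invalid_request",
        "param": "stop",
        "request_id": "def456",
    }
}
print(describe_error(example))
```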
