Get Generation


GET /api/v1/generation?id=<request id>

Returns post-hoc metadata for a previous request: model, full token breakdown, cost in credits, and latency. The request ID is the value of the X-Request-Id header sent with every gateway response; it is also embedded in each response’s id (gen-<request id>, resp_<request id>, msg_<request id>). Requires authentication with a key from the same workspace as the original request.
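When only a response body is at hand, the request ID can be recovered from its id by dropping the prefix. A minimal sketch in Python, assuming the three prefixes listed above are the only ones in use and that a request ID never begins with one of them itself; the function name is hypothetical, not part of the API:

```python
# Prefixes taken from the description above (gen-, resp_, msg_).
# Assumption: no other prefixes exist and request IDs never start with one.
RESPONSE_ID_PREFIXES = ("gen-", "resp_", "msg_")


def request_id_from_response_id(response_id: str) -> str:
    """Strip a known prefix from a response id.

    Returns the input unchanged when no known prefix matches, so a bare
    X-Request-Id value can be passed through safely.
    """
    for prefix in RESPONSE_ID_PREFIXES:
        if response_id.startswith(prefix):
            return response_id[len(prefix):]
    return response_id
```

Reading the X-Request-Id header directly avoids the prefix handling altogether and is likely the simpler option when the HTTP response is available.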

Query parameters

id (string, required)

The request id.

Generated at build time from the API's OpenAPI document — the same schemas that validate requests, so this section cannot drift from the API.

Response


{
  "data": {
    "id": "abc123",
    "model": "anthropic/claude-sonnet-4.5",
    "ingress_format": "chat_completions",
    "source": "api",
    "tokens_prompt": 2145,
    "tokens_completion": 312,
    "tokens_reasoning": 128,
    "tokens_cached_prompt": 2048,
    "total_cost": 0.0053,
    "cost_micro_usd": "5300",
    "tool_costs_micro_usd": {},
    "latency_ms": 2841,
    "status": "ok",
    "created_at": "2026-07-02T10:15:00Z"
  }
}

200 Response

Generation metadata.

data (object, required)

Nested fields (15):

id (string)

model (string)

ingress_format (string)

Possible values: chat_completions, responses, anthropic_messages

source (string)

Possible values: api, playground, chat

tokens_prompt (integer)

tokens_completion (integer)

tokens_reasoning (integer)

tokens_cached_prompt (integer)

total_cost (number)

Credits (USD).

cost_micro_usd (string)

Exact micro-USD, as a string.

tool_costs_micro_usd (map<string, string>)

latency_ms (integer)

status (string)

error_code (string)

Stable error taxonomy (002-R7), identical across ingress formats. HTTP status per code:

provider_auth = 502
provider_rate_limit = 429
provider_overloaded = 529
context_length_exceeded = 400
content_filter = 400
provider_timeout = 504
provider_unavailable = 502
insufficient_credits = 402
key_limit_exceeded = 402
model_not_allowed = 403
invalid_api_key = 401
workspace_locked = 403
rate_limit_exceeded = 429
invalid_request = 400
payload_too_large = 413
internal_error = 500

Possible values: provider_auth provider_rate_limit provider_overloaded context_length_exceeded content_filter provider_timeout provider_unavailable insufficient_credits key_limit_exceeded model_not_allowed invalid_api_key workspace_locked rate_limit_exceeded invalid_request payload_too_large internal_error

created_at (string)

error Error envelope (any non-2xx status)

Error in this ingress format's native envelope, with the stable taxonomy `code` (002-R7). See the ErrorCode schema for the code → HTTP status mapping.

error (object, required)

Nested fields (5):

message (string, required)

type (string, required)

OpenAI-compatible error class (e.g. invalid_request_error).

code (string, required)

param (string | null)

request_id (string)


total_cost is the total in credits (USD): model usage plus server tools, which are itemized per tool in tool_costs_micro_usd. cost_micro_usd is the same total expressed as an exact micro-USD string (1 USD = 1,000,000 micro-USD).
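Because the total covers model usage plus server tools, the model-only share can be derived by subtracting the itemized tool costs. The sketch below does this with Decimal to avoid float rounding. It assumes every value in tool_costs_micro_usd parses as a decimal number of micro-USD, like cost_micro_usd; the function name and the web_search tool key in the usage note are hypothetical.

```python
from decimal import Decimal

MICRO_PER_USD = Decimal(1_000_000)


def cost_breakdown(data: dict) -> dict:
    """Split a generation's cost into total, tool, and model shares (in USD).

    `data` is the "data" object of the response. Assumption: all micro-USD
    values are strings that Decimal can parse.
    """
    total_micro = Decimal(data["cost_micro_usd"])
    tool_micro = sum(
        (Decimal(v) for v in data.get("tool_costs_micro_usd", {}).values()),
        Decimal(0),
    )
    return {
        "total_usd": total_micro / MICRO_PER_USD,
        "tools_usd": tool_micro / MICRO_PER_USD,
        # Model usage is whatever remains after the itemized tool costs.
        "model_usd": (total_micro - tool_micro) / MICRO_PER_USD,
    }
```

For the sample response above (cost_micro_usd "5300", no tool costs), total_usd and model_usd both equal 0.0053, matching total_cost. With a made-up entry such as {"web_search": "1000"}, model_usd would come out as 0.0043.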

Unknown or foreign-workspace IDs return invalid_request (HTTP 400/404) in the requesting key’s error format.
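Since the taxonomy code is stable while the HTTP status for this case is given as 400/404, a client may prefer to branch on the code rather than the status. The following sketch parses the error envelope documented above and looks up the documented status for the code. It assumes the body uses the error object with message, type, code, param, and request_id; other ingress formats may wrap errors differently. The helper name is hypothetical.

```python
from typing import Optional, Tuple

# Code -> HTTP status, copied from the error_code description above.
ERROR_CODE_HTTP_STATUS = {
    "provider_auth": 502,
    "provider_rate_limit": 429,
    "provider_overloaded": 529,
    "context_length_exceeded": 400,
    "content_filter": 400,
    "provider_timeout": 504,
    "provider_unavailable": 502,
    "insufficient_credits": 402,
    "key_limit_exceeded": 402,
    "model_not_allowed": 403,
    "invalid_api_key": 401,
    "workspace_locked": 403,
    "rate_limit_exceeded": 429,
    "invalid_request": 400,
    "payload_too_large": 413,
    "internal_error": 500,
}


def parse_error(body: dict) -> Tuple[str, str, Optional[int]]:
    """Return (code, message, documented HTTP status) from an error body.

    The status is None for codes that are not in the table above.
    """
    err = body["error"]
    code = err["code"]
    return code, err["message"], ERROR_CODE_HTTP_STATUS.get(code)
```

A caller looking up an unknown ID would check for code == "invalid_request" and treat the generation as not found, regardless of whether the response arrived with status 400 or 404.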

Example


curl "https://api.hyperinfer.ai/api/v1/generation?id=abc123" \
  -H "Authorization: Bearer $HYPERINFER_API_KEY"
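The same call can be made from Python with only the standard library. This is a sketch rather than an official client: the function names are hypothetical, and the URL and Bearer header are taken from the curl example above. Building the request is kept separate from sending it so the URL and headers can be inspected without network access.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.hyperinfer.ai/api/v1/generation"


def build_generation_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one generation."""
    # urlencode escapes the id so unusual characters cannot break the query.
    query = urllib.parse.urlencode({"id": request_id})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_generation(request_id: str, api_key: str) -> dict:
    """Send the request and return the "data" object of the response.

    Non-2xx responses raise urllib.error.HTTPError; the error envelope
    can then be read from the exception's body.
    """
    req = build_generation_request(request_id, api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["data"]
```

Calling fetch_generation("abc123", os.environ["HYPERINFER_API_KEY"]) (after importing os) would correspond to the curl command above.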
