Quickstart

Your first request in under five minutes. Every ingress format works with every model — pick whichever your existing code already speaks.

Get an API key

Sign in at platform.hyperinfer.ai, open API keys, and create a key. The key (prefix hi-…) is shown only once, at creation, so store it as an environment variable:


export HYPERINFER_API_KEY="hi-..."

Pick a model

Model slugs are identical to OpenRouter’s. List everything (no auth required):


curl https://api.hyperinfer.ai/api/v1/models
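
To work with the list programmatically, the same unauthenticated request can be made from Python's standard library. This is a minimal sketch, not a documented client: the helper names are hypothetical, and the response shape ({"data": [{"id": "<slug>"}, ...]}) is an assumption borrowed from OpenRouter's model list, so check the actual payload before relying on it.

```python
import json
import urllib.request

MODELS_URL = "https://api.hyperinfer.ai/api/v1/models"


def extract_slugs(payload: dict) -> list[str]:
    """Return the sorted model slugs found in a models-list payload.

    Assumption: the payload looks like {"data": [{"id": "<slug>", ...}, ...]}.
    Entries without an "id" field are skipped.
    """
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def fetch_slugs(url: str = MODELS_URL) -> list[str]:
    """Fetch the public model list (no Authorization header) and return its slugs."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return extract_slugs(json.load(resp))
```

Calling fetch_slugs() performs the real HTTP request; extract_slugs() can be exercised offline against a saved response.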

Make your first call

Chat Completions

The OpenAI Chat Completions format at POST /api/v1/chat/completions.

curl


curl https://api.hyperinfer.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $HYPERINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [
      { "role": "user", "content": "Say hello in five words." }
    ]
  }'

TypeScript


import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: "https://api.hyperinfer.ai/api/v1",
  apiKey: process.env.HYPERINFER_API_KEY,
});
 
const completion = await client.chat.completions.create({
  model: "anthropic/claude-sonnet-4.5",
  messages: [{ role: "user", content: "Say hello in five words." }],
});
 
console.log(completion.choices[0].message.content);

Python


import os
from openai import OpenAI
 
client = OpenAI(
    base_url="https://api.hyperinfer.ai/api/v1",
    api_key=os.environ["HYPERINFER_API_KEY"],
)
 
completion = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
 
print(completion.choices[0].message.content)

Responses

The OpenAI Responses format at POST /api/v1/responses.

curl


curl https://api.hyperinfer.ai/api/v1/responses \
  -H "Authorization: Bearer $HYPERINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "input": "Say hello in five words."
  }'

TypeScript


import OpenAI from "openai";
 
const client = new OpenAI({
  baseURL: "https://api.hyperinfer.ai/api/v1",
  apiKey: process.env.HYPERINFER_API_KEY,
});
 
const response = await client.responses.create({
  model: "anthropic/claude-sonnet-4.5",
  input: "Say hello in five words.",
});
 
console.log(response.output_text);

Python


import os
from openai import OpenAI
 
client = OpenAI(
    base_url="https://api.hyperinfer.ai/api/v1",
    api_key=os.environ["HYPERINFER_API_KEY"],
)
 
response = client.responses.create(
    model="anthropic/claude-sonnet-4.5",
    input="Say hello in five words.",
)
 
print(response.output_text)
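
output_text is a convenience provided by the OpenAI SDKs. If you call the endpoint with curl or a plain HTTP client, the text has to be collected from the raw JSON. The sketch below assumes the response follows the OpenAI Responses shape (an "output" list containing "message" items whose "content" holds "output_text" blocks); the function name is hypothetical.

```python
def output_text_from_response(response: dict) -> str:
    """Concatenate every "output_text" block found in a Responses payload.

    Assumption: response["output"] is a list of items, and items of type
    "message" carry a "content" list of typed blocks.
    """
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as reasoning or tool calls
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)
```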

Anthropic Messages

The Anthropic Messages format at POST /api/v1/messages. Two details differ from a stock Anthropic setup. First, the SDK base URL ends at /api, because the Anthropic SDKs append /v1/messages themselves. Second, HyperInfer uses Bearer auth (authToken in the TypeScript SDK, auth_token in the Python SDK), not x-api-key.

curl


curl https://api.hyperinfer.ai/api/v1/messages \
  -H "Authorization: Bearer $HYPERINFER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Say hello in five words." }
    ]
  }'

TypeScript


import Anthropic from "@anthropic-ai/sdk";
 
const client = new Anthropic({
  baseURL: "https://api.hyperinfer.ai/api",
  authToken: process.env.HYPERINFER_API_KEY,
});
 
const message = await client.messages.create({
  model: "openai/gpt-4o-mini", // any model works in any format
  max_tokens: 256,
  messages: [{ role: "user", content: "Say hello in five words." }],
});
 
console.log(message.content);

Python


import os
from anthropic import Anthropic
 
client = Anthropic(
    base_url="https://api.hyperinfer.ai/api",
    auth_token=os.environ["HYPERINFER_API_KEY"],
)
 
message = client.messages.create(
    model="openai/gpt-4o-mini",  # any model works in any format
    max_tokens=256,
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
 
print(message.content)
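
Note that message.content is a list of content blocks, not a string, so the print call above shows the block objects. The sketch below illustrates both points from this section without an SDK: it builds the same request as the curl example with a Bearer header (and no x-api-key), and it joins the text blocks of a response. It assumes the response follows the Anthropic Messages shape ({"content": [{"type": "text", "text": "..."}]}); both function names are hypothetical.

```python
import json
import urllib.request

MESSAGES_URL = "https://api.hyperinfer.ai/api/v1/messages"


def build_messages_request(api_key: str, prompt: str,
                           model: str = "openai/gpt-4o-mini") -> urllib.request.Request:
    """Build (but do not send) a Messages request authenticated with a Bearer token."""
    body = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer auth, not x-api-key
            "Content-Type": "application/json",
        },
        method="POST",
    )


def text_from_message(message: dict) -> str:
    """Join the text of all "text" content blocks in a Messages response.

    Assumption: message["content"] is a list of typed blocks; non-text
    blocks (for example tool calls) are ignored.
    """
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )
```

To send the request, pass it to urllib.request.urlopen and decode the body with json.load before handing it to text_from_message.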

Stream it

Add "stream": true to any request to get server-sent events in the ingress format’s native streaming protocol — see Streaming.
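
As an illustration for the Chat Completions format only, the sketch below pulls the text deltas out of a stream of server-sent event lines. It assumes the stream follows the OpenAI Chat Completions convention: each event is a "data: {json}" line whose chunk carries choices[].delta.content, and the stream ends with "data: [DONE]". The Responses and Messages formats use different event shapes and need their own handling. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from Chat Completions SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Feed it decoded lines from any HTTP client that exposes the response body line by line, and print or concatenate the fragments as they arrive.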

The examples above deliberately cross formats: the Chat Completions and Responses requests call an Anthropic-slugged model, and the Messages request calls an OpenAI-slugged one. Any model can be reached through any ingress format; see Models & Routing.

Next steps

Authentication — key management, spend limits, model pinning
API Reference — every endpoint with a live playground
Pricing — per-model rates and the credits system