Enterprise-grade coding and LLM API
Use one Hypereal API key for coding agents, IDE integrations, internal tools, and production LLM workloads. The Enterprise API is OpenAI-compatible, Anthropic-native, and exposes a curated model set for teams that want predictable model IDs, image generation, billing, and usage logs.
The CLI-only Claude model IDs ending in -max (e.g. claude-opus-4-7-max) are supported only through the Claude Code CLI against the Anthropic-native endpoint. Using these models with any other client or third-party wrapper is strictly prohibited: the request will be blocked and the API key suspended without refund. This includes, but is not limited to, Hermes, OpenClaw, and similar proxy, replay, or account-pooling tools. Standard models (those without the -max suffix) are unaffected and remain available to all clients.
Use it with Claude Code, coding agents, review bots, IDE tools, and internal automation that already speak OpenAI or Anthropic APIs.
Claude Opus 4.8, Claude Sonnet 4.7, Claude Haiku, GPT-5.5, Nano Banana 2, GPT Image 2, DeepSeek, Qwen, and Kimi are exposed behind stable Hypereal model IDs.
Generate images through the same managed chat completions endpoint with multimodal response fields and account-level usage controls.
Hypereal API keys keep spending limits, model scoping, usage logs, and credit billing in one account-level control plane.
Successful Enterprise API requests include latency insurance metadata and automatic credit compensation when they run unusually long.
Call chat completions
Use the managed base path for the curated Enterprise model catalog and stable Hypereal model IDs.
curl https://api.hypereal.cloud/v1/managed/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-7",
"messages": [
{
"role": "system",
"content": "You are a senior software engineer."
},
{
"role": "user",
"content": "Review this TypeScript function for correctness."
}
],
"temperature": 0.2,
"max_tokens": 1200
}'

import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.HYPEREAL_API_KEY,
baseURL: "https://api.hypereal.cloud/v1/managed",
});
const completion = await client.chat.completions.create({
model: "claude-sonnet-4-7",
messages: [
{ role: "user", content: "Write a migration checklist for this PR." },
],
});
console.log(completion.choices[0]?.message?.content);

const response = await client.responses.create({
model: "claude-sonnet-4-7",
input: "Create a concise migration checklist for this pull request.",
});
console.log(response.output_text);
Generate images through chat completions
Use Nano Banana 2 with multimodal chat completions, or call the OpenAI-compatible image generations endpoint for GPT Image 2. Use model IDs nano-banana-2 and gpt-image-2. Multimodal chat image fields return base64 data URLs, while image generations returns the OpenAI image response shape.
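Because the multimodal chat image fields carry base64 data URLs, a client usually has to split the URL into its MIME type and payload before saving or forwarding the image. The following is a minimal sketch of that step. The helper names `parseBase64DataUrl` and `decodedByteLength` are hypothetical, and the sketch deliberately does not assume which response field holds the data URL.

```typescript
// Pieces of a base64 data URL such as "data:image/png;base64,iVBORw0...".
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Split a base64 data URL into MIME type and payload.
// Returns null when the input is not a base64 data URL.
function parseBase64DataUrl(dataUrl: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+\/]+={0,2})$/.exec(dataUrl);
  if (!match) return null;
  return { mimeType: match[1], base64: match[2] };
}

// Size of the decoded image in bytes.
// Assumes standard padded base64 (length is a multiple of 4).
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}
```

In Node.js the payload could then be turned into bytes with `Buffer.from(parsed.base64, "base64")`; the size helper is handy for rejecting unexpectedly large images before decoding.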
curl https://api.hypereal.cloud/v1/managed/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "nano-banana-2",
"messages": [
{
"role": "user",
"content": "Generate a clean product mockup of a glass banana sculpture on a white studio background."
}
],
"modalities": ["image", "text"],
"image_config": {
"aspect_ratio": "1:1",
"image_size": "1K"
}
}'

curl https://api.hypereal.cloud/v1/managed/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "A clean product mockup of a glass banana sculpture on a white studio background.",
"size": "1024x1024",
"quality": "standard"
}'
Use the Anthropic-native endpoint
Claude Code and Anthropic SDK clients should point at the Hypereal API root because they append the native messages path themselves. Raw HTTP clients can call the managed messages path directly. Tool use, thinking blocks, streaming, and prompt cache fields are preserved.
The CLI-only Claude model IDs (-max suffix) must only be used from the Claude Code CLI. Third-party wrappers such as Hermes or OpenClaw are not permitted on this tier.
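Internal tools that let users pick a model ID may want a pre-flight check so a CLI-only ID never leaves a non-CLI client. The sketch below is one way to do that, assuming the restriction covers only Claude IDs ending in -max, as the wording above suggests. The function names and the `ClientKind` type are hypothetical and not part of the Hypereal API.

```typescript
// Hypothetical label for the caller; not a Hypereal API concept.
type ClientKind = "claude-code-cli" | "other";

// Assumption: CLI-only IDs are exactly the Claude IDs that end in "-max".
// The "claude-" prefix check matters because other catalog IDs, such as
// "qwen3-7-max", also end in "-max".
function isCliOnlyClaudeModel(modelId: string): boolean {
  return modelId.startsWith("claude-") && modelId.endsWith("-max");
}

// Throw before sending a request that would use a CLI-only ID
// from anything other than the Claude Code CLI.
function assertModelAllowed(modelId: string, client: ClientKind): void {
  if (client !== "claude-code-cli" && isCliOnlyClaudeModel(modelId)) {
    throw new Error(`${modelId} is restricted to the Claude Code CLI`);
  }
}
```

A suffix-only check would misclassify `qwen3-7-max`, which is why the sketch also tests the prefix.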
export ANTHROPIC_BASE_URL="https://api.hypereal.cloud"
export ANTHROPIC_AUTH_TOKEN="ck_..."
export ANTHROPIC_API_KEY=""
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-8"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-7"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-latest"
export CLAUDE_CODE_SUBAGENT_MODEL="claude-sonnet-4-7"

# Claude Code CLI ONLY.
# Claude model IDs for the official Claude Code CLI, not third-party wrappers.
export ANTHROPIC_BASE_URL="https://api.hypereal.cloud"
export ANTHROPIC_AUTH_TOKEN="ck_..."
export ANTHROPIC_API_KEY=""
export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-7-max"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6-max"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-max"
export CLAUDE_CODE_SUBAGENT_MODEL="claude-sonnet-4-6-max"

curl https://api.hypereal.cloud/v1/managed/messages \
-H "anthropic-api-key: ck_..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-7",
"messages": [
{ "role": "user", "content": "Review this diff." }
],
"tools": [],
"max_tokens": 1200
}'
Supported Enterprise models
Prices are shown per one million tokens and billed through Hypereal Credits.
| Model ID | Name | Context | Input | Cache read | Cache write | Output |
|---|---|---|---|---|---|---|
| claude-opus-4-8 | Claude Opus 4.8 | 1M | $5.25 | $0.525 | $6.56 | $26.25 |
| claude-sonnet-4-7 | Claude Sonnet 4.7 | 1M | $3.15 | $0.315 | $3.94 | $15.75 |
| claude-haiku-latest | Claude Haiku Latest | 200k | $1.05 | $0.105 | $1.31 | $5.25 |
| claude-opus-4-7-max | Claude Opus 4.7 | 200k | $5.25 | $0.525 | $6.56 | $26.25 |
| claude-sonnet-4-6-max | Claude Sonnet 4.6 | 200k | $3.15 | $0.315 | $3.94 | $15.75 |
| gpt-5-5 | GPT-5.5 | 1M | $5.25 | $0.525 | n/a | $31.50 |
| deepseek-v4-pro | DeepSeek V4 Pro | 1M | $0.4567 | $0.0038 | n/a | $0.9135 |
| qwen3-7-max | Qwen3.7 Max | 200k | $1.31 | $0.2625 | $1.64 | $3.94 |
| qwen3-7-plus | Qwen3.7 Plus | 1M | $0.42 | $0.084 | $0.525 | $1.68 |
| kimi-latest | Kimi Latest | 256k | $0.7182 | $0.1512 | n/a | $3.59 |
| nano-banana-2 | Nano Banana 2 | 131k | $0.525 | n/a | n/a | $3.15 |
| gpt-image-2 | GPT Image 2 | 272k | $8.40 | $2.10 | n/a | $31.50 |
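To make the per-million pricing concrete, here is a rough cost estimator in USD using the claude-sonnet-4-7 row from the table above. It is a sketch under assumptions: the type and function names are hypothetical, each token category is assumed to be billed independently at its listed rate, and no conversion to Hypereal Credits is attempted because the table does not state one.

```typescript
// USD per one million tokens; null mirrors an "n/a" cell in the table.
interface ModelPrice {
  input: number;
  cacheRead: number | null;
  cacheWrite: number | null;
  output: number;
}

// Token counts for one request (or a whole session).
interface TokenUsage {
  input: number;
  cacheRead?: number;
  cacheWrite?: number;
  output: number;
}

// Values copied from the claude-sonnet-4-7 row.
const SONNET_4_7_PRICE: ModelPrice = {
  input: 3.15,
  cacheRead: 0.315,
  cacheWrite: 3.94,
  output: 15.75,
};

// Cost of one token category; refuses categories the model does not price.
function categoryCost(tokens: number, usdPerMillion: number | null): number {
  if (tokens === 0) return 0;
  if (usdPerMillion === null) {
    throw new Error("token category is not priced for this model");
  }
  return (tokens / 1e6) * usdPerMillion;
}

// Assumption: categories are additive and billed at the listed rates.
function estimateUsd(price: ModelPrice, usage: TokenUsage): number {
  return (
    categoryCost(usage.input, price.input) +
    categoryCost(usage.cacheRead ?? 0, price.cacheRead) +
    categoryCost(usage.cacheWrite ?? 0, price.cacheWrite) +
    categoryCost(usage.output, price.output)
  );
}
```

For example, 200,000 input tokens, 1,000,000 cache-read tokens, and 50,000 output tokens come to roughly $0.63 + $0.315 + $0.7875 = $1.7325 under these assumptions.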
curl https://api.hypereal.cloud/v1/managed/models \
-H "Authorization: Bearer ck_..."
Request and response shape
The Enterprise API accepts the OpenAI chat completions request shape, the Responses API shape, and OpenAI image generation requests when supported by the selected model. Streaming, tools, structured outputs, temperature, and max token controls pass through on compatible models.
{
"model": "claude-sonnet-4-7",
"messages": [
{ "role": "user", "content": "Refactor this function." }
],
"stream": true,
"max_tokens": 2000
}

{
"hypereal": {
"billing": {
"model": "claude-sonnet-4-7",
"credits_charged": 12,
"balance_before": 1000,
"balance_after": 988
}
}
}
Tools and caching
The managed endpoint preserves OpenAI-compatible tool calls, structured outputs, reasoning controls, streaming chunks, and prompt-cache fields supported by the selected model. For long coding sessions, send stable project context with cache controls and keep a consistent session ID.
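On the client side, preserved tool calls still have to be executed and answered. The sketch below shows one way to turn OpenAI-style `tool_calls` into `tool` role messages for the next request. The local types mirror the standard OpenAI chat completions shape; `runToolCalls` and the handler map are hypothetical helpers, not part of any SDK.

```typescript
// Minimal local types for the OpenAI-compatible tool-call shape.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Run each requested tool locally and build the messages to send back.
// Unknown tools and handler errors are reported to the model as JSON errors
// instead of throwing, so the conversation can continue.
function runToolCalls(
  toolCalls: ToolCall[],
  handlers: Record<string, ToolHandler>,
): ToolMessage[] {
  return toolCalls.map((call): ToolMessage => {
    const handler = handlers[call.function.name];
    let result: unknown;
    if (!handler) {
      result = { error: `unknown tool: ${call.function.name}` };
    } else {
      try {
        const args = JSON.parse(call.function.arguments || "{}") as Record<string, unknown>;
        result = handler(args);
      } catch (err) {
        result = { error: String(err) };
      }
    }
    return {
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(result ?? null),
    };
  });
}
```

The returned messages would be appended after the assistant message that requested the tools, and the conversation sent back to the same endpoint.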
const completion = await client.chat.completions.create({
model: "claude-sonnet-4-7",
messages: [{ role: "user", content: "Find the changed files." }],
tools: [
{
type: "function",
function: {
name: "list_changed_files",
description: "List changed files in the current repository.",
parameters: { type: "object", properties: {} },
},
},
],
tool_choice: "auto",
});

curl https://api.hypereal.cloud/v1/managed/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-H "X-Hypereal-Cache: true" \
-H "X-Session-Id: coding-agent-session-123" \
-d '{
"model": "claude-sonnet-4-7",
"cache_control": { "type": "ephemeral" },
"messages": [
{ "role": "system", "content": "Stable project context..." },
{ "role": "user", "content": "Continue the refactor." }
],
"max_tokens": 1200
}'
Managed concurrency controls
Enterprise API requests pass through managed admission control before a model call is sent. The gateway uses short wait queues, model-level concurrency slots, account-level request-per-minute guards, capacity telemetry, and circuit breakers for overloaded model paths. These controls apply only to Enterprise API traffic and are surfaced as Hypereal response headers.
| Surface | Primary models | Requests | Tokens | Queue |
|---|---|---|---|---|
| Text generation | gpt-5-5 | 15,000 RPM | 40,000,000 TPM | 15,000,000,000 tokens |
| Image generation | gpt-image-2 | 250 IPM | 8,000,000 TPM | n/a |
These are managed capacity ceilings. API key spending limits, model scoping, daily budgets, hourly budgets, and per-key model limits can be configured lower for internal control.
X-Hypereal-Managed-Governor: active
X-Hypereal-Managed-Model-Concurrency-Limit: 80
X-Hypereal-Managed-Model-Concurrency-Remaining: 79
X-Hypereal-Managed-Model-RPM-Limit: 15000
X-Hypereal-Managed-Model-RPM-Remaining: 14999
X-Hypereal-Capacity-Requests-Remaining: 9852
X-Hypereal-Managed-Image-IPM-Limit: 250
X-Hypereal-Managed-Image-IPM-Remaining: 249
X-Hypereal-Managed-Circuit: closed
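A client can read these headers to slow down before it hits a ceiling. The following sketch parses three of the headers listed above and makes a simple back-off decision. The function names are hypothetical, and two things are assumptions rather than documented behavior: that any circuit value other than "closed" signals an overloaded path, and that a remaining count of zero means the next request would be queued or rejected.

```typescript
interface GovernorState {
  circuit: string | null;
  concurrencyRemaining: number | null;
  rpmRemaining: number | null;
}

// Accepts any header getter, e.g. (name) => response.headers.get(name).
type HeaderGetter = (name: string) => string | null | undefined;

// Parse a numeric header; missing or malformed values become null.
function toNumberOrNull(value: string | null | undefined): number | null {
  if (value == null || value.trim() === "") return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

function readGovernorState(getHeader: HeaderGetter): GovernorState {
  return {
    circuit: getHeader("X-Hypereal-Managed-Circuit") ?? null,
    concurrencyRemaining: toNumberOrNull(
      getHeader("X-Hypereal-Managed-Model-Concurrency-Remaining"),
    ),
    rpmRemaining: toNumberOrNull(getHeader("X-Hypereal-Managed-Model-RPM-Remaining")),
  };
}

// Assumption: a non-"closed" circuit or an exhausted counter means "wait".
function shouldBackOff(state: GovernorState): boolean {
  if (state.circuit !== null && state.circuit !== "closed") return true;
  if (state.concurrencyRemaining === 0) return true;
  if (state.rpmRemaining === 0) return true;
  return false;
}
```

Missing headers are treated as "no information" rather than as a reason to back off, so the check stays harmless on responses that omit them.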
Automatic compensation for slow requests
Enterprise API requests carry request insurance for unusually slow successful calls. Failed requests are not charged, so compensation is only evaluated after a successful request has a credit charge. Non-streaming responses include the settlement in hypereal.insurance. Streaming responses expose policy headers immediately and settle automatically after the stream finishes.
{
"hypereal": {
"insurance": {
"status": "paid",
"trigger": "latency",
"reason": "latency_threshold_exceeded",
"latency_ms": 94320,
"threshold_ms": 90000,
"credits_charged": 12,
"credits_compensated": 3
}
}
}

X-Hypereal-Insurance-Status: paid
X-Hypereal-Insurance-Trigger: latency
X-Hypereal-Insurance-Latency-Ms: 94320
X-Hypereal-Insurance-Threshold-Ms: 90000
X-Hypereal-Insurance-Credits: 3
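For accounting, the settlement can be reduced to a net credit figure. The sketch below reads the `hypereal.insurance` object from a non-streaming response body, using only the field names shown in the sample above. The function name and the summary shape are hypothetical, and "paid" is the only status value the sample shows, so any other status is simply reported as not compensated.

```typescript
// Field names copied from the sample settlement above.
interface InsuranceSettlement {
  status: string;
  trigger: string;
  reason: string;
  latency_ms: number;
  threshold_ms: number;
  credits_charged: number;
  credits_compensated: number;
}

interface InsuranceSummary {
  compensated: boolean;
  netCredits: number;
  overThresholdMs: number;
}

// Returns null when the response carries no insurance block.
function summarizeInsurance(body: {
  hypereal?: { insurance?: InsuranceSettlement };
}): InsuranceSummary | null {
  const settlement = body.hypereal?.insurance;
  if (!settlement) return null;
  return {
    compensated: settlement.status === "paid",
    // Credits actually spent after compensation is applied.
    netCredits: settlement.credits_charged - settlement.credits_compensated,
    // How far the request ran past the latency threshold.
    overThresholdMs: settlement.latency_ms - settlement.threshold_ms,
  };
}
```

With the sample values, the request ran 4,320 ms past the threshold and the net charge is 12 - 3 = 9 credits.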
Use the managed paths for OpenAI-compatible requests: /v1/managed/chat/completions for chat completions, /v1/managed/responses for the Responses API, and /v1/managed/images/generations for image generations. Use /v1/managed/messages for direct Anthropic-native requests. Claude Code should use https://api.hypereal.cloud as its base URL.
