Enterprise API

Enterprise-grade coding and LLM API

Use one Hypereal API key for coding agents, IDE integrations, internal tools, and production LLM workloads. The Enterprise API is OpenAI-compatible, Anthropic-native, and exposes a curated model set for teams that want predictable model IDs, image generation, billing, and usage logs.


Some Claude model IDs are Claude Code CLI only

The CLI-only Claude model IDs ending in -max (e.g. claude-opus-4-7-max) are supported only through the Claude Code CLI against the Anthropic-native endpoint. Using these models with any other client or third-party wrapper is strictly prohibited: the request is blocked and the API key is suspended without refund. This includes, but is not limited to, Hermes, OpenClaw, and similar proxy, replay, or account-pooling tools. Standard models (IDs without the CLI-only -max suffix) are unaffected and remain available to all clients.

Built for coding

Use it with Claude Code, coding agents, review bots, IDE tools, and internal automation that already speak OpenAI or Anthropic APIs.

Enterprise model set

Claude Opus 4.8, Claude Sonnet 4.7, Claude Haiku, GPT-5.5, Nano Banana 2, GPT Image 2, DeepSeek, Qwen, and Kimi are exposed behind stable Hypereal model IDs.

Image outputs

Generate images through the same managed chat completions endpoint with multimodal response fields and account-level usage controls.

Central controls

Hypereal API keys keep spending limits, model scoping, usage logs, and credit billing in one account-level control plane.

Request insurance

Successful Enterprise API requests include latency insurance metadata and automatic credit compensation when they run unusually long.

Quickstart

Call chat completions

Use the managed base path for the curated Enterprise model catalog and stable Hypereal model IDs.

curl (bash)

curl https://api.hypereal.cloud/v1/managed/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-7",
    "messages": [
      {
        "role": "system",
        "content": "You are a senior software engineer."
      },
      {
        "role": "user",
        "content": "Review this TypeScript function for correctness."
      }
    ],
    "temperature": 0.2,
    "max_tokens": 1200
  }'

OpenAI SDK (ts)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: "https://api.hypereal.cloud/v1/managed",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-7",
  messages: [
    { role: "user", content: "Write a migration checklist for this PR." },
  ],
});

console.log(completion.choices[0]?.message?.content);

Responses API (ts)

// Reuses the client created in the OpenAI SDK example above.
const response = await client.responses.create({
  model: "claude-sonnet-4-7",
  input: "Create a concise migration checklist for this pull request.",
});

console.log(response.output_text);

Image generation

Generate images through chat completions

Use Nano Banana 2 (model ID nano-banana-2) with multimodal chat completions, or call the OpenAI-compatible image generations endpoint for GPT Image 2 (model ID gpt-image-2). The multimodal chat path returns images as base64 data URLs, while the image generations endpoint returns the OpenAI image response shape.

Multimodal chat image (bash)

curl https://api.hypereal.cloud/v1/managed/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano-banana-2",
    "messages": [
      {
        "role": "user",
        "content": "Generate a clean product mockup of a glass banana sculpture on a white studio background."
      }
    ],
    "modalities": ["image", "text"],
    "image_config": {
      "aspect_ratio": "1:1",
      "image_size": "1K"
    }
  }'

OpenAI image generations (bash)

curl https://api.hypereal.cloud/v1/managed/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "A clean product mockup of a glass banana sculpture on a white studio background.",
    "size": "1024x1024",
    "quality": "standard"
  }'
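
Because the multimodal chat path returns images as base64 data URLs, client code usually needs to split the URL into a MIME type and a payload before saving or forwarding it. The following is a minimal TypeScript sketch of that step. The helper names parseImageDataUrl and decodedByteLength are hypothetical, and since the response field that carries the URL is not specified here, the sketch starts from the URL string itself.

```typescript
// Hypothetical helpers; not part of any Hypereal SDK.
interface ParsedImage {
  mimeType: string; // e.g. "image/png"
  base64: string; // raw base64 payload without the "data:" prefix
}

// Splits a data URL of the form data:<mime>;base64,<payload>.
// Returns null when the string does not match that form.
function parseImageDataUrl(dataUrl: string): ParsedImage | null {
  const match = /^data:([\w.+-]+\/[\w.+-]+);base64,([A-Za-z0-9+/]+={0,2})$/.exec(dataUrl);
  if (match === null) return null;
  return { mimeType: match[1] ?? "", base64: match[2] ?? "" };
}

// Size of the decoded image in bytes, derived from the base64 length and padding.
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}
```

In Node, the payload can then be turned into bytes with Buffer.from(parsed.base64, "base64") and written to disk.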

Claude Code

Use the Anthropic-native endpoint

Claude Code and Anthropic SDK clients should point at the Hypereal API root because they append the native messages path themselves. Raw HTTP clients can call the managed messages path directly. Tool use, thinking blocks, streaming, and prompt cache fields are preserved.

The CLI-only Claude model IDs (-max suffix) must only be used from the Claude Code CLI. Third-party wrappers such as Hermes or OpenClaw are not permitted on this tier.
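
Teams that route several internal clients through one key may want a guard that rejects CLI-only model IDs before a request leaves their own tooling. The sketch below is one way to do that in TypeScript. Both function names are hypothetical, and the check assumes that CLI-only IDs are exactly the Claude IDs that end in -max, so that non-Claude IDs such as qwen3-7-max are not affected.

```typescript
// Hypothetical guard for internal tooling; not a Hypereal API.
// Assumption: CLI-only IDs start with "claude-" and end with "-max".
function isCliOnlyModel(modelId: string): boolean {
  return modelId.startsWith("claude-") && modelId.endsWith("-max");
}

// Throws before any request is sent when a CLI-only model is
// requested by something other than the Claude Code CLI.
function assertModelAllowed(modelId: string, clientIsClaudeCodeCli: boolean): void {
  if (isCliOnlyModel(modelId) && !clientIsClaudeCodeCli) {
    throw new Error(`Model "${modelId}" is restricted to the Claude Code CLI.`);
  }
}
```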

Claude Code environment (bash)

export ANTHROPIC_BASE_URL="https://api.hypereal.cloud"
export ANTHROPIC_AUTH_TOKEN="ck_..."
export ANTHROPIC_API_KEY=""

export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-8"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-7"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-latest"
export CLAUDE_CODE_SUBAGENT_MODEL="claude-sonnet-4-7"

Claude Code CLI-only (bash)

# Claude Code CLI ONLY.
# Claude model IDs for the official Claude Code CLI, not third-party wrappers.
export ANTHROPIC_BASE_URL="https://api.hypereal.cloud"
export ANTHROPIC_AUTH_TOKEN="ck_..."
export ANTHROPIC_API_KEY=""

export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-7-max"
export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6-max"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-max"
export CLAUDE_CODE_SUBAGENT_MODEL="claude-sonnet-4-6-max"

Anthropic Messages (bash)

curl https://api.hypereal.cloud/v1/managed/messages \
  -H "anthropic-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-7",
    "messages": [
      { "role": "user", "content": "Review this diff." }
    ],
    "tools": [],
    "max_tokens": 1200
  }'

Models

Supported Enterprise models

Prices are shown in USD per one million tokens and billed through Hypereal Credits.

Model ID	Name	Context	Input	Cache read	Cache write	Output
claude-opus-4-8	Claude Opus 4.8	1M	$5.25	$0.525	$6.56	$26.25
claude-sonnet-4-7	Claude Sonnet 4.7	1M	$3.15	$0.315	$3.94	$15.75
claude-haiku-latest	Claude Haiku Latest	200k	$1.05	$0.105	$1.31	$5.25
claude-opus-4-7-max	Claude Opus 4.7	200k	$5.25	$0.525	$6.56	$26.25
claude-sonnet-4-6-max	Claude Sonnet 4.6	200k	$3.15	$0.315	$3.94	$15.75
gpt-5-5	GPT-5.5	1M	$5.25	$0.525	n/a	$31.50
deepseek-v4-pro	DeepSeek V4 Pro	1M	$0.4567	$0.0038	n/a	$0.9135
qwen3-7-max	Qwen3.7 Max	200k	$1.31	$0.2625	$1.64	$3.94
qwen3-7-plus	Qwen3.7 Plus	1M	$0.42	$0.084	$0.525	$1.68
kimi-latest	Kimi Latest	256k	$0.7182	$0.1512	n/a	$3.59
nano-banana-2	Nano Banana 2	131k	$0.525	n/a	n/a	$3.15
gpt-image-2	GPT Image 2	272k	$8.40	$2.10	n/a	$31.50
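
To show how the per-million prices translate into the cost of a single request, here is a small TypeScript sketch. The prices are copied from two rows of the table; everything else is an assumption made for illustration: the names PRICES and estimateUsd, the treatment of n/a as no separate charge, and the idea that cached tokens are counted separately from regular input tokens. The conversion from USD to Hypereal Credits is not stated here, so the result stays in USD.

```typescript
// Prices in USD per one million tokens, copied from the table above.
interface ModelPrice {
  input: number;
  cacheRead: number | null; // null where the table shows n/a
  cacheWrite: number | null;
  output: number;
}

const PRICES: Record<string, ModelPrice> = {
  "claude-sonnet-4-7": { input: 3.15, cacheRead: 0.315, cacheWrite: 3.94, output: 15.75 },
  "gpt-5-5": { input: 5.25, cacheRead: 0.525, cacheWrite: null, output: 31.5 },
};

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Hypothetical estimator. Assumption: an n/a price means no separate charge.
function estimateUsd(modelId: string, usage: TokenUsage): number {
  const price = PRICES[modelId];
  if (price === undefined) throw new Error(`No price entry for ${modelId}`);
  const perToken = (usdPerMillion: number | null): number => (usdPerMillion ?? 0) / 1e6;
  return (
    usage.inputTokens * perToken(price.input) +
    usage.outputTokens * perToken(price.output) +
    (usage.cacheReadTokens ?? 0) * perToken(price.cacheRead) +
    (usage.cacheWriteTokens ?? 0) * perToken(price.cacheWrite)
  );
}
```

For example, 10,000 input tokens and 2,000 output tokens on claude-sonnet-4-7 come to roughly $0.0315 + $0.0315 = $0.063.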

List Enterprise models (bash)

curl https://api.hypereal.cloud/v1/managed/models \
  -H "Authorization: Bearer ck_..."

Schema

Request and response shape

The Enterprise API accepts the OpenAI chat completions request shape, the Responses API shape, and OpenAI image generation requests when supported by the selected model. Streaming, tools, structured outputs, temperature, and max token controls pass through on compatible models.

Request body (json)

{
  "model": "claude-sonnet-4-7",
  "messages": [
    { "role": "user", "content": "Refactor this function." }
  ],
  "stream": true,
  "max_tokens": 2000
}

Billing metadata (json)

{
  "hypereal": {
    "billing": {
      "model": "claude-sonnet-4-7",
      "credits_charged": 12,
      "balance_before": 1000,
      "balance_after": 988
    }
  }
}
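
A client that tracks spend can read this block defensively and sanity-check it. The TypeScript sketch below types the four fields shown in the sample and verifies that the balances agree with the charge. The names readBilling and billingIsConsistent are hypothetical, and the consistency rule (balance_before minus credits_charged equals balance_after) is inferred from the sample values rather than documented behavior.

```typescript
// Field names follow the billing metadata sample above.
interface BillingMetadata {
  model: string;
  credits_charged: number;
  balance_before: number;
  balance_after: number;
}

// Hypothetical reader: returns null unless all four fields are present
// with the expected types.
function readBilling(body: unknown): BillingMetadata | null {
  if (typeof body !== "object" || body === null) return null;
  const hypereal = (body as { hypereal?: unknown }).hypereal;
  if (typeof hypereal !== "object" || hypereal === null) return null;
  const billing = (hypereal as { billing?: unknown }).billing;
  if (typeof billing !== "object" || billing === null) return null;
  const b = billing as Partial<BillingMetadata>;
  const ok =
    typeof b.model === "string" &&
    typeof b.credits_charged === "number" &&
    typeof b.balance_before === "number" &&
    typeof b.balance_after === "number";
  return ok ? (b as BillingMetadata) : null;
}

// Assumed rule, inferred from the sample: before - charged === after.
function billingIsConsistent(b: BillingMetadata): boolean {
  return b.balance_before - b.credits_charged === b.balance_after;
}
```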

Agents

Tools and caching

The managed endpoint preserves OpenAI-compatible tool calls, structured outputs, reasoning controls, streaming chunks, and prompt-cache fields supported by the selected model. For long coding sessions, send stable project context with cache controls and keep a consistent session ID.

Tool calling (ts)

// Reuses the client created in the OpenAI SDK quickstart example.
const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-7",
  messages: [{ role: "user", content: "Find the changed files." }],
  tools: [
    {
      type: "function",
      function: {
        name: "list_changed_files",
        description: "List changed files in the current repository.",
        parameters: { type: "object", properties: {} },
      },
    },
  ],
  tool_choice: "auto",
});
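
The request above only declares list_changed_files; when the model answers with tool calls, the client still has to run them and send the results back as tool messages. The following TypeScript sketch shows that dispatch step using the OpenAI chat completions tool-call shape. The handler table, its canned return value, and the function name runToolCalls are all hypothetical.

```typescript
// Shapes follow the OpenAI chat completions tool-call format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical local implementation; a real one might shell out to git.
const toolHandlers: Record<string, ToolHandler> = {
  list_changed_files: () => ["src/index.ts", "README.md"],
};

// Runs each requested tool and wraps the result as a tool message.
// Unknown tools and handler errors are reported back to the model as JSON.
function runToolCalls(toolCalls: ToolCall[]): ToolMessage[] {
  return toolCalls.map((call): ToolMessage => {
    const handler = toolHandlers[call.function.name];
    let result: unknown;
    if (handler === undefined) {
      result = { error: `unknown tool: ${call.function.name}` };
    } else {
      try {
        result = handler(JSON.parse(call.function.arguments || "{}"));
      } catch (err) {
        result = { error: String(err) };
      }
    }
    return { role: "tool", tool_call_id: call.id, content: JSON.stringify(result) };
  });
}
```

The returned messages are appended to the conversation after the assistant message that contained the tool calls, and the combined list is sent in the next chat completions request.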

Caching (bash)

curl https://api.hypereal.cloud/v1/managed/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -H "X-Hypereal-Cache: true" \
  -H "X-Session-Id: coding-agent-session-123" \
  -d '{
    "model": "claude-sonnet-4-7",
    "cache_control": { "type": "ephemeral" },
    "messages": [
      { "role": "system", "content": "Stable project context..." },
      { "role": "user", "content": "Continue the refactor." }
    ],
    "max_tokens": 1200
  }'

Capacity

Managed concurrency controls

Enterprise API requests pass through managed admission control before a model call is sent. The gateway uses short wait queues, model-level concurrency slots, account-level request-per-minute guards, capacity telemetry, and circuit breakers for overloaded model paths. These controls apply only to Enterprise API traffic and are surfaced as Hypereal response headers.

Surface	Primary models	Requests	Tokens	Queue
Text generation	gpt-5-5	15,000 RPM	40,000,000 TPM	15,000,000,000 tokens
Image generation	gpt-image-2	250 IPM	8,000,000 TPM	n/a

These are managed capacity ceilings. API key spending limits, model scoping, daily budgets, hourly budgets, and per-key model limits can be configured lower for internal control.

Managed capacity headers (http)

X-Hypereal-Managed-Governor: active
X-Hypereal-Managed-Model-Concurrency-Limit: 80
X-Hypereal-Managed-Model-Concurrency-Remaining: 79
X-Hypereal-Managed-Model-RPM-Limit: 15000
X-Hypereal-Managed-Model-RPM-Remaining: 14999
X-Hypereal-Capacity-Requests-Remaining: 9852
X-Hypereal-Managed-Image-IPM-Limit: 250
X-Hypereal-Managed-Image-IPM-Remaining: 249
X-Hypereal-Managed-Circuit: closed
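
Clients can use these headers to pace themselves before the gateway starts queuing or rejecting requests. The TypeScript sketch below reads three of the headers listed above and applies a simple throttle rule. The header names come from the sample; the function names and the rule itself (slow down when the circuit is anything other than closed, or when a remaining counter reaches zero) are assumptions for illustration, not documented gateway behavior.

```typescript
// Anything with a case-insensitive get(), such as the Fetch API Headers object.
interface HeaderReader {
  get(name: string): string | null;
}

interface CapacitySnapshot {
  circuit: string | null;
  concurrencyRemaining: number | null;
  rpmRemaining: number | null;
}

// Parses an integer header; returns null when it is missing or malformed.
function readIntHeader(headers: HeaderReader, headerName: string): number | null {
  const raw = headers.get(headerName);
  if (raw === null) return null;
  const value = Number.parseInt(raw, 10);
  return Number.isNaN(value) ? null : value;
}

function readCapacity(headers: HeaderReader): CapacitySnapshot {
  return {
    circuit: headers.get("X-Hypereal-Managed-Circuit"),
    concurrencyRemaining: readIntHeader(headers, "X-Hypereal-Managed-Model-Concurrency-Remaining"),
    rpmRemaining: readIntHeader(headers, "X-Hypereal-Managed-Model-RPM-Remaining"),
  };
}

// Assumed client-side policy, not documented gateway behavior.
function shouldThrottle(s: CapacitySnapshot): boolean {
  const circuitTripped = s.circuit !== null && s.circuit !== "closed";
  return circuitTripped || s.concurrencyRemaining === 0 || s.rpmRemaining === 0;
}
```

With fetch, the response's headers object can be passed directly: readCapacity(response.headers).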

Insurance

Automatic compensation for slow requests

Enterprise API requests carry request insurance for unusually slow successful calls. Failed requests are not charged, so compensation is evaluated only for successful requests that incurred a credit charge. Non-streaming responses include the settlement in hypereal.insurance. Streaming responses expose policy headers immediately and settle automatically after the stream finishes.

Insurance metadata (json)

{
  "hypereal": {
    "insurance": {
      "status": "paid",
      "trigger": "latency",
      "reason": "latency_threshold_exceeded",
      "latency_ms": 94320,
      "threshold_ms": 90000,
      "credits_charged": 12,
      "credits_compensated": 3
    }
  }
}

Response headers (http)

X-Hypereal-Insurance-Status: paid
X-Hypereal-Insurance-Trigger: latency
X-Hypereal-Insurance-Latency-Ms: 94320
X-Hypereal-Insurance-Threshold-Ms: 90000
X-Hypereal-Insurance-Credits: 3
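
For cost reporting, the settlement can be reduced to a net charge per request. The TypeScript sketch below does this for the hypereal.insurance object shown above. The field names come from the sample; the function name summarizeInsurance is hypothetical, and since "paid" is the only status value shown here, any other status is treated as not compensated.

```typescript
// Field names follow the insurance metadata sample above.
interface InsuranceMetadata {
  status: string;
  trigger: string;
  reason: string;
  latency_ms: number;
  threshold_ms: number;
  credits_charged: number;
  credits_compensated: number;
}

interface InsuranceSummary {
  netCredits: number; // credits charged minus credits compensated
  overThresholdMs: number; // how far latency exceeded the threshold, never negative
  compensated: boolean;
}

// Hypothetical helper. Assumption: only status "paid" indicates a payout.
function summarizeInsurance(i: InsuranceMetadata): InsuranceSummary {
  return {
    netCredits: i.credits_charged - i.credits_compensated,
    overThresholdMs: Math.max(0, i.latency_ms - i.threshold_ms),
    compensated: i.status === "paid" && i.credits_compensated > 0,
  };
}
```

Applied to the sample values, the request ran 4,320 ms over the 90,000 ms threshold and the net charge is 12 - 3 = 9 credits.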

Managed endpoint

Use the managed paths for OpenAI-compatible requests: /v1/managed/chat/completions for chat completions, /v1/managed/responses for the Responses API, and /v1/managed/images/generations for image generations. Use /v1/managed/messages for direct Anthropic-native requests. Claude Code should use https://api.hypereal.cloud as its base URL.
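
Keeping these paths in one place avoids scattering string literals across clients. The TypeScript sketch below collects the paths named in this document into a single lookup. The constant and function names are hypothetical, and the models entry is taken from the list-models curl example.

```typescript
// API root and managed paths as given in this document.
const HYPEREAL_ROOT = "https://api.hypereal.cloud";

const MANAGED_PATHS = {
  chatCompletions: "/v1/managed/chat/completions",
  responses: "/v1/managed/responses",
  imageGenerations: "/v1/managed/images/generations",
  messages: "/v1/managed/messages", // direct Anthropic-native requests
  models: "/v1/managed/models",
} as const;

type ManagedSurface = keyof typeof MANAGED_PATHS;

// Hypothetical helper: builds the full URL for a managed surface.
function managedUrl(surface: ManagedSurface): string {
  return HYPEREAL_ROOT + MANAGED_PATHS[surface];
}
```

Claude Code and Anthropic SDK clients do not need this helper, since they take the API root as their base URL and append the messages path themselves.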