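v1 · Stable · Claude / GPT / Gemini priced below official direct access

Hypereal API Reference

One API key with the ck_ prefix. An OpenAI-compatible REST interface. Plug it straight into Claude Code, Codex CLI, Cursor, the OpenAI SDK, or the Anthropic SDK, or call it directly with curl. Chat, image, video, audio, and coding agents all sit under one base URL.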


Coding Credits · limited launch

Claude Sonnet 4.6 · GPT-5.5 · Gemini 3.5 — pay as you go, no subscription


Enterprise API uses a separate managed API surface.

This page documents the standard API paths. For managed Enterprise API models, capacity controls, and insurance, use the Enterprise overview and Enterprise API docs.


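01 · Up and running in 90 seconds

Quick start

Get a key, point your client at hypereal.cloud, and go live. Authentication and request shapes are OpenAI-compatible, so most SDKs only need a new base URL.

1. Get a key

Top up at least $2 (200 credits) and create a key at /manage-api-keys. Keys start with ck_.

2. Configure the client

Base URL: https://hypereal.cloud/api/v1

3. Send a request

Authenticate with Authorization: Bearer ck_.... The request body matches the OpenAI format you already know.

curl (bash)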

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Say hi in one word."}]
  }'

Node · OpenAI SDK (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://hypereal.cloud/api/v1',
});

const completion = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Say hi in one word.' }],
});

console.log(completion.choices[0].message.content);

Cache support

For coding agents, start with claude-sonnet-4-6 and use Claude Code or another Anthropic-compatible client that sends cache_control. Hypereal supports cache_control caching and Hypereal Cache. Hypereal Cache is on by default and can sharply reduce token consumption for repeated coding-agent context. You can set hypereal.cache to "auto" explicitly, or omit it for the same default.

SDK

Hypereal SDK

Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.

Install

Published as hypereal-sdk on npm.

Resources

Use client.images.generate(), chat, responses, jobs and storage.

Landing page

See the full SDK overview at /sdk.

Install (bash)

pnpm add hypereal-sdk

Quickstart (ts)

import { Hypereal } from 'hypereal-sdk';

const client = new Hypereal({
  apiKey: process.env.HYPEREAL_API_KEY!,
});

const image = await client.images.generate({
  model: 'gemini-3-1-flash-t2i',
  prompt: 'A cinematic portrait in neon light',
  aspect_ratio: '16:9',
});

console.log(image);

Storage upload (ts)

const object = await client.storage.uploadFile(file, {
  filename: 'training-image.png',
  contentType: 'image/png',
  kind: 'dataset',
});

const listed = await client.storage.list({ kind: 'dataset' });

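Authentication

Every request needs a ck_-prefixed key. Three header formats are accepted, which covers every SDK.

Authorization (header, required): Bearer ck_.... Used by the OpenAI SDK, Codex CLI, and Cursor.

x-api-key (header, required): ck_.... Used by the Anthropic SDK and Claude Code on /v1/messages.

x-goog-api-key (header, required): ck_.... Used by the Google Gemini SDK and the native format on /v1/gemini. ?key=ck_... also works.

Keys are bound to a user and count toward the per-key spending cap you set at /manage-api-keys. Rate limits are computed per user, not per key.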
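The three header formats above can be wrapped in one small helper so each client picks the right one. This is a minimal sketch: the function name `authHeaders` and the `AuthStyle` type are illustrative and not part of any Hypereal SDK; only the header names and the ck_ prefix come from this page.

```typescript
// Illustrative helper (not a Hypereal API): maps a protocol style to the
// auth header documented for it.
type AuthStyle = 'openai' | 'anthropic' | 'gemini';

function authHeaders(style: AuthStyle, key: string): Record<string, string> {
  // Keys are documented to start with "ck_"; fail early on anything else.
  if (!key.startsWith('ck_')) {
    throw new Error('Hypereal keys are expected to start with "ck_"');
  }
  switch (style) {
    case 'openai':
      return { Authorization: `Bearer ${key}` }; // OpenAI SDK, Codex CLI, Cursor
    case 'anthropic':
      return { 'x-api-key': key }; // Anthropic SDK, Claude Code
    case 'gemini':
      return { 'x-goog-api-key': key }; // Gemini SDK / native format
  }
}
```

Spread the result into the `headers` object of a request, next to `Content-Type: application/json`.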

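03 · OpenAI-compatible

Chat Completions

The workhorse endpoint, speaking the OpenAI Chat Completions protocol. Use it for GPT, Gemini, Qwen, DeepSeek, GLM, and every other non-Anthropic model.

POST /api/v1/chat/completions

Request body

model (string, required): Any non-Anthropic model ID; see the table below. Anthropic models return 400; use /v1/messages instead.

messages (Message[], required): Standard OpenAI message array (role, content).

stream (boolean, optional): Defaults to false. When true, the endpoint returns an SSE stream, and usage is reported with the final chunk.

max_tokens (number, optional): Forwarded unchanged to the upstream, which applies its own default.

temperature, top_p, tools, … (any, optional): Other OpenAI parameters are passed through unmodified.

Pricing

Billed per token at each model's input and output rates. 100 credits = $1.00. Calling this endpoint requires a minimum balance of 200 credits ($2.00).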
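To make the per-token pricing concrete, here is a rough estimator. It is a sketch under assumptions: the function name is made up, and how Hypereal rounds fractional credits is not stated on this page, so the result is left unrounded. The example rates are the gpt-5.5 prices from the model table ($0.420 input, $2.52 output per MTok).

```typescript
const CREDITS_PER_USD = 100; // stated on this page: 100 credits = $1.00

// Estimates the credit cost of one call from token counts and per-MTok prices.
function estimateCredits(
  inputTokens: number,
  outputTokens: number,
  inputUsdPerMTok: number,
  outputUsdPerMTok: number,
): number {
  const usd =
    (inputTokens / 1_000_000) * inputUsdPerMTok +
    (outputTokens / 1_000_000) * outputUsdPerMTok;
  return usd * CREDITS_PER_USD; // rounding policy is unknown, so none is applied
}

// 1,000,000 input + 500,000 output tokens on gpt-5.5:
// $0.42 + $1.26 = $1.68, i.e. about 168 credits.
const gpt55Example = estimateCredits(1_000_000, 500_000, 0.42, 2.52);
```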

curl · streaming (bash)

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "Two-line haiku about caches."}
    ],
    "stream": true,
    "max_tokens": 256
  }'

Node · OpenAI SDK streaming (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

OpenAI and same-protocol models

| Model ID | Name | Input / output (per MTok) |
|---|---|---|
| gpt-5.5 | GPT-5.5 · OpenAI | $0.420 / $2.52 |
| gpt-5.5-instant | GPT-5.5 Instant · OpenAI | $0.420 / $2.52 |
| gpt-5.5-pro | GPT-5.5 Pro · OpenAI | $2.07 / $12.42 |
| gpt-5.4 | GPT-5.4 · OpenAI | $0.240 / $1.09 |
| gpt-5.4-mini | GPT-5.4 Mini · OpenAI | $0.040 / $0.250 |
| deepseek-v4-pro | DeepSeek V4 Pro · DeepSeek | $0.490 / $0.990 |
| deepseek-v4-flash | DeepSeek V4 Flash · DeepSeek | $0.160 / $0.330 |
| deepseek-v3.2 | DeepSeek V3.2 · DeepSeek | $0.230 / $0.920 |
| kimi-k2.6 | Kimi K2.6 · Moonshot | $1.07 / $4.44 |
| kimi-k2.5 | Kimi K2.5 · Moonshot | $0.460 / $2.42 |
| glm-5.1 | GLM-5.1 · Zhipu | $0.990 / $3.94 |
| glm-5 | GLM-5 · Zhipu | $0.460 / $2.07 |
| qwen3-max | Qwen 3 Max · Alibaba | $0.810 / $3.22 |
| qwen3.5-plus | Qwen 3.5 Plus · Alibaba | $0.460 / $2.76 |
| qwen3.5-flash | Qwen 3.5 Flash · Alibaba | $0.140 / $1.38 |
| MiniMax-M2.5 | MiniMax M2.5 · MiniMax | $0.240 / $0.970 |

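04 · Anthropic-compatible

Messages

The Anthropic /v1/messages protocol, with support for extended thinking, multi-upstream failover, and a 15-second SSE heartbeat. Claude Code, OpenCode, OpenClaw, and the official Anthropic SDK all use this endpoint.

POST /api/v1/messages

Request body

model (string, required): claude-sonnet-4-6, claude-opus-4-6, or claude-haiku-4-5. Legacy Anthropic IDs (claude-sonnet-4-5-20250929, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022) are mapped automatically to the corresponding latest model.

messages (Message[], required): Anthropic-format message array, including content blocks such as image and tool_use.

max_tokens (number, required): Required by the Anthropic protocol.

cache_control ({ type: "ephemeral" }, optional): Add it to stable system, tools, or text content blocks for Anthropic prompt caching. Hypereal defaults a cache breakpoint when omitted and reports cache usage in response metadata.

hypereal.cache ("auto" | false, optional): Hypereal Cache is on by default. Use "auto" to make the default explicit for repeated requests, or false to bypass it for a request.

thinking ({ type: "enabled" | "adaptive", budget_tokens?: number }, optional): Extended thinking. budget_tokens caps the length of the reasoning trace. The endpoint sends an SSE heartbeat every 15 seconds so that proxies do not time out long thinking streams.

stream, system, tools, … (any, optional): Passed through as in the Anthropic SDK.

When a retry switches upstreams, stale thinking blocks whose signatures are no longer valid are filtered out automatically, so you do not need to handle them yourself.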
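As an illustration of the thinking parameter, the sketch below builds a /v1/messages request body with extended thinking enabled. The helper name and the `ThinkingRequest` type are hypothetical. The check that budget_tokens stays below max_tokens follows Anthropic's own API rule; whether Hypereal enforces it the same way is an assumption.

```typescript
// Hypothetical request shape covering only the fields used in this sketch.
interface ThinkingRequest {
  model: string;
  max_tokens: number;
  thinking: { type: 'enabled'; budget_tokens: number };
  messages: { role: 'user' | 'assistant'; content: string }[];
}

function buildThinkingRequest(
  prompt: string,
  budgetTokens: number,
  maxTokens: number,
): ThinkingRequest {
  // Anthropic expects the thinking budget to be smaller than max_tokens.
  if (budgetTokens >= maxTokens) {
    throw new RangeError('budget_tokens must be smaller than max_tokens');
  }
  return {
    model: 'claude-sonnet-4-6',
    max_tokens: maxTokens,
    thinking: { type: 'enabled', budget_tokens: budgetTokens },
    messages: [{ role: 'user', content: prompt }],
  };
}
```

Serialize the result with JSON.stringify and POST it with the x-api-key header. Long thinking runs are best consumed with stream set to true so the 15-second heartbeat keeps the connection open.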

curl · prompt caching (bash)

curl https://api.hypereal.cloud/v1/messages \
  -H "x-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "system": [{
      "type": "text",
      "text": "You are a senior TypeScript refactoring assistant.",
      "cache_control": {"type": "ephemeral"}
    }],
    "messages": [
      {"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
    ],
    "hypereal": {"cache": "auto"}
  }'

Node · Anthropic SDK (ts)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://api.hypereal.cloud',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [{
    type: 'text',
    text: 'You are a senior TypeScript refactoring assistant.',
    cache_control: { type: 'ephemeral' },
  }],
  // @ts-ignore: `hypereal` is a Hypereal extension that the Anthropic SDK types do not declare
  hypereal: { cache: 'auto' },
  messages: [{ role: 'user', content: 'Hello, Claude.' }],
});

console.log(msg.content);

Anthropic models

| Model ID | Name | Input / output (per MTok) |
|---|---|---|
| claude-opus-4-7 | Claude Opus 4.7 · Anthropic | $3.40 / $16.96 |
| claude-opus-4-6 | Claude Opus 4.6 · Anthropic | $3.40 / $16.96 |
| claude-sonnet-4-6 | Claude Sonnet 4.6 · Anthropic | $0.680 / $3.40 |
| claude-haiku-4-5 | Claude Haiku 4.5 · Anthropic | $0.130 / $0.650 |
| managed-claude-opus-4-7-max | Claude Opus 4.7 · Hypereal Managed | $5.25 / $26.25 |
| managed-claude-opus-4-6-max | Claude Opus 4.6 (1M) · Hypereal Managed | $5.25 / $26.25 |
| managed-claude-opus-4-5-max | Claude Opus 4.5 · Hypereal Managed | $5.25 / $26.25 |
| managed-claude-sonnet-4-6-max | Claude Sonnet 4.6 · Hypereal Managed | $3.15 / $15.75 |
| managed-claude-sonnet-4-5-max | Claude Sonnet 4.5 · Hypereal Managed | $3.15 / $15.75 |
| managed-claude-haiku-4-5-max | Claude Haiku 4.5 · Hypereal Managed | $1.05 / $5.25 |

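05 · OpenAI Responses API

Responses

OpenAI's newer Responses API, used by Codex CLI in `wire_api = responses` mode and by the OpenAI Agents SDK. Authentication matches chat/completions; the request body uses `input` instead of `messages`.

POST /api/v1/responses

Notes

Anthropic models return 400; use /v1/messages instead.
Streaming and non-streaming calls are both billed from response.usage.input_tokens / output_tokens.
Some upstreams always return SSE. Even with stream:false, the endpoint detects this and passes the streamed response through.
Multi-upstream failover is supported. Set your client timeout to 300 seconds or more.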
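Because a call made with stream:false may still come back as SSE, a client that uses raw HTTP rather than an SDK may want to check the response type before parsing. The sketch below assumes the streamed case is signalled by a text/event-stream Content-Type and that each event carries one JSON payload on a single data: line; both are assumptions about the wire format, not documented guarantees. The helper names are illustrative.

```typescript
// True when the Content-Type header indicates a server-sent event stream.
function isEventStream(contentType: string | null): boolean {
  return (contentType ?? '').toLowerCase().includes('text/event-stream');
}

// Extracts the JSON payloads from the data: lines of a fully buffered SSE body.
// Multi-line data fields are not joined; this sketch handles one JSON per line.
function parseSseData(body: string): unknown[] {
  const events: unknown[] = [];
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue; // skip event:, id:, comments, blanks
    const payload = line.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue;
    events.push(JSON.parse(payload));
  }
  return events;
}
```

A caller would read the response text, branch on isEventStream(res.headers.get('content-type')), and fall back to JSON.parse for ordinary JSON bodies.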

curl (bash)

curl https://hypereal.cloud/api/v1/responses \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a TypeScript function that debounces a callback.",
    "stream": true
  }'

Node · OpenAI SDK responses.create (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const response = await client.responses.create({
  model: 'gpt-5.3-codex',
  input: 'Refactor this file into smaller modules.',
});

console.log(response.output_text);

Codex-optimized models

| Model ID | Name | Input / output (per MTok) |
|---|---|---|
| gpt-5.3-codex | GPT-5.3 Codex · OpenAI | $0.090 / $0.680 |
| gpt-5.3-codex-spark | GPT-5.3 Codex Spark · OpenAI | $0.090 / $0.680 |

06 · Codex CLI / Codex Desktop

Codex CLI

Codex points a provider with `wire_api = responses` at /api/v1/responses. The CLI appends `/responses` to the base URL automatically, so set the base URL as shown in the example below.

POST /api/v1/responses

~/.codex/config.toml (toml)

# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5.3-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.cloud/api/v1"
wire_api = "responses"
env_key = "HYPEREAL_API_KEY"

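Then export your key:
export HYPEREAL_API_KEY=ck_...

Run codex as usual. Everything Codex sends, including the full reasoning stream, tool calls, and file edits, is proxied unchanged. Billing uses the standard input_tokens / output_tokens usage block.

OpenCode, Claude Code (via /v1/messages), Cursor (via /v1/chat/completions), and Gemini CLI (via /v1/gemini) are configured the same way.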

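Image generation

The OpenAI-compatible /images/generations protocol. Calls are synchronous: once the upstream finishes, the endpoint returns the image URL (or base64) directly. Billed per image; `n` is limited to 1–10.

POST /api/v1/images/generations

Request body

model (string, required): Image model ID; see the table below.

prompt (string, required): Text prompt. For models that support image-to-image, pass reference images through that model's native parameters (for example image, reference_images).

n (number, optional): Number of images, 1–10 (default 1).

size (string, optional): Forwarded unchanged, for example 1024x1024 or 1536x1024. Support depends on the upstream.

quality, style, … (any, optional): Other parameters are passed through to the upstream.

Tier requirement: image generation requires the Starter tier (cumulative top-ups of $10 or more). If your balance cannot cover the estimated creditsPerGeneration × n, the endpoint returns 402.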
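The balance rule above can be mirrored on the client to avoid a predictable 402. A small sketch: the function name is illustrative, and the per-image credit figure in the example is derived from the listed nano_banana_pro price ($0.100 per image at 100 credits per dollar), which may differ from the server's own creditsPerGeneration value.

```typescript
// Returns true when the balance covers creditsPerGeneration × n.
function canAffordImages(
  balanceCredits: number,
  creditsPerGeneration: number,
  n: number,
): boolean {
  // n is documented as an integer count between 1 and 10.
  if (!Number.isInteger(n) || n < 1 || n > 10) {
    throw new RangeError('n must be an integer from 1 to 10');
  }
  return balanceCredits >= creditsPerGeneration * n;
}

// Four nano_banana_pro images at an assumed 10 credits each need 40 credits.
const enoughForFour = canAffordImages(40, 10, 4);
```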

Use an image model ID here, not a chat model ID. Valid examples include gpt-image-2, nano_banana_pro, and gemini-3-1-flash-t2i. Use gpt-5.5 only with chat, messages, or responses endpoints.

curl (bash)

curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana_pro",
    "prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
    "n": 1,
    "size": "1024x1024"
  }'

Node · fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'nano_banana_pro',
    prompt: 'a chrome teapot floating over the ocean at sunset',
    n: 1,
  }),
});

const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model

Model

GPT Image 2 — text-to-image & image-to-image

Use the same /api/v1/images/generations endpoint with "model": "gpt-image-2". Pass an array of public image URLs in reference_images to switch from pure text-to-image to image-conditioned generation (edits, restyles, character consistency).

size accepts 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), 2048x2048, 4096x4096. 2K and 4K are square only.
Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.

# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "a chrome teapot floating over the ocean at sunset",
    "size": "1536x1024"
  }'

# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "same character, snowy mountain background, golden hour",
    "size": "1024x1024",
    "reference_images": [
      "https://example.com/source.jpg"
    ]
  }'
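The gpt-image-2 constraints listed above (allowed sizes, HTTPS-only references, at most 4 references) can be checked before a request is sent. This is a client-side sketch only; the type and function names are illustrative, and the server remains the authority on what it accepts.

```typescript
// Sizes listed on this page for gpt-image-2.
const GPT_IMAGE_2_SIZES = ['1024x1024', '1536x1024', '1024x1536', '2048x2048', '4096x4096'];

interface GptImage2Request {
  prompt: string;
  size?: string;
  reference_images?: string[];
}

// Returns a list of human-readable problems; an empty list means the request looks valid.
function validateGptImage2(req: GptImage2Request): string[] {
  const problems: string[] = [];
  if (req.size !== undefined && !GPT_IMAGE_2_SIZES.includes(req.size)) {
    problems.push(`unsupported size: ${req.size}`);
  }
  const refs = req.reference_images ?? [];
  if (refs.length > 4) {
    problems.push('at most 4 reference images are allowed');
  }
  for (const url of refs) {
    // base64 data URLs are not accepted by this model; only public HTTPS URLs.
    if (!url.startsWith('https://')) {
      problems.push(`reference must be a public HTTPS URL: ${url}`);
    }
  }
  return problems;
}
```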

Model

NanoBanana 2 — image-to-image & multimodal inputs

Model id gemini-3-1-flash-t2i (NanoBanana 2). Pass references in image_urls to switch into image-to-image / multi-reference mode. Up to 4 reference images, blended in prompt order. Use the standard aspect_ratio field — landscape, portrait, and square are all supported at every resolution tier.

Supported aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9.
Supported resolution: 0.5K, 1K, 2K, 4K.
Reference images may be public HTTPS URLs or base64 data URLs.
Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.

# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-1-flash-t2i",
    "prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "image_urls": [
      "https://example.com/character.png",
      "https://example.com/jacket.png",
      "https://example.com/scene.png"
    ]
  }'

Hosting

Calling the API from a subdomain on shared hosting

No special setup is required. Our API accepts requests from any origin — there are no domain allowlists by default. Two things that catch shared-host users out, though:

Make API calls from your server, not the browser. Calling the API directly from client-side JavaScript would expose your ck_… key to every visitor. Always proxy through your own backend (PHP, Node, Python — whatever your subdomain runs).
Set a generous request timeout. Image and video calls can hold the connection open up to ~120 s (image) or ~300 s (video). Many shared hosts cap PHP/cURL at 30 s by default — raise max_execution_time, CURLOPT_TIMEOUT, and your reverse-proxy / FastCGI read timeout.
Lock keys to your subdomain (optional). In the dashboard you can scope an API key to a specific Origin or IP — recommended if your subdomain handles untrusted traffic.
Use HTTPS. Some shared-hosting subdomains default to HTTP — outbound HTTPS is required to reach the API.

# Minimal PHP server-side proxy (drop into /api/generate.php)
<?php
$body = file_get_contents('php://input');
$ch = curl_init('https://hypereal.cloud/api/v1/images/generations');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_TIMEOUT        => 180,    // raise above shared-host default
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => $body,
  CURLOPT_HTTPHEADER     => [
    'Authorization: Bearer ' . getenv('HYPEREAL_API_KEY'),
    'Content-Type: application/json',
  ],
]);
echo curl_exec($ch);

Image models

| Model ID | Name | Price |
|---|---|---|
| gpt-image-2 | GPT Image 2 · OpenAI | $0.030 / image |
| gpt-4o-image | GPT-4o Image · OpenAI | $0.012 / image |
| nano_banana | Nano Banana · Nano Banana | $0.024 / image |
| nano_banana_2 | Nano Banana 2 · Nano Banana | $0.040 / image |
| gemini-3.1-flash-image-preview | Gemini 3.1 Flash Image · Google | $0.050 / image |
| gemini-2.5-flash-image-preview | Gemini 2.5 Flash Image · Google | $0.024 / image |
| flux-kontext-pro | Flux Kontext Pro · Flux | $0.040 / image |
| flux-2-pro | Flux 2 Pro · Flux | $0.050 / image |
| doubao-seedream-4-0 | Doubao Seedream 4.0 · ByteDance | $0.057 / image |
| doubao-seedream-4-5 | Doubao Seedream 4.5 · ByteDance | $0.071 / image |
| doubao-seedream-5-0 | Doubao Seedream 5.0 · ByteDance | $0.063 / image |
| gemini-3.1-flash-image-preview-official | Gemini 3.1 Flash Image (Official) · Google | $0.064 / image |
| flux-kontext-max | Flux Kontext Max · Flux | $0.080 / image |
| gemini-2.5-flash-image-official | Gemini 2.5 Flash Image (Official) · Google | $0.098 / image |
| nano_banana_pro | Nano Banana Pro · Nano Banana | $0.100 / image |
| gemini-3-pro-image-preview | Gemini 3 Pro Image · Google | $0.100 / image |
| flux-2-flex | Flux 2 Flex · Flux | $0.140 / image |
| gemini-3-pro-image-preview-official | Gemini 3 Pro Image (Official) · Google | $0.216 / image |
| gemini-3-pro-image-preview-4K | Gemini 3 Pro Image 4K · Google | $0.190 / image |
| gemini-3.1-fast-imagen | Gemini 3.1 Fast Imagen · Google | $0.020 / image |
| gemini-3.1-thinking-imagen | Gemini 3.1 Thinking Imagen · Google | $0.020 / image |

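08 · Long-running jobs

Video generation

An asynchronous video endpoint: create a job, then poll the returned job URL until the video is ready. Most models bill per second; Gemini Omni Flash, Veo, Vidu, Grok, and similar models bill per clip.

POST /api/v1/videos/generate

Request body

model (string, required): Video model ID; see the table below.

prompt (string, required): Text prompt describing the video.

duration (number, optional): Length in seconds. Gemini Omni Flash supports 6 or 10; wider ranges apply only to per_second models.

aspect_ratio (string, optional): For example 16:9, 9:16, or 1:1. Support depends on the upstream. Gemini Omni Flash accepts 16:9 or 9:16.

resolution (string, optional): Forwarded when the selected model supports resolution. Gemini Omni Flash currently accepts 720P.

image_urls (string[], optional): For Gemini Omni Flash, pass 1-3 uploaded or public image URLs as visual references. Upload local images first and send the returned URL; direct base64 image payloads are not supported.

image_url (string, optional): First-frame image for image-to-video models. Some models also accept last_image_url or image; see the upstream documentation for the model.

Note: video generation is an asynchronous job. The create call returns jobId and pollUrl; poll from your server until the status is completed, then display the returned MP4 URL.

curl · text + image-to-video (bash)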

curl https://hypereal.cloud/api/v1/videos/generate \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini_omni_flash",
    "prompt": "a white cube rotating on a black background, clean product demo",
    "duration": 6,
    "aspect_ratio": "16:9",
    "resolution": "720P",
    "image_urls": [
      "https://example.com/product-reference.png"
    ]
  }'

Node · fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/videos/generate', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini_omni_flash',
    prompt: 'a cat walking on the moon, cinematic, no text',
    duration: 6,
    aspect_ratio: '16:9',
    resolution: '720P',
    image_urls: ['https://example.com/cat-reference.png'],
  }),
});

const data = await res.json();
console.log(data.jobId, data.pollUrl); // poll /v1/jobs/{id} for the mp4
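Once the create call has returned a pollUrl, the job has to be polled until it reports completed. The loop below is a sketch: the job fetcher and the sleep function are injected so the loop stays independent of any HTTP client, the 'failed' status and the default interval and attempt limit are assumptions, and the fields of a finished job beyond status are not specified on this page.

```typescript
// Only `status` is relied on; other fields of the job payload are left untyped.
interface JobSnapshot {
  status: string;
  [key: string]: unknown;
}

async function pollUntilCompleted(
  fetchJob: () => Promise<JobSnapshot>,      // e.g. GET pollUrl with the auth header
  sleep: (ms: number) => Promise<void>,      // e.g. a setTimeout-based delay
  intervalMs = 5000,
  maxAttempts = 120,
): Promise<JobSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchJob();
    if (job.status === 'completed') return job;
    // 'failed' as a terminal status is an assumption for video jobs.
    if (job.status === 'failed') throw new Error('video job failed');
    await sleep(intervalMs);
  }
  throw new Error('gave up waiting for the video job');
}
```

In production, sleep can be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and fetchJob can wrap a fetch call against the pollUrl from the create response.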

Video models

| Model ID | Name | Price |
|---|---|---|
| happyhorse-1.0 | HappyHorse 1.0 · Alibaba | $0.110 / 720p / second · $0.190 / 1080p / second |
| gemini_omni_flash | Gemini Omni Flash · Google | $0.180 / clip |
| wan2.6-flash | WAN 2.6 Flash · Alibaba | $0.060 / sec |
| kling-2-6 | Kling 2.6 · Kuaishou | $0.074 / sec |
| MiniMax-Hailuo-02 | MiniMax Hailuo 02 · MiniMax | $0.080 / sec |
| doubao-seedance-1-0-pro-fast | Doubao Seedance Pro Fast · ByteDance | $0.083 / sec |
| MiniMax-Hailuo-2.3 | MiniMax Hailuo 2.3 · MiniMax | $0.098 / sec |
| wan2.6 | WAN 2.6 · Alibaba | $0.100 / sec |
| kling-video-o1 | Kling Video O1 · Kuaishou | $0.134 / sec |
| kling-v3-omni | Kling V3 Omni · Kuaishou | $0.134 / sec |
| kling-v3 | Kling V3 · Kuaishou | $0.134 / sec |
| kling-v3-video | Kling V3 Video · Kuaishou | $0.134 / sec |
| doubao-seedance-1-0-pro-quality | Doubao Seedance Pro Quality · ByteDance | $0.208 / sec |
| doubao-seedance-2-0 | Doubao Seedance 2.0 · ByteDance | $0.200 / sec |
| doubao-seedance-2-0-fast | Doubao Seedance 2.0 Fast · ByteDance | $0.105 / sec |
| doubao-seedance-1-5-pro | Doubao Seedance 1.5 Pro · ByteDance | $0.216 / sec |
| Veo3.1-fast-official | Veo 3.1 Fast · Google | $0.160 / sec |
| Veo3.1-quality-official | Veo 3.1 Quality · Google | $0.320 / sec |
| veo3.1-fast | Veo 3.1 Fast · Google | $0.160 / clip |
| veo3.1-quality | Veo 3.1 Quality · Google | $1.20 / clip |
| vidu-q3-pro | Vidu Q3 Pro · Vidu | $0.020 / clip |
| grok-video-3 | Grok Video 3 · xAI | $0.160 / clip |

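09 · Fish Audio

Audio: TTS, voice cloning, ASR

Three model IDs share one endpoint. The request body and response shape depend on the model you call. The provider is Fish Audio (direct, not routed through ToAPI), billed per request.

POST /api/v1/audio/generations

model ("audio-tts" | "audio-clone" | "audio-asr", required): Selects the operation.

text (string, optional): Required for audio-tts and audio-clone.

audio (string (URL), optional): Required for audio-asr (the input) and audio-clone (reference audio of at least 10 seconds).

voice_id, format, sample_rate, … (any, optional): Other Fish Audio parameters are passed through.

Response shape: data: [{ url }] for TTS and cloning; text (plus optional segments and duration) for ASR.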
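Since one endpoint returns two different shapes, a caller can branch on the presence of data. The types below are a sketch that models only the fields named above; the type and function names are illustrative, and the element type of segments is not specified on this page, so it is left as unknown.

```typescript
interface AudioFileResponse {
  data: { url: string }[]; // TTS and voice cloning
}

interface AsrResponse {
  text: string;            // ASR transcript
  segments?: unknown[];
  duration?: number;
}

type AudioResponse = AudioFileResponse | AsrResponse;

// Returns the first audio URL for TTS/clone results, or the transcript for ASR.
function audioResult(res: AudioResponse): string {
  if ('data' in res) {
    return res.data[0]?.url ?? '';
  }
  return res.text;
}
```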

TTS (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-tts",
    "text": "Welcome to Hypereal. One key, every model.",
    "voice_id": "en_male_calm"
  }'

Voice cloning (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-clone",
    "text": "This is my cloned voice.",
    "audio": "https://example.com/reference-30s.mp3"
  }'

ASR, speech to text (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-asr",
    "audio": "https://example.com/recording.mp3"
  }'

Audio models

| Model ID | Name | Price |
|---|---|---|
| audio-tts | Text to Speech · Fish Audio | $0.020 / request |
| audio-clone | Voice Clone · Fish Audio | $0.020 / request |
| audio-asr | Speech Recognition · Fish Audio | $0.010 / request |

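10 · Google native protocol

Gemini

One endpoint accepts both the Gemini native protocol (`contents` / `generationConfig` / `systemInstruction`) and the OpenAI protocol. Internally the endpoint converts requests to the OpenAI format before forwarding them. For most code, /v1/chat/completions with a Gemini model ID is simpler.

POST /api/v1/gemini

model (string, required): Any Gemini model ID; see the table below.

contents (Content[], optional): Gemini-native message array.

systemInstruction (Content, optional): Optional system message in Gemini format.

generationConfig (object, optional): temperature, maxOutputTokens, and so on.

messages (Message[], optional): OpenAI format, as an alternative to contents.

Auth headers: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.

curl · Gemini native (bash)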

curl "https://hypereal.cloud/api/v1/gemini" \
  -H "x-goog-api-key: ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-thinking",
    "contents": [
      {"role": "user", "parts": [{"text": "Outline a launch plan."}]}
    ],
    "generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
  }'

Node · fetch (ts)

// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
  method: 'POST',
  headers: {
    'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3.5-fast',
    contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
  }),
});

console.log(await res.json());
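To show how the two request shapes relate, the sketch below maps a Gemini-native contents array onto OpenAI-style messages. It illustrates the idea only and is not Hypereal's actual conversion: it handles text parts alone, maps the Gemini role "model" to "assistant", and all type and function names are hypothetical.

```typescript
interface GeminiPart { text?: string }
interface GeminiContent { role?: string; parts: GeminiPart[] }
interface OpenAiMessage { role: 'system' | 'user' | 'assistant'; content: string }

function geminiToOpenAi(
  contents: GeminiContent[],
  systemInstruction?: GeminiContent,
): OpenAiMessage[] {
  // Concatenate the text parts of one content entry; non-text parts are ignored.
  const joinText = (c: GeminiContent): string => c.parts.map((p) => p.text ?? '').join('');

  const messages: OpenAiMessage[] = [];
  if (systemInstruction) {
    messages.push({ role: 'system', content: joinText(systemInstruction) });
  }
  for (const c of contents) {
    // Gemini uses "model" where OpenAI uses "assistant".
    messages.push({ role: c.role === 'model' ? 'assistant' : 'user', content: joinText(c) });
  }
  return messages;
}
```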

Gemini models

| Model ID | Name | Input / output (per MTok) |
|---|---|---|
| gemini-3.5-thinking | Gemini 3.5 Thinking · Google | $0.900 / $5.40 |
| gemini-3.5-fast | Gemini 3.5 Fast · Google | $0.900 / $5.40 |
| gemini-3.1-pro-preview | Gemini 3.1 Pro Preview · Google | $0.390 / $1.74 |
| gemini-3-pro-preview | Gemini 3 Pro Preview · Google | $0.390 / $1.74 |
| gemini-3-flash-preview | Gemini 3 Flash Preview · Google | $0.050 / $0.290 |

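Errors and rate limits

Every error is returned as JSON of the form '{ error: { type, message } }'. Rate limits are computed per user, not per key, so multiple keys share one quota.

401 authentication_error: The key is missing, malformed (no ck_ prefix), expired, or disabled.

402 insufficient_credits: The balance is below 200 credits ($2), or the estimated cost of the request exceeds the balance.

403 access_denied: Your cumulative top-up tier has not unlocked the model (image, video, and audio require $10 or more; some flagship LLMs require a higher tier).

429 rate_limit_error / spending_limit_error: You hit the per-user hourly cap (chat 1000/h, image 500/h, video and audio 200/h) or the spending cap you set for the key. Rate-limited responses include the X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers.

400 invalid_request_error: model is missing, the model ID is unknown (the response includes available_models), or the endpoint does not match the request format (for example, an Anthropic model sent to /chat/completions).

502 api_error: Every upstream for the model failed. The error message includes the error string from the last upstream.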
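The status codes above map naturally onto a small decision function on the client. The sketch below checks that a body has the documented error shape and suggests a next step. The step names are invented for illustration, and the unit of X-RateLimit-Reset is not stated on this page, so the sketch does not compute a wait time from it.

```typescript
interface HyperealErrorBody {
  error: { type: string; message: string };
}

// Type guard for the documented { error: { type, message } } shape.
function isHyperealError(body: unknown): body is HyperealErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== 'object' || err === null) return false;
  const { type, message } = err as { type?: unknown; message?: unknown };
  return typeof type === 'string' && typeof message === 'string';
}

type NextStep =
  | 'fix_request' | 'check_key' | 'top_up' | 'upgrade_tier'
  | 'back_off' | 'raise_key_limit' | 'retry_later' | 'unknown';

function nextStep(status: number, errorType: string): NextStep {
  switch (status) {
    case 400: return 'fix_request';
    case 401: return 'check_key';
    case 402: return 'top_up';
    case 403: return 'upgrade_tier';
    // 429 covers two cases: waiting helps only for the rate limit.
    case 429: return errorType === 'spending_limit_error' ? 'raise_key_limit' : 'back_off';
    case 502: return 'retry_later';
    default: return 'unknown';
  }
}
```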

DEVELOPER

ComfyUI as API

Deploy a ComfyUI container as a Hypereal-managed GPU endpoint. Same per-second billing, auto-scaling, webhook delivery as any other deployment — you control the workflow graph and the model weights.

Heads up — flow changed. The legacy /comfy workflow-JSON paster and /v1/comfy/* routes were retired. ComfyUI now ships as a regular Deployment — you bring a Docker image (e.g. runpod/worker-comfyui or your own), we mount it on real GPUs.

POST /v1/gpu/run/{slug}

Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.

Submit a job (bash)

curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a cinematic portrait of an astronaut",
      "seed": 42,
      "workflow_overrides": { "Sampler.steps": 30 }
    }
  }'

Submit response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "queued",
  "provider_job_id": "..."
}

GET /v1/gpu/jobs/{id}

Poll for status. We live-poll the worker on each request so you see queued → running → succeeded in near real time. On succeeded credits settle to the actual GPU-seconds; on failed we refund the hold. Pin a webhookUrl on the deployment to skip polling.

Status response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "succeeded",
  "output": { "images": ["data:image/png;base64,..."] },
  "executionMs": 18420,
  "creditsCharged": 56
}

See your deployments (bash)

# List
curl https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY"

# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "my-comfy-workflow",
    "name": "My Comfy",
    "dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
    "gpuTypes": "ADA_48_PRO,AMPERE_80"
  }'

Workflow setup

Open /infra/deployments/new: pick a GPU tier, point at your ComfyUI Docker image (custom builds with your weights and custom nodes pre-baked work fine), set min/max workers and idle timeout. Your endpoint goes live in 60s.

Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.

ENTERPRISE

Gateway features

Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.

Cost Dashboard

Spend, by model, in real time

Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:

GET /api/api-usage/export?days=30
Authorization: session cookie

→ hypereal-usage-2026-05-10.csv

Budget Alerts

Per-key monthly cap, with email guardrails

Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.

POST /api/api-keys
{
  "name": "prod-eu",
  "spendingLimit": 50000   // 500 USD / month
}
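The 80% and 100% thresholds can be reproduced locally, for example to show a warning in your own dashboard before the email arrives. A minimal sketch with illustrative names; it uses integer arithmetic so the 80% boundary is exact.

```typescript
type BudgetState = 'ok' | 'warning' | 'capped';

// spentCredits and spendingLimit are both in credits (100 credits = $1.00).
function budgetState(spentCredits: number, spendingLimit: number): BudgetState {
  if (spentCredits >= spendingLimit) return 'capped';            // 100%: hard cap
  if (spentCredits * 5 >= spendingLimit * 4) return 'warning';   // 80%: heads up
  return 'ok';
}

// With the 50000-credit ($500) limit from the example above,
// the warning starts at 40000 credits ($400).
const atFourHundredDollars = budgetState(40000, 50000);
```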

Request Logs

Every call, searchable

Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:

GET /api/api-usage?days=30&limit=1000

{
  logs: [...],
  costByModel: [...],
  topExpensiveRequests: [...]
}

Multi-Provider Failover

Outages don't reach your users

Every supported model has a fallback chain. On 5xx, timeout, or 429 we transparently retry the next provider with exponential backoff. You always get a result or a single, clean error — never a flap.

primary:  seedance-2-0-turbo-t2v   (region us-east)
fallback: seedance-2-0-t2v         (region us-west)
fallback: seedance-2-0             (region eu-central)
retries:  1 per target, exp backoff
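The chain above (one retry per target, exponential backoff, then the next fallback) follows a common pattern that can be sketched generically. This is an illustration of the pattern, not Hypereal's gateway code: the function name, the base delay, and the choice to retry on any thrown error are all assumptions.

```typescript
// Tries each target in order; each target gets 1 + retriesPerTarget attempts.
async function withFailover<T>(
  targets: Array<() => Promise<T>>,
  retriesPerTarget: number,
  sleep: (ms: number) => Promise<void>,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown = new Error('no targets configured');
  for (const call of targets) {
    for (let attempt = 0; attempt <= retriesPerTarget; attempt++) {
      try {
        return await call();
      } catch (err) {
        lastError = err;
        // Back off before retrying the same target: base, 2×base, 4×base, ...
        if (attempt < retriesPerTarget) {
          await sleep(baseDelayMs * Math.pow(2, attempt));
        }
      }
    }
  }
  // Every target failed: surface a single error, the last one seen.
  throw lastError;
}
```

A real gateway would retry only on 5xx, timeouts, and 429, as described above; this sketch retries on every error to stay short.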

Smart Routing

Pick by intent, we pick the cheapest qualified model

Send intent instead of model and we'll route to the cheapest provider in that capability bucket — without giving up determinism: pin a model whenever you want and we'll honor it exactly.

POST /v1/images/generate
{
  "intent": "text-to-image-fast",   // ← we'll pick the cheapest qualified model
  "prompt": "a quiet sunrise over Mt Fuji"
}

# Or pin explicitly:
{ "model": "nano-banana-t2i", "prompt": "..." }

SERVERLESS

GPU models

Hosted serverless GPU inference at /v1/gpu/{slug}. One API key, credit billing, audit log, and webhooks. Same wallet and dashboard as your LLM calls.

1. Pick a model

Browse the live catalog at /gpu-recommend. Each model lists its slug, per-call or per-second credit cost, and the maximum execution time per call.

2. Sync invocation (small jobs)

Short-running models return the output inline.

curl -X POST https://api.hypereal.cloud/v1/gpu/sdxl \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a tabby cat astronaut"}}'

→ { "id": "...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../out.png"],
    "costCredits": 50,
    "durationMs": 4210 }

3. Async invocation (long jobs)

Long-running models queue and return a job id immediately with a 202. Poll, or wait for our cron + webhook poller to settle the job.

# Submit
POST /v1/gpu/wan-video
{ "input": { "prompt": "drone over Tokyo, neon, rain", "seconds": 5 } }
→ 202 { "id": "abc...", "status": "queued", "pollUrl": "/v1/gpu/jobs/abc..." }

# Poll
GET /v1/gpu/jobs/abc...
→ { "id": "abc...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../clip.mp4"],
    "costCredits": 312,
    "durationMs": 156000 }

Failed and timed-out jobs auto-refund the credit reservation. Per-second billing reconciles on completion using the model's reported execution time, capped at the model's maxSeconds.
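The reconciliation rule can be written as a one-line formula: billed seconds are the reported execution time, capped at maxSeconds. The sketch below assumes a flat credits-per-second rate and applies no rounding, since the rounding policy is not stated here; the function name and the example rate of 2 credits per second are hypothetical.

```typescript
// Credits settled for a per-second model after the job completes.
function settleCredits(
  durationMs: number,
  creditsPerSecond: number,
  maxSeconds: number,
): number {
  const billedSeconds = Math.min(durationMs / 1000, maxSeconds);
  return billedSeconds * creditsPerSecond;
}

// A 156-second run on a model capped at 120 s is billed for 120 s:
// 120 × 2 = 240 credits at the hypothetical rate.
const cappedRun = settleCredits(156000, 2, 120);
```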

ENTERPRISE

Teams, RBAC & SSO

Organizations, five built-in roles, SAML and OIDC single sign-on. Built so security and procurement can sign off without a custom rider.

Organizations

Org-scoped keys, audit log, billing

Every API key, webhook, ComfyUI workflow, and GPU template can belong to an organization instead of an individual. Teammates share one budget, one audit trail, and one invoice. Personal keys keep working alongside.

POST /api/orgs
{
  "name": "Acme Inc"
}
→ { id, slug, role: "owner" }

Five built-in roles

Owner · Admin · Developer · Billing · Viewer

Owner — everything, including delete-org
Admin — manage members, keys, SSO, webhooks
Developer — create/delete API keys, manage workflows + GPUs
Billing — view + manage payments and audit log
Viewer — read-only access to keys, billing, audit

SAML 2.0

Configure your IdP in 3 steps

Create a SAML app in Okta / Azure AD / Auth0 / Google.
Set ACS URL to https://hypereal.cloud/api/auth/sso/<providerId>
Paste the IdP metadata XML into /settings/organization → SSO.

Set the email-domain claim (e.g. acme.com) and the login form will auto-route corporate emails to your IdP — no password prompt.

OIDC

Issuer + client credentials

Drop in your issuer URL, client id, and client secret. We fetch the /.well-known/openid-configuration on save and surface a green check when the IdP is reachable.

POST /api/orgs/{id}/sso
{
  "type": "oidc",
  "issuer": "https://idp.acme.com",
  "clientId": "...",
  "clientSecret": "...",
  "domain": "acme.com"
}
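For reference, OIDC discovery documents live at a fixed path under the issuer. The sketch below shows how that URL could be derived from the issuer value in the request above; the helper name is illustrative and not part of the Hypereal API.

```typescript
// Hypothetical helper: derives the OIDC discovery URL for an issuer.
function discoveryUrl(issuer: string): string {
  // Strip trailing slashes so the path separator is not doubled.
  const base = issuer.replace(/\/+$/, '');
  return `${base}/.well-known/openid-configuration`;
}

console.log(discoveryUrl('https://idp.acme.com'));
```

If this URL is not reachable from the public internet, the reachability check on save is unlikely to pass.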

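Pricing and credits

One unit everywhere: 100 credits = $1.00 USD. LLMs are billed per token at each model's input/output rates. Media models are billed per image, per second, or per clip.

Large language models

Tokens × price per million tokens. Streaming requests are billed from the final usage chunk.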
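As a worked example of the token formula, the sketch below estimates the credit cost of one call. The function name is an assumption, and any rounding Hypereal applies when it settles the charge is not documented here, so treat the result as an estimate.

```typescript
// 100 credits = $1.00, per the pricing section.
const CREDITS_PER_USD = 100;

// Estimates the credit cost of one LLM call from its token usage.
// Prices are USD per million tokens (MTok), as listed in the model tables.
function llmCostCredits(
  inputTokens: number,
  outputTokens: number,
  inputUsdPerMTok: number,
  outputUsdPerMTok: number,
): number {
  const usd =
    (inputTokens / 1_000_000) * inputUsdPerMTok +
    (outputTokens / 1_000_000) * outputUsdPerMTok;
  return usd * CREDITS_PER_USD;
}

// gpt-5.5 at $0.420 / $2.52 per MTok: 10,000 input + 2,000 output tokens.
const exampleCredits = llmCostCredits(10_000, 2_000, 0.42, 2.52);
console.log(exampleCredits.toFixed(3)); // about 0.924 credits
```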

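Images

Billed per generation × the number of images (n) actually returned.

Video and audio

Billed per second (most video models), per clip (Veo, Vidu, Grok), or per request (Fish Audio).

Claude, GPT, Gemini, and some image models (GPT Image 2, Nano Banana) are priced below the official direct APIs. Video, audio, and other media models are billed at standard prices.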


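01 · Up and running in 90 seconds

Quick start

Get a key, point your client at hypereal.cloud, and go live. Authentication and request shape are both OpenAI-compatible; most SDKs only need a Base URL change.

1. Get a key

Top up at least $2 (200 credits) and create a key at /manage-api-keys. Keys start with ck_.

2. Configure your client

Base URL: https://hypereal.cloud/api/v1

3. Send a request

Authenticate with the header Authorization: Bearer ck_.... The request body is identical to the OpenAI format you already know.

curl (bash)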

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Say hi in one word."}]
  }'

Node, OpenAI SDK (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://hypereal.cloud/api/v1',
});

const completion = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Say hi in one word.' }],
});

console.log(completion.choices[0].message.content);

SDK

Hypereal SDK

Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.

Install

Published as hypereal-sdk on npm.

Resources

Use client.images.generate(), chat, responses, jobs and storage.

Landing page

See the full SDK overview at /sdk.

Install (bash)

pnpm add hypereal-sdk

Quickstart (ts)

import { Hypereal } from 'hypereal-sdk';

const client = new Hypereal({
  apiKey: process.env.HYPEREAL_API_KEY!,
});

const image = await client.images.generate({
  model: 'gemini-3-1-flash-t2i',
  prompt: 'A cinematic portrait in neon light',
  aspect_ratio: '16:9',
});

console.log(image);

Storage upload (ts)

const object = await client.storage.uploadFile(file, {
  filename: 'training-image.png',
  contentType: 'image/png',
  kind: 'dataset',
});

const listed = await client.storage.list({ kind: 'dataset' });

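Authentication

Every request needs a ck_-prefixed key. We accept three header formats, covering all SDKs; send any one of them.

Authorization (header): Bearer ck_... Used by the OpenAI SDK, Codex CLI, and Cursor.
x-api-key (header): ck_... Used by the Anthropic SDK and Claude Code on /v1/messages.
x-goog-api-key (header): ck_... Used by the Google Gemini SDK and native format on /v1/gemini; the query parameter ?key=ck_... also works.

Keys are bound to your user account and count against the per-key spending limit you set at /manage-api-keys. Rate limits are per user, not per key.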
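The three header formats can be captured in one small helper. This is a sketch: the function and the style names are illustrative, and only the header names and the ck_ prefix come from this page.

```typescript
type AuthStyle = 'openai' | 'anthropic' | 'gemini';

// Builds the auth header for one of the three accepted formats.
function authHeaders(style: AuthStyle, key: string): Record<string, string> {
  if (!key.startsWith('ck_')) {
    throw new Error('Hypereal keys start with ck_');
  }
  switch (style) {
    case 'openai':
      return { Authorization: `Bearer ${key}` };
    case 'anthropic':
      return { 'x-api-key': key };
    case 'gemini':
      return { 'x-goog-api-key': key };
  }
}

console.log(authHeaders('anthropic', 'ck_example'));
```

Checking the prefix client-side catches a key from another provider before it turns into a 401.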

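03 · OpenAI-compatible

Chat Completions

The workhorse endpoint, speaking the OpenAI Chat Completions protocol. Use it for GPT, Gemini, Qwen, DeepSeek, GLM, and every other non-Anthropic LLM.

POST /api/v1/chat/completions

Request body

model (string, required): any non-Anthropic model ID; see the table below. Anthropic models return 400; use /v1/messages instead.
messages (Message[], required): standard OpenAI message array (role, content).
stream (boolean, optional): defaults to false. When true, returns an SSE stream; usage is delivered with the final chunk.
max_tokens (number, optional): forwarded to the upstream as-is; each provider's default applies.
temperature, top_p, tools, ... (any, optional): other OpenAI parameters pass through unchanged.

Pricing

Billed per token at each model's input/output rates. 100 credits = $1.00. The minimum balance to call this endpoint is 200 credits ($2.00).

curl, streaming (bash)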

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "Two-line haiku about caches."}
    ],
    "stream": true,
    "max_tokens": 256
  }'

Node, OpenAI SDK streaming (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
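Because usage arrives with the final chunk, a streaming client that wants to track cost has to keep the last usage object it sees. A minimal sketch of that bookkeeping follows; the chunk interface lists only the fields used here, and it assumes the final chunk carries usage with an empty choices array, as in OpenAI's protocol.

```typescript
// Minimal shape of a Chat Completions stream chunk (only the fields used here).
interface StreamChunk {
  choices: { delta?: { content?: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number } | null;
}

interface StreamState {
  text: string;
  usage: { prompt_tokens: number; completion_tokens: number } | null;
}

// Folds one chunk into the running state: appends text, keeps the latest usage.
function foldChunk(state: StreamState, chunk: StreamChunk): StreamState {
  return {
    text: state.text + (chunk.choices[0]?.delta?.content ?? ''),
    usage: chunk.usage ?? state.usage,
  };
}

// Hand-written chunks for illustration; in practice, call foldChunk inside
// the `for await` loop over the SDK stream.
const sampleChunks: StreamChunk[] = [
  { choices: [{ delta: { content: 'Hel' } }] },
  { choices: [{ delta: { content: 'lo' } }] },
  { choices: [], usage: { prompt_tokens: 9, completion_tokens: 2 } },
];
const finalState = sampleChunks.reduce<StreamState>(foldChunk, { text: '', usage: null });
console.log(finalState.text, finalState.usage);
```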

OpenAI and same-protocol models

Model ID | Name | Input / Output
gpt-5.5 | GPT-5.5 · OpenAI | $0.420 / $2.52 per MTok
gpt-5.5-instant | GPT-5.5 Instant · OpenAI | $0.420 / $2.52 per MTok
gpt-5.5-pro | GPT-5.5 Pro · OpenAI | $2.07 / $12.42 per MTok
gpt-5.4 | GPT-5.4 · OpenAI | $0.240 / $1.09 per MTok
gpt-5.4-mini | GPT-5.4 Mini · OpenAI | $0.040 / $0.250 per MTok
deepseek-v4-pro | DeepSeek V4 Pro · DeepSeek | $0.490 / $0.990 per MTok
deepseek-v4-flash | DeepSeek V4 Flash · DeepSeek | $0.160 / $0.330 per MTok
deepseek-v3.2 | DeepSeek V3.2 · DeepSeek | $0.230 / $0.920 per MTok
kimi-k2.6 | Kimi K2.6 · Moonshot | $1.07 / $4.44 per MTok
kimi-k2.5 | Kimi K2.5 · Moonshot | $0.460 / $2.42 per MTok
glm-5.1 | GLM-5.1 · Zhipu | $0.990 / $3.94 per MTok
glm-5 | GLM-5 · Zhipu | $0.460 / $2.07 per MTok
qwen3-max | Qwen 3 Max · Alibaba | $0.810 / $3.22 per MTok
qwen3.5-plus | Qwen 3.5 Plus · Alibaba | $0.460 / $2.76 per MTok
qwen3.5-flash | Qwen 3.5 Flash · Alibaba | $0.140 / $1.38 per MTok
MiniMax-M2.5 | MiniMax M2.5 · MiniMax | $0.240 / $0.970 per MTok

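04 · Anthropic-compatible

Messages

The Anthropic /v1/messages protocol, with extended thinking, multi-upstream failover, and a 15-second SSE heartbeat. Claude Code, OpenCode, OpenClaw, and the official Anthropic SDK all use this endpoint.

POST /api/v1/messages

Request body

model (string): model ID; see the Anthropic models table below.
messages (Message[], required): Anthropic-format message array, including content blocks such as image and tool_use.
max_tokens (number, required): required by the Anthropic protocol.
cache_control ({ type: "ephemeral" }, optional): add it to stable system, tools, or text content blocks for Anthropic prompt caching. Hypereal defaults a cache breakpoint when omitted and reports cache usage in response metadata.
hypereal.cache ("auto" | false, optional): Hypereal Cache is on by default. Use "auto" to make the default explicit for repeated requests, or false to bypass it for a request.
thinking ({ type: "enabled" | "adaptive", budget_tokens?: number }, optional): extended thinking. budget_tokens caps the length of the reasoning trace. The endpoint sends an SSE heartbeat every 15 seconds so that proxies do not time out long thinking streams.
stream, system, tools, ... (any, optional): passed through as in the Anthropic SDK.

When a retry switches upstreams, stale thinking blocks whose signatures are no longer valid are filtered out automatically; you do not need to handle them.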
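To illustrate the thinking parameter, the sketch below builds a request body with extended thinking enabled. The builder function is hypothetical. The check that budget_tokens stays below max_tokens reflects Anthropic's own API rule and is assumed, not confirmed, to apply through Hypereal as well.

```typescript
interface ThinkingRequest {
  model: string;
  max_tokens: number;
  thinking: { type: 'enabled'; budget_tokens: number };
  messages: { role: 'user' | 'assistant'; content: string }[];
}

// Builds a /v1/messages body with extended thinking enabled.
// Assumption: budget_tokens must stay below max_tokens, as in Anthropic's API.
function buildThinkingRequest(
  model: string,
  prompt: string,
  maxTokens: number,
  budgetTokens: number,
): ThinkingRequest {
  if (budgetTokens >= maxTokens) {
    throw new Error('budget_tokens must be smaller than max_tokens');
  }
  return {
    model,
    max_tokens: maxTokens,
    thinking: { type: 'enabled', budget_tokens: budgetTokens },
    messages: [{ role: 'user', content: prompt }],
  };
}

const thinkingBody = buildThinkingRequest('claude-sonnet-4-6', 'Plan a 3-step refactor.', 4096, 2048);
console.log(JSON.stringify(thinkingBody, null, 2));
```

The resulting object can be sent as the JSON body of POST /api/v1/messages with the x-api-key header.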

curl, prompt caching (bash)

curl https://api.hypereal.cloud/v1/messages \
  -H "x-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "system": [{
      "type": "text",
      "text": "You are a senior TypeScript refactoring assistant.",
      "cache_control": {"type": "ephemeral"}
    }],
    "messages": [
      {"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
    ],
    "hypereal": {"cache": "auto"}
  }'

Node, Anthropic SDK (ts)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://api.hypereal.cloud',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [{
    type: 'text',
    text: 'You are a senior TypeScript refactoring assistant.',
    cache_control: { type: 'ephemeral' },
  }],
  hypereal: { cache: 'auto' },
  messages: [{ role: 'user', content: 'Hello, Claude.' }],
});

console.log(msg.content);

Anthropic models

Model ID | Name | Input / Output
claude-opus-4-7 | Claude Opus 4.7 · Anthropic | $3.40 / $16.96 per MTok
claude-opus-4-6 | Claude Opus 4.6 · Anthropic | $3.40 / $16.96 per MTok
claude-sonnet-4-6 | Claude Sonnet 4.6 · Anthropic | $0.680 / $3.40 per MTok
claude-haiku-4-5 | Claude Haiku 4.5 · Anthropic | $0.130 / $0.650 per MTok
managed-claude-opus-4-7-max | Claude Opus 4.7 · Hypereal Managed | $5.25 / $26.25 per MTok
managed-claude-opus-4-6-max | Claude Opus 4.6 (1M) · Hypereal Managed | $5.25 / $26.25 per MTok
managed-claude-opus-4-5-max | Claude Opus 4.5 · Hypereal Managed | $5.25 / $26.25 per MTok
managed-claude-sonnet-4-6-max | Claude Sonnet 4.6 · Hypereal Managed | $3.15 / $15.75 per MTok
managed-claude-sonnet-4-5-max | Claude Sonnet 4.5 · Hypereal Managed | $3.15 / $15.75 per MTok
managed-claude-haiku-4-5-max | Claude Haiku 4.5 · Hypereal Managed | $1.05 / $5.25 per MTok

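05 · OpenAI Responses API

Responses

OpenAI's newer Responses API, used by Codex CLI's `wire_api = responses` mode and the OpenAI Agents SDK. Authentication matches chat/completions; the request body uses `input` instead of `messages`.

POST /api/v1/responses

Notes

Anthropic models return 400; use /v1/messages instead.
Streaming or not, billing is based on response.usage.input_tokens / output_tokens.
Some upstreams always return SSE. Even with stream: false, the endpoint detects this and passes the streamed response through.
Multi-upstream failover is supported. Set your client timeout to 300 seconds or more.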
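Since a non-streaming request can still come back as SSE, a hand-rolled client may want to branch on the response Content-Type. A simplified sketch, assuming one JSON object per data: line and a text/event-stream content type for streamed replies; the function names are illustrative.

```typescript
// True when the Content-Type indicates a Server-Sent Events stream.
function isEventStream(contentType: string | null): boolean {
  return (contentType ?? '').toLowerCase().startsWith('text/event-stream');
}

// Extracts the JSON payloads from the `data:` lines of an SSE body.
// Simplified: assumes one JSON object per `data:` line and skips "[DONE]".
function parseSseData(body: string): unknown[] {
  return body
    .split(/\r?\n/)
    .filter((line) => line.startsWith('data:'))
    .map((line) => line.slice('data:'.length).trim())
    .filter((payload) => payload !== '' && payload !== '[DONE]')
    .map((payload) => JSON.parse(payload) as unknown);
}

// Reads a fetch-style response either as plain JSON or as a list of SSE events.
async function readResponse(res: {
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}): Promise<unknown> {
  const text = await res.text();
  return isEventStream(res.headers.get('content-type'))
    ? parseSseData(text)
    : JSON.parse(text);
}
```

The official OpenAI SDK handles the streaming case when stream: true is set; this kind of check matters mainly for clients built directly on fetch.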

curl (bash)

curl https://hypereal.cloud/api/v1/responses \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a TypeScript function that debounces a callback.",
    "stream": true
  }'

Node, OpenAI SDK responses.create (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const response = await client.responses.create({
  model: 'gpt-5.3-codex',
  input: 'Refactor this file into smaller modules.',
});

console.log(response.output_text);

Codex-optimized models

Model ID | Name | Input / Output
gpt-5.3-codex | GPT-5.3 Codex · OpenAI | $0.090 / $0.680 per MTok
gpt-5.3-codex-spark | GPT-5.3 Codex Spark · OpenAI | $0.090 / $0.680 per MTok

06 · Codex CLI / Codex Desktop

Codex CLI

Codex sends requests for a provider with `wire_api = responses` to /api/v1/responses. The CLI appends `/responses` to the Base URL automatically, so set the Base URL as in the example below.

POST /api/v1/responses

~/.codex/config.toml (toml)

# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5.3-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.cloud/api/v1"
wire_api = "responses"
env_key = "HYPEREAL_API_KEY"

Then export your key:
export HYPEREAL_API_KEY=ck_...

OpenCode, Claude Code (via /v1/messages), Cursor (via /v1/chat/completions), and Gemini CLI (via /v1/gemini) are configured the same way.

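Image generation

The OpenAI-compatible /images/generations protocol. Calls are synchronous: once the upstream finishes, the endpoint returns the image URL (or base64) directly. Billed per image; `n` is limited to 1–10.

POST /api/v1/images/generations

Request body

model (string, required): image model ID; see the table below.
prompt (string, required): text prompt. For models that support image-to-image, pass reference images through that model's native parameters (for example image, reference_images).
n (number, optional): number of images, 1–10 (default 1).
size (string, optional): forwarded as-is, for example 1024x1024 or 1536x1024. Support depends on the upstream.
quality, style, ... (any, optional): other parameters pass through to the upstream.

Tier requirement: image generation requires the Starter tier ($10 or more in cumulative top-ups). If your balance cannot cover the estimated creditsPerGeneration × n, the endpoint returns 402.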
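The 402 rule can be checked client-side before sending a request. The sketch below assumes creditsPerGeneration equals the listed per-image USD price times 100, which follows from the 100 credits = $1.00 unit but is not stated explicitly for every model; the function names are illustrative.

```typescript
// Validates n and estimates the credits needed for an image request.
// pricePerImageUsd comes from the image model table; 100 credits = $1.00.
function estimateImageCredits(pricePerImageUsd: number, n: number = 1): number {
  if (!Number.isInteger(n) || n < 1 || n > 10) {
    throw new RangeError('n must be an integer between 1 and 10');
  }
  return pricePerImageUsd * 100 * n;
}

// True when the balance covers the estimate; otherwise expect a 402.
function canAfford(balanceCredits: number, pricePerImageUsd: number, n: number): boolean {
  return balanceCredits >= estimateImageCredits(pricePerImageUsd, n);
}

// nano_banana_pro at $0.100 / image, four images.
console.log(estimateImageCredits(0.1, 4), canAfford(30, 0.1, 4));
```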

Use an image model ID here, not a chat model ID. Valid examples include gpt-image-2, nano_banana_pro, and gemini-3-1-flash-t2i. Use gpt-5.5 only with chat, messages, or responses endpoints.

curl (bash)

curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana_pro",
    "prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
    "n": 1,
    "size": "1024x1024"
  }'

Node, fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'nano_banana_pro',
    prompt: 'a chrome teapot floating over the ocean at sunset',
    n: 1,
  }),
});

const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model

Model

GPT Image 2 — text-to-image & image-to-image

size accepts 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), 2048x2048, 4096x4096. 2K and 4K are square only.
Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.

# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "a chrome teapot floating over the ocean at sunset",
    "size": "1536x1024"
  }'

# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "same character, snowy mountain background, golden hour",
    "size": "1024x1024",
    "reference_images": [
      "https://example.com/source.jpg"
    ]
  }'

Model

NanoBanana 2 — image-to-image & multimodal inputs

Supported aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9.
Supported resolution: 0.5K, 1K, 2K, 4K.
Reference images may be public HTTPS URLs or base64 data URLs.
Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.

# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-1-flash-t2i",
    "prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "image_urls": [
      "https://example.com/character.png",
      "https://example.com/jacket.png",
      "https://example.com/scene.png"
    ]
  }'

Hosting

Calling the API from a subdomain on shared hosting

No special setup is required. Our API accepts requests from any origin, and there are no domain allowlists by default. A few things catch shared-host users out, though:

Make API calls from your server, not the browser. Calling the API directly from client-side JavaScript would expose your ck_… key to every visitor. Always proxy through your own backend (PHP, Node, Python — whatever your subdomain runs).
Set a generous request timeout. Image and video calls can hold the connection open up to ~120 s (image) or ~300 s (video). Many shared hosts cap PHP/cURL at 30 s by default — raise max_execution_time, CURLOPT_TIMEOUT, and your reverse-proxy / FastCGI read timeout.
Lock keys to your subdomain (optional). In the dashboard you can scope an API key to a specific Origin or IP — recommended if your subdomain handles untrusted traffic.
Use HTTPS. Some shared-hosting subdomains default to HTTP — outbound HTTPS is required to reach the API.

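<?php
// Minimal PHP server-side proxy (drop into /api/generate.php)
$body = file_get_contents('php://input');
$ch = curl_init('https://hypereal.cloud/api/v1/images/generations');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_TIMEOUT        => 180,    // raise above shared-host default
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => $body,
  CURLOPT_HTTPHEADER     => [
    'Authorization: Bearer ' . getenv('HYPEREAL_API_KEY'),
    'Content-Type: application/json',
  ],
]);
$response = curl_exec($ch);
header('Content-Type: application/json');
if ($response === false) {
  // Transport failure (timeout, DNS, TLS): report it instead of an empty body
  http_response_code(502);
  $response = json_encode(['error' => curl_error($ch)]);
}
curl_close($ch);
echo $response;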

Image models

Model ID | Name | Price
gpt-image-2 | GPT Image 2 · OpenAI | $0.030 / image
gpt-4o-image | GPT-4o Image · OpenAI | $0.012 / image
nano_banana | Nano Banana · Nano Banana | $0.024 / image
nano_banana_2 | Nano Banana 2 · Nano Banana | $0.040 / image
gemini-3.1-flash-image-preview | Gemini 3.1 Flash Image · Google | $0.050 / image
gemini-2.5-flash-image-preview | Gemini 2.5 Flash Image · Google | $0.024 / image
flux-kontext-pro | Flux Kontext Pro · Flux | $0.040 / image
flux-2-pro | Flux 2 Pro · Flux | $0.050 / image
doubao-seedream-4-0 | Doubao Seedream 4.0 · ByteDance | $0.057 / image
doubao-seedream-4-5 | Doubao Seedream 4.5 · ByteDance | $0.071 / image
doubao-seedream-5-0 | Doubao Seedream 5.0 · ByteDance | $0.063 / image
gemini-3.1-flash-image-preview-official | Gemini 3.1 Flash Image (Official) · Google | $0.064 / image
flux-kontext-max | Flux Kontext Max · Flux | $0.080 / image
gemini-2.5-flash-image-official | Gemini 2.5 Flash Image (Official) · Google | $0.098 / image
nano_banana_pro | Nano Banana Pro · Nano Banana | $0.100 / image
gemini-3-pro-image-preview | Gemini 3 Pro Image · Google | $0.100 / image
flux-2-flex | Flux 2 Flex · Flux | $0.140 / image
gemini-3-pro-image-preview-official | Gemini 3 Pro Image (Official) · Google | $0.216 / image
gemini-3-pro-image-preview-4K | Gemini 3 Pro Image 4K · Google | $0.190 / image
gemini-3.1-fast-imagen | Gemini 3.1 Fast Imagen · Google | $0.020 / image
gemini-3.1-thinking-imagen | Gemini 3.1 Thinking Imagen · Google | $0.020 / image

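08 · Long-running jobs

Video generation

An asynchronous video endpoint: create a job, then poll the returned job URL until the video is ready. Most models are billed per second; models such as Gemini Omni Flash, Veo, Vidu, and Grok are billed per clip.

POST /api/v1/videos/generate

Request body

model (string, required): video model ID; see the table below.
prompt (string, required): text prompt describing the video.
duration (number, optional): length in seconds. Gemini Omni Flash supports 6 or 10; wider ranges apply only to per_second models.
aspect_ratio (string, optional): for example 16:9, 9:16, or 1:1. Support depends on the upstream. Gemini Omni Flash accepts 16:9 or 9:16.
resolution (string, optional): forwarded when the selected model supports resolution. Gemini Omni Flash currently accepts 720P.
image_urls (string[], optional): for Gemini Omni Flash, pass 1-3 uploaded or public image URLs as visual references. Upload local images first and send the returned URL; direct base64 image payloads are not supported.
image_url (string, optional): first-frame image for image-to-video models. Some models also accept last_image_url or image; see the upstream documentation for the model.

Note: video generation is an asynchronous job. The create call returns a jobId and a pollUrl; poll server-side until the status is completed, then display the returned MP4 URL.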
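The create-then-poll flow can be sketched as a small loop. The job lookup is injected as a function so the sketch does not assume the exact poll response shape. The "failed" status name and the url field are assumptions; only "completed" is named on this page.

```typescript
interface VideoJob {
  status: string; // "completed" is documented; "failed" is an assumed name
  url?: string;   // hypothetical field holding the MP4 URL
}

// Polls until the job reaches a terminal status or attempts run out.
async function pollUntilDone(
  getJob: () => Promise<VideoJob>,
  intervalMs = 5000,
  maxAttempts = 60,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<VideoJob> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    if (job.status === 'completed') return job;
    if (job.status === 'failed') throw new Error('video job failed');
    await sleep(intervalMs);
  }
  throw new Error('timed out waiting for video job');
}
```

In a real client, getJob would fetch the pollUrl returned by the create call with the same Authorization header, and the defaults (5 s interval, 60 attempts) should be tuned to the model's typical render time.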

curl, text + image-to-video (bash)

curl https://hypereal.cloud/api/v1/videos/generate \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini_omni_flash",
    "prompt": "a white cube rotating on a black background, clean product demo",
    "duration": 6,
    "aspect_ratio": "16:9",
    "resolution": "720P",
    "image_urls": [
      "https://example.com/product-reference.png"
    ]
  }'

Node, fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/videos/generate', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini_omni_flash',
    prompt: 'a cat walking on the moon, cinematic, no text',
    duration: 6,
    aspect_ratio: '16:9',
    resolution: '720P',
    image_urls: ['https://example.com/cat-reference.png'],
  }),
});

const data = await res.json();
console.log(data.jobId, data.pollUrl); // poll /v1/jobs/{id} for the mp4

Video models

Model ID | Name | Price
happyhorse-1.0 | HappyHorse 1.0 · Alibaba | $0.110 / 720p / second · $0.190 / 1080p / second
gemini_omni_flash | Gemini Omni Flash · Google | $0.180 / clip
wan2.6-flash | WAN 2.6 Flash · Alibaba | $0.060 / sec
kling-2-6 | Kling 2.6 · Kuaishou | $0.074 / sec
MiniMax-Hailuo-02 | MiniMax Hailuo 02 · MiniMax | $0.080 / sec
doubao-seedance-1-0-pro-fast | Doubao Seedance Pro Fast · ByteDance | $0.083 / sec
MiniMax-Hailuo-2.3 | MiniMax Hailuo 2.3 · MiniMax | $0.098 / sec
wan2.6 | WAN 2.6 · Alibaba | $0.100 / sec
kling-video-o1 | Kling Video O1 · Kuaishou | $0.134 / sec
kling-v3-omni | Kling V3 Omni · Kuaishou | $0.134 / sec
kling-v3 | Kling V3 · Kuaishou | $0.134 / sec
kling-v3-video | Kling V3 Video · Kuaishou | $0.134 / sec
doubao-seedance-1-0-pro-quality | Doubao Seedance Pro Quality · ByteDance | $0.208 / sec
doubao-seedance-2-0 | Doubao Seedance 2.0 · ByteDance | $0.200 / sec
doubao-seedance-2-0-fast | Doubao Seedance 2.0 Fast · ByteDance | $0.105 / sec
doubao-seedance-1-5-pro | Doubao Seedance 1.5 Pro · ByteDance | $0.216 / sec
Veo3.1-fast-official | Veo 3.1 Fast · Google | $0.160 / sec
Veo3.1-quality-official | Veo 3.1 Quality · Google | $0.320 / sec
veo3.1-fast | Veo 3.1 Fast · Google | $0.160 / clip
veo3.1-quality | Veo 3.1 Quality · Google | $1.20 / clip
vidu-q3-pro | Vidu Q3 Pro · Vidu | $0.020 / clip
grok-video-3 | Grok Video 3 · xAI | $0.160 / clip

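09 · Fish Audio

Audio: TTS, voice cloning, ASR

Three model IDs share one endpoint. The request body and response shape depend on the model you call. The provider is Fish Audio (direct, not routed through ToAPI), billed per request.

POST /api/v1/audio/generations

model ("audio-tts" | "audio-clone" | "audio-asr", required): selects the operation.
text (string, optional): required for audio-tts and audio-clone.
audio (string (URL), optional): required for audio-asr (the input) and audio-clone (reference audio of at least 10 seconds).
voice_id, format, sample_rate, ... (any, optional): other Fish Audio parameters pass through.

Response shape: data: [{ url }] for TTS and cloning; text (plus optional segments and duration) for ASR.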
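Because the required fields vary by model, a small pre-flight check can catch a malformed request before it is sent. A sketch based on the field rules above; the type and function names are illustrative.

```typescript
type AudioModel = 'audio-tts' | 'audio-clone' | 'audio-asr';

interface AudioRequest {
  model: AudioModel;
  text?: string;
  audio?: string; // URL of the input or reference audio
}

// Returns the names of required fields that are missing for the chosen model.
function missingAudioFields(req: AudioRequest): string[] {
  const missing: string[] = [];
  const needsText = req.model === 'audio-tts' || req.model === 'audio-clone';
  const needsAudio = req.model === 'audio-asr' || req.model === 'audio-clone';
  if (needsText && !req.text) missing.push('text');
  if (needsAudio && !req.audio) missing.push('audio');
  return missing;
}

console.log(missingAudioFields({ model: 'audio-clone', text: 'This is my cloned voice.' }));
```

The check does not verify the 10-second minimum for cloning references, since that would require fetching and decoding the audio.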

TTS (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-tts",
    "text": "Welcome to Hypereal. One key, every model.",
    "voice_id": "en_male_calm"
  }'

Voice cloning (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-clone",
    "text": "This is my cloned voice.",
    "audio": "https://example.com/reference-30s.mp3"
  }'

ASR, speech to text (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-asr",
    "audio": "https://example.com/recording.mp3"
  }'

Audio models

Model ID | Name | Price
audio-tts | Text to Speech · Fish Audio | $0.020 / request
audio-clone | Voice Clone · Fish Audio | $0.020 / request
audio-asr | Speech Recognition · Fish Audio | $0.010 / request

10 · Google native protocol

Gemini

POST /api/v1/gemini

model (string, required): any Gemini model ID; see the table below.
contents (Content[], optional): Gemini-native message array.
systemInstruction (Content, optional): system message in Gemini format.
generationConfig (object, optional): temperature, maxOutputTokens, and so on.
messages (Message[], optional): OpenAI format, as an alternative to contents.

Auth header: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.
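To show how the two accepted shapes correspond, the sketch below converts OpenAI-style messages to Gemini-native contents. It illustrates the mapping only; how the endpoint itself converts messages is not documented here, and the function name is an assumption.

```typescript
interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface GeminiContent { role: 'user' | 'model'; parts: { text: string }[] }

// Converts OpenAI-style messages to Gemini-native contents.
// System messages are merged into a single systemInstruction;
// the OpenAI "assistant" role maps to Gemini's "model" role.
function toGemini(messages: ChatMessage[]): {
  contents: GeminiContent[];
  systemInstruction?: { parts: { text: string }[] };
} {
  const systemText = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const contents = messages
    .filter((m) => m.role !== 'system')
    .map((m): GeminiContent => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));
  return systemText
    ? { contents, systemInstruction: { parts: [{ text: systemText }] } }
    : { contents };
}

console.log(JSON.stringify(toGemini([
  { role: 'system', content: 'Be brief.' },
  { role: 'user', content: 'Outline a launch plan.' },
])));
```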

curl, Gemini native (bash)

curl "https://hypereal.cloud/api/v1/gemini" \
  -H "x-goog-api-key: ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-thinking",
    "contents": [
      {"role": "user", "parts": [{"text": "Outline a launch plan."}]}
    ],
    "generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
  }'

Node, fetch (ts)

// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
  method: 'POST',
  headers: {
    'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3.5-fast',
    contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
  }),
});

console.log(await res.json());

Gemini models

Model ID | Name | Input / Output
gemini-3.5-thinking | Gemini 3.5 Thinking · Google | $0.900 / $5.40 per MTok
gemini-3.5-fast | Gemini 3.5 Fast · Google | $0.900 / $5.40 per MTok
gemini-3.1-pro-preview | Gemini 3.1 Pro Preview · Google | $0.390 / $1.74 per MTok
gemini-3-pro-preview | Gemini 3 Pro Preview · Google | $0.390 / $1.74 per MTok
gemini-3-flash-preview | Gemini 3 Flash Preview · Google | $0.050 / $0.290 per MTok

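Errors and rate limits

All errors return JSON of the form { error: { type, message } }. Rate limits are per user, not per key; multiple keys share one quota.

401 authentication_error: the key is missing, malformed (no ck_ prefix), expired, or disabled.
402 insufficient_credits: the balance is below 200 credits ($2), or the estimated cost of this request exceeds the balance.
403 access_denied: your cumulative top-up tier has not unlocked the model (image, video, and audio require $10 or more; some flagship LLMs require a higher tier).
429 rate_limit_error / spending_limit_error
400 invalid_request_error: model is missing, the model ID is unknown (the response includes available_models), or the endpoint and request format do not match (for example, an Anthropic model sent to /chat/completions).
502 api_error: every upstream for the model failed. The message includes the error string returned by the last upstream.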
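A client can use the error envelope and status code to decide whether a retry makes sense. The sketch below is one possible policy, not a documented recommendation: it retries rate limits and upstream failures and treats everything else as permanent.

```typescript
interface HyperealError { error: { type: string; message: string } }

// Type guard for the documented { error: { type, message } } envelope.
function isHyperealError(body: unknown): body is HyperealError {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== 'object' || err === null) return false;
  const { type, message } = err as { type?: unknown; message?: unknown };
  return typeof type === 'string' && typeof message === 'string';
}

// Assumed policy: retry 429 rate limits (not spending limits) and 502s.
// 400, 401, 402 and 403 need a change on the caller's side first.
function isRetryable(status: number, type?: string): boolean {
  if (status === 429) return type !== 'spending_limit_error';
  return status === 502;
}

console.log(isRetryable(429, 'rate_limit_error'), isRetryable(402));
```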

DEVELOPER

ComfyUI as API

POST /v1/gpu/run/{slug}

Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.

Submit a job (bash)

curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a cinematic portrait of an astronaut",
      "seed": 42,
      "workflow_overrides": { "Sampler.steps": 30 }
    }
  }'

Submit response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "queued",
  "provider_job_id": "..."
}

GET /v1/gpu/jobs/{id}

Status response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "succeeded",
  "output": { "images": ["data:image/png;base64,..."] },
  "executionMs": 18420,
  "creditsCharged": 56
}

See your deployments (bash)

# List
curl https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY"

# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "my-comfy-workflow",
    "name": "My Comfy",
    "dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
    "gpuTypes": "ADA_48_PRO,AMPERE_80"
  }'

Workflow setup

Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.

ENTERPRISE

Gateway features

Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.

Cost Dashboard

Spend, by model, in real time

Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:

GET /api/api-usage/export?days=30
Authorization: session cookie

→ hypereal-usage-2026-05-10.csv

Budget Alerts

Per-key monthly cap, with email guardrails

Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.

POST /api/api-keys
{
  "name": "prod-eu",
  "spendingLimit": 50000   // 500 USD / month
}
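The two thresholds can be mirrored client-side, for example to show a warning in an internal dashboard. A sketch with illustrative names; treating a non-positive limit as "no limit configured" is an assumption.

```typescript
type BudgetAlert = 'none' | 'warning' | 'cap';

// Maps month-to-date spend against a key's spendingLimit (both in credits)
// to the two documented thresholds: 80% (heads up) and 100% (hard cap).
function budgetAlert(spentCredits: number, spendingLimit: number): BudgetAlert {
  if (spendingLimit <= 0) return 'none'; // assumption: no limit configured
  const ratio = spentCredits / spendingLimit;
  if (ratio >= 1) return 'cap';
  if (ratio >= 0.8) return 'warning';
  return 'none';
}

// With the 50000-credit ($500) limit from the example above:
console.log(budgetAlert(40000, 50000), budgetAlert(50000, 50000));
```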

Request Logs

Every call, searchable

Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:

GET /api/api-usage?days=30&limit=1000

{
  logs: [...],
  costByModel: [...],
  topExpensiveRequests: [...]
}

Multi-Provider Failover

Outages don't reach your users

primary:  seedance-2-0-turbo-t2v   (region us-east)
fallback: seedance-2-0-t2v         (region us-west)
fallback: seedance-2-0             (region eu-central)
retries:  1 per target, exp backoff
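The gateway applies this policy server-side. For readers who want the same behavior in their own code, the sketch below walks an ordered target list with one retry per target and exponential backoff. It is a client-side illustration, not Hypereal's implementation, and the delay values are assumptions.

```typescript
// Tries each target in order; each target gets retriesPerTarget extra
// attempts, with an exponentially growing delay between attempts.
async function withFailover<T>(
  targets: string[],
  call: (target: string) => Promise<T>,
  retriesPerTarget = 1,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown = new Error('no targets configured');
  for (const target of targets) {
    for (let attempt = 0; attempt <= retriesPerTarget; attempt++) {
      try {
        return await call(target);
      } catch (err) {
        lastError = err;
        // Back off only when another attempt on this target follows.
        if (attempt < retriesPerTarget) {
          await sleep(baseDelayMs * 2 ** attempt);
        }
      }
    }
  }
  throw lastError;
}
```

With three targets and one retry each, a request is attempted at most six times before the last error is surfaced.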

Smart Routing

Pick by intent, we pick the cheapest qualified model

Send intent instead of model and we'll route to the cheapest provider in that capability bucket — without giving up determinism: pin a model whenever you want and we'll honor it exactly.

POST /v1/images/generate
{
  "intent": "text-to-image-fast",   // ← we'll pick the cheapest qualified model
  "prompt": "a quiet sunrise over Mt Fuji"
}

# Or pin explicitly:
{ "model": "nano-banana-t2i", "prompt": "..." }

SERVERLESS

GPU models

Hosted serverless GPU inference at /v1/gpu/{slug}. One API key, credit billing, audit log, and webhooks. Same wallet and dashboard as your LLM calls.

1. Pick a model

Browse the live catalog at /gpu-recommend. Each model lists its slug, per-call or per-second credit cost, and the maximum execution time per call.

2. Sync invocation (small jobs)

Short-running models return the output inline.

curl -X POST https://api.hypereal.cloud/v1/gpu/sdxl \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a tabby cat astronaut"}}'

→ { "id": "...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../out.png"],
    "costCredits": 50,
    "durationMs": 4210 }

3. Async invocation (long jobs)

Long-running models queue and return a job id immediately with a 202. Poll, or wait for our cron + webhook poller to settle the job.

# Submit
POST /v1/gpu/wan-video
{ "input": { "prompt": "drone over Tokyo, neon, rain", "seconds": 5 } }
→ 202 { "id": "abc...", "status": "queued", "pollUrl": "/v1/gpu/jobs/abc..." }

# Poll
GET /v1/gpu/jobs/abc...
→ { "id": "abc...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../clip.mp4"],
    "costCredits": 312,
    "durationMs": 156000 }

Failed and timed-out jobs automatically refund the credit reservation. Per-second billing reconciles on completion using the model's reported execution time, capped at the model's maxSeconds.
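The settlement rule can be written out as a short function. This is a sketch: the status names other than succeeded and the per-second rate are assumptions, chosen so the numbers match the polling example above (156 s at 2 credits per second gives 312).

```typescript
// Credits finally charged for a per-second GPU job.
// Failed or timed-out jobs are refunded in full; otherwise the billed
// seconds are capped at the model's maxSeconds.
function settleGpuCredits(
  status: 'succeeded' | 'failed' | 'timed_out',
  durationMs: number,
  creditsPerSecond: number,
  maxSeconds: number,
): number {
  if (status !== 'succeeded') return 0;
  const billedSeconds = Math.min(durationMs / 1000, maxSeconds);
  return billedSeconds * creditsPerSecond;
}

console.log(settleGpuCredits('succeeded', 156000, 2, 600)); // 312
```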
