Hypereal API 参考
一个 ck_前缀的 API Key。OpenAI 兼容的 REST 接口。可直接接入 Claude Code、Codex CLI、Cursor、OpenAI SDK、Anthropic SDK,或者直接用 curl 调用。对话、图像、视频、音频、代码 Agent — 全部统一在一个 Base URL 下。
Enterprise API uses a separate managed API surface.
This page documents the standard API paths. For managed Enterprise API models, capacity controls, and insurance, use the Enterprise overview and Enterprise API docs.
01 · 90 秒上手
快速开始
申请 Key,将客户端指向 hypereal.cloud,立即上线。认证方式与请求结构均与 OpenAI 兼容 — 大多数 SDK 只需修改 Base URL 即可使用。
至少充值 $2(200 额度),并在 /manage-api-keys处创建 Key。Key 以 ck_开头。
基础 URL: https://hypereal.cloud/api/v1
认证头使用 Authorization: Bearer ck_...。请求体与你已经熟悉的 OpenAI 格式完全一致。
curl https://hypereal.cloud/api/v1/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "Say hi in one word."}]
}'import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.HYPEREAL_API_KEY, // ck_...
baseURL: 'https://hypereal.cloud/api/v1',
});
const completion = await client.chat.completions.create({
model: 'gpt-5.5',
messages: [{ role: 'user', content: 'Say hi in one word.' }],
});
console.log(completion.choices[0].message.content);For coding agents, start withclaude-sonnet-4-6and use Claude Code or another Anthropic-compatible client that sendscache_control. Hypereal supportscache_controlcaching and Hypereal Cache. Hypereal Cache is on by default and can sharply reduce token consumption for repeated coding-agent context. You can sethypereal.cacheto"auto"explicitly, or omit it for the same default.
SDK
Hypereal SDK
Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.
Published as hypereal-sdk on npm.
Use client.images.generate(), chat, responses, jobs and storage.
See the full SDK overview at /sdk.
pnpm add hypereal-sdk
import { Hypereal } from 'hypereal-sdk';
const client = new Hypereal({
apiKey: process.env.HYPEREAL_API_KEY!,
});
const image = await client.images.generate({
model: 'gemini-3-1-flash-t2i',
prompt: 'A cinematic portrait in neon light',
aspect_ratio: '16:9',
});
console.log(image);const object = await client.storage.uploadFile(file, {
filename: 'training-image.png',
contentType: 'image/png',
kind: 'dataset',
});
const listed = await client.storage.list({ kind: 'dataset' });02
身份认证
每次请求都需要 ck_ 前缀的 Key。我们接受三种请求头格式,覆盖所有 SDK。
Bearer ck_... — OpenAI SDK、Codex CLI、Cursor 使用此头。ck_... — Anthropic SDK 与 Claude Code 在 /v1/messages上使用此头。ck_... — Google Gemini SDK / 原生格式, /v1/gemini.?key=ck_... 也可使用。03 · OpenAI 兼容
Chat Completions
主力端点,使用 OpenAI Chat Completions 协议。适用于 GPT、Gemini、Qwen、DeepSeek、GLM 以及所有非 Anthropic 系大模型。
/api/v1/chat/completions请求体
/v1/messages 代替。role, content)。false。设为 true时返回 SSE 流;用量信息会随最后一个 chunk 一起返回。定价
按 token 计费,使用各模型的输入/输出单价。100 额度 = $1.00。调用此端点的最低余额为 200 额度($2.00)。
curl https://hypereal.cloud/api/v1/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [
{"role": "system", "content": "You are a terse assistant."},
{"role": "user", "content": "Two-line haiku about caches."}
],
"stream": true,
"max_tokens": 256
}'import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.HYPEREAL_API_KEY,
baseURL: 'https://hypereal.cloud/api/v1',
});
const stream = await client.chat.completions.create({
model: 'gpt-5.5',
stream: true,
messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}OpenAI 与同协议模型
gpt-5.5gpt-5.5-instantgpt-5.5-progpt-5.4gpt-5.4-minideepseek-v4-prodeepseek-v4-flashdeepseek-v3.2kimi-k2.6kimi-k2.5glm-5.1glm-5qwen3-maxqwen3.5-plusqwen3.5-flashMiniMax-M2.504 · Anthropic 兼容
Messages
Anthropic /v1/messages 协议,支持 extended thinking、多上游切换以及 15 秒 SSE 心跳。Claude Code、OpenCode、OpenClaw 以及官方 Anthropic SDK 都使用这个端点。
/api/v1/messages请求体
claude-sonnet-4-6, claude-opus-4-6, 或 claude-haiku-4-5。旧版 Anthropic ID(claude-sonnet-4-5-20250929, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022)会自动映射到对应的最新模型。system,tools, or text content blocks for Anthropic prompt caching. Hypereal defaults a cache breakpoint when omitted and reports cache usage in response metadata."auto" to make the default explicit for repeated requests, orfalse to bypass it for a request.budget_tokens 用于限制推理痕迹长度。端点会每 15 秒发送 SSE 心跳,避免长 thinking 流被代理超时关闭。curl https://api.hypereal.cloud/v1/messages \
-H "x-api-key: ck_..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"system": [{
"type": "text",
"text": "You are a senior TypeScript refactoring assistant.",
"cache_control": {"type": "ephemeral"}
}],
"messages": [
{"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
],
"hypereal": {"cache": "auto"}
}'import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
apiKey: process.env.HYPEREAL_API_KEY, // ck_...
baseURL: 'https://api.hypereal.cloud',
});
const msg = await client.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 1024,
system: [{
type: 'text',
text: 'You are a senior TypeScript refactoring assistant.',
cache_control: { type: 'ephemeral' },
}],
hypereal: { cache: 'auto' },
messages: [{ role: 'user', content: 'Hello, Claude.' }],
});
console.log(msg.content);Anthropic 模型
claude-opus-4-7claude-opus-4-6claude-sonnet-4-6claude-haiku-4-5managed-claude-opus-4-7-maxmanaged-claude-opus-4-6-maxmanaged-claude-opus-4-5-maxmanaged-claude-sonnet-4-6-maxmanaged-claude-sonnet-4-5-maxmanaged-claude-haiku-4-5-max05 · OpenAI Responses API
Responses
OpenAI 较新的 Responses API(Codex CLI 的 `wire_api = responses` 模式与 OpenAI Agents SDK 都使用)。认证方式与 chat/completions 一致;请求体使用 `input` 而非 `messages`。
/api/v1/responses说明
- Anthropic 模型会返回 400 — 应当使用
/v1/messages。 - 无论流式还是非流式,都按
response.usage.input_tokens/output_tokens计费。 - 部分上游始终返回 SSE — 即使
stream:false,端点也会自动识别并透传流式响应。 - 支持多上游切换。请将客户端超时设到 300 秒以上。
curl https://hypereal.cloud/api/v1/responses \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-codex",
"input": "Write a TypeScript function that debounces a callback.",
"stream": true
}'import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.HYPEREAL_API_KEY,
baseURL: 'https://hypereal.cloud/api/v1',
});
const response = await client.responses.create({
model: 'gpt-5.3-codex',
input: 'Refactor this file into smaller modules.',
});
console.log(response.output_text);Codex 优化模型
gpt-5.3-codexgpt-5.3-codex-spark06 · Codex CLI / Codex Desktop
Codex CLI
Codex 把 `wire_api = responses` 的 provider 指向 /api/v1/responses。CLI 会自动在 Base URL 后追加 `/responses`,按下方示例配置 Base URL 即可。
/api/v1/responses# ~/.codex/config.toml model_provider = "hypereal" model = "gpt-5.3-codex" [model_providers.hypereal] name = "Hypereal" base_url = "https://hypereal.cloud/api/v1" wire_api = "responses" env_key = "HYPEREAL_API_KEY"
随后导出你的 Key:export HYPEREAL_API_KEY=ck_...
像往常一样运行 codex 。Codex 发出的所有内容 — 完整推理流、工具调用、文件编辑 — 都会原样代理。计费基于标准的 input_tokens / output_tokens 用量块。
OpenCode、Claude Code(使用 /v1/messages)、Cursor(使用 /v1/chat/completions)以及 Gemini CLI(使用 /v1/gemini)配置方式相同。
07
图像生成
OpenAI 兼容的 /images/generations 协议。同步调用 — 上游完成后,端点直接返回图片 URL(或 base64)。按图计费;`n` 限制在 1–10。
/api/v1/images/generations请求体
image, reference_images)传入参考图。1024x1024, 1536x1024。具体支持取决于上游。creditsPerGeneration × n,端点将返回 402。gpt-image-2, nano_banana_pro, and gemini-3-1-flash-t2i. Use gpt-5.5 only with chat, messages, or responses endpoints.curl https://hypereal.cloud/api/v1/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "nano_banana_pro",
"prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
"n": 1,
"size": "1024x1024"
}'const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'nano_banana_pro',
prompt: 'a chrome teapot floating over the ocean at sunset',
n: 1,
}),
});
const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the modelGPT Image 2 — text-to-image & image-to-image
Use the same /api/v1/images/generations endpoint with "model": "gpt-image-2". Pass an array of public image URLs in reference_images to switch from pure text-to-image to image-conditioned generation (edits, restyles, character consistency).
sizeaccepts1024x1024,1536x1024(landscape),1024x1536(portrait),2048x2048,4096x4096. 2K and 4K are square only.- Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
- Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
- Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.
# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "a chrome teapot floating over the ocean at sunset",
"size": "1536x1024"
}'
# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2",
"prompt": "same character, snowy mountain background, golden hour",
"size": "1024x1024",
"reference_images": [
"https://example.com/source.jpg"
]
}'NanoBanana 2 — image-to-image & multimodal inputs
Model id gemini-3-1-flash-t2i (NanoBanana 2). Pass references in image_urls to switch into image-to-image / multi-reference mode. Up to 4 reference images, blended in prompt order. Use the standard aspect_ratio field — landscape, portrait, and square are all supported at every resolution tier.
- Supported
aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9. - Supported
resolution: 0.5K, 1K, 2K, 4K. - Reference images may be public HTTPS URLs or base64 data URLs.
- Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.
# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-1-flash-t2i",
"prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
"aspect_ratio": "16:9",
"resolution": "2K",
"image_urls": [
"https://example.com/character.png",
"https://example.com/jacket.png",
"https://example.com/scene.png"
]
}'图像模型
gpt-image-2gpt-4o-imagenano_banananano_banana_2gemini-3.1-flash-image-previewgemini-2.5-flash-image-previewflux-kontext-proflux-2-prodoubao-seedream-4-0doubao-seedream-4-5doubao-seedream-5-0gemini-3.1-flash-image-preview-officialflux-kontext-maxgemini-2.5-flash-image-officialnano_banana_progemini-3-pro-image-previewflux-2-flexgemini-3-pro-image-preview-officialgemini-3-pro-image-preview-4Kgemini-3.1-fast-imagengemini-3.1-thinking-imagen08 · 长任务
视频生成
异步视频端点 — 先创建任务,再轮询返回的任务 URL,直到视频就绪。多数模型按秒计费;Gemini Omni Flash、Veo、Vidu、Grok 等模型按段计费。
/api/v1/videos/generate请求体
per_second 类模型有效。16:9, 9:16, 1:1。具体支持取决于上游。Gemini Omni Flash accepts 16:9 or 9:16.720P.last_image_url 或 image — 详见对应模型的上游文档。curl https://hypereal.cloud/api/v1/videos/generate \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gemini_omni_flash",
"prompt": "a white cube rotating on a black background, clean product demo",
"duration": 6,
"aspect_ratio": "16:9",
"resolution": "720P",
"image_urls": [
"https://example.com/product-reference.png"
]
}'const res = await fetch('https://hypereal.cloud/api/v1/videos/generate', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gemini_omni_flash',
prompt: 'a cat walking on the moon, cinematic, no text',
duration: 6,
aspect_ratio: '16:9',
resolution: '720P',
image_urls: ['https://example.com/cat-reference.png'],
}),
});
const data = await res.json();
console.log(data.jobId, data.pollUrl); // poll /v1/jobs/{id} for the mp4视频模型
happyhorse-1.0gemini_omni_flashwan2.6-flashkling-2-6MiniMax-Hailuo-02doubao-seedance-1-0-pro-fastMiniMax-Hailuo-2.3wan2.6kling-video-o1kling-v3-omnikling-v3kling-v3-videodoubao-seedance-1-0-pro-qualitydoubao-seedance-2-0doubao-seedance-2-0-fastdoubao-seedance-1-5-proVeo3.1-fast-officialVeo3.1-quality-officialveo3.1-fastveo3.1-qualityvidu-q3-progrok-video-309 · Fish Audio
音频 — TTS、声音克隆、ASR
三个模型 ID 共用一个端点。请求体和响应结构取决于你调用的模型。提供方为 Fish Audio(直连,未走 ToAPI),按次计费。
/api/v1/audio/generationsaudio-tts 和 audio-clone为必填。audio-asr (输入)和 audio-clone (参考音频 ≥ 10 秒)为必填。data: [{ url }] 用于 TTS / 克隆, text (外加可选的 segments, duration)用于 ASR。curl https://hypereal.cloud/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-tts",
"text": "Welcome to Hypereal. One key, every model.",
"voice_id": "en_male_calm"
}'curl https://hypereal.cloud/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-clone",
"text": "This is my cloned voice.",
"audio": "https://example.com/reference-30s.mp3"
}'curl https://hypereal.cloud/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-asr",
"audio": "https://example.com/recording.mp3"
}'音频模型
audio-ttsaudio-cloneaudio-asr10 · Google 原生协议
Gemini
同一端点同时支持 Gemini 原生协议(`contents` / `generationConfig` / `systemInstruction`)和 OpenAI 协议。端点会在内部转换为 OpenAI 格式后再转发。对于多数代码而言,使用 /v1/chat/completions 加 Gemini 模型 ID 更简单。
/api/v1/geminitemperature, maxOutputTokens等。contents的替代选项。认证头: x-goog-api-key: ck_..., ?key=ck_...或 Authorization: Bearer ck_... 都可以。
curl "https://hypereal.cloud/api/v1/gemini" \
-H "x-goog-api-key: ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3.5-thinking",
"contents": [
{"role": "user", "parts": [{"text": "Outline a launch plan."}]}
],
"generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
}'// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
method: 'POST',
headers: {
'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gemini-3.5-fast',
contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
}),
});
console.log(await res.json());Gemini 模型
gemini-3.5-thinkinggemini-3.5-fastgemini-3.1-pro-previewgemini-3-pro-previewgemini-3-flash-preview11
错误与速率限制
所有错误都返回 '{ error: { type, message } }' 形式的 JSON。速率限制按用户维度计算,不按 Key — 多个 Key 共享同一配额。
ck_ 前缀)、过期或被停用。X-RateLimit-Limit, X-RateLimit-Remaining和 X-RateLimit-Reset 响应头会在限流响应中返回。model、模型 ID 未知(响应中会包含 available_models),或端点与请求格式不匹配(例如把 Anthropic 模型发到了 /chat/completions上)。DEVELOPER
ComfyUI as API
Deploy a ComfyUI container as a Hypereal-managed GPU endpoint. Same per-second billing, auto-scaling, webhook delivery as any other deployment — you control the workflow graph and the model weights.
/comfy workflow-JSON paster and /v1/comfy/* routes were retired. ComfyUI now ships as a regular Deployment — you bring a Docker image (e.g. runpod/worker-comfyui or your own), we mount it on real GPUs./v1/gpu/run/{slug}Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.
curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
-H "Authorization: Bearer $HYPEREAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input": {
"prompt": "a cinematic portrait of an astronaut",
"seed": 42,
"workflow_overrides": { "Sampler.steps": 30 }
}
}'{
"job_id": "K3uA7Pq9xLm4",
"status": "queued",
"provider_job_id": "..."
}/v1/gpu/jobs/{id}Poll for status. We live-poll the worker on each request so you see queued → running → succeeded in near real time. On succeeded credits settle to the actual GPU-seconds; on failed we refund the hold. Pin a webhookUrl on the deployment to skip polling.
{
"job_id": "K3uA7Pq9xLm4",
"status": "succeeded",
"output": { "images": ["data:image/png;base64,..."] },
"executionMs": 18420,
"creditsCharged": 56
}# List
curl https://hypereal.cloud/v1/deployments \
-H "Authorization: Bearer $HYPEREAL_API_KEY"
# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
-H "Authorization: Bearer $HYPEREAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"slug": "my-comfy-workflow",
"name": "My Comfy",
"dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
"gpuTypes": "ADA_48_PRO,AMPERE_80"
}'Open /infra/deployments/new: pick a GPU tier, point at your ComfyUI Docker image (custom builds with your weights and custom nodes pre-baked work fine), set min/max workers and idle timeout. Your endpoint goes live in 60s.
Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.
ENTERPRISE
Gateway features
Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.
Spend, by model, in real time
Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:
GET /api/api-usage/export?days=30 Authorization: session cookie → hypereal-usage-2026-05-10.csv
Per-key monthly cap, with email guardrails
Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.
POST /api/api-keys
{
"name": "prod-eu",
"spendingLimit": 50000 // 500 USD / month
}Every call, searchable
Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:
GET /api/api-usage?days=30&limit=1000
{
logs: [...],
costByModel: [...],
topExpensiveRequests: [...]
}Outages don't reach your users
Every supported model has a fallback chain. On 5xx, timeout, or 429 we transparently retry the next provider with exponential backoff. You always get a result or a single, clean error — never a flap.
primary: seedance-2-0-turbo-t2v (region us-east) fallback: seedance-2-0-t2v (region us-west) fallback: seedance-2-0 (region eu-central) retries: 1 per target, exp backoff
Pick by intent, we pick the cheapest qualified model
Send intent instead of model and we'll route to the cheapest provider in that capability bucket — without giving up determinism: pin a model whenever you want and we'll honor it exactly.
POST /v1/images/generate
{
"intent": "text-to-image-fast", // ← we'll pick the cheapest qualified model
"prompt": "a quiet sunrise over Mt Fuji"
}
# Or pin explicitly:
{ "model": "nano-banana-t2i", "prompt": "..." }SERVERLESS
GPU models
Hosted serverless GPU inference at /v1/gpu/{slug}. One API key, credit billing, audit log, and webhooks. Same wallet and dashboard as your LLM calls.
1. Pick a model
Browse the live catalog at /gpu-recommend. Each model lists its slug, per-call or per-second credit cost, and the maximum execution time per call.
2. Sync invocation (small jobs)
Short-running models return the output inline.
curl -X POST https://api.hypereal.cloud/v1/gpu/sdxl \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{"input": {"prompt": "a tabby cat astronaut"}}'
→ { "id": "...",
"status": "succeeded",
"outputs": ["https://cdn.hypereal.cloud/gpu/.../out.png"],
"costCredits": 50,
"durationMs": 4210 }3. Async invocation (long jobs)
Long-running models queue and return a job id immediately with a 202. Poll, or wait for our cron + webhook poller to settle the job.
# Submit
POST /v1/gpu/wan-video
{ "input": { "prompt": "drone over Tokyo, neon, rain", "seconds": 5 } }
→ 202 { "id": "abc...", "status": "queued", "pollUrl": "/v1/gpu/jobs/abc..." }
# Poll
GET /v1/gpu/jobs/abc...
→ { "id": "abc...",
"status": "succeeded",
"outputs": ["https://cdn.hypereal.cloud/gpu/.../clip.mp4"],
"costCredits": 312,
"durationMs": 156000 }Failed and timed-out jobs auto-refund the credit reservation. Per-second billing reconciles on completion using the model's reported execution time, capped at the model'smaxSeconds.
ENTERPRISE
Teams, RBAC & SSO
Organizations, five built-in roles, SAML and OIDC single sign-on. Built so security and procurement can sign off without a custom rider.
Org-scoped keys, audit log, billing
Every API key, webhook, ComfyUI workflow, and GPU template can belong to an organization instead of an individual. Teammates share one budget, one audit trail, and one invoice. Personal keys keep working alongside.
POST /api/orgs
{
"name": "Acme Inc"
}
→ { id, slug, role: "owner" }Owner · Admin · Developer · Billing · Viewer
- Owner — everything, including delete-org
- Admin — manage members, keys, SSO, webhooks
- Developer — create/delete API keys, manage workflows + GPUs
- Billing — view + manage payments and audit log
- Viewer — read-only access to keys, billing, audit
Configure your IdP in 3 steps
- Create a SAML app in Okta / Azure AD / Auth0 / Google.
- Set ACS URL to
https://hypereal.cloud/api/auth/sso/<providerId> - Paste the IdP metadata XML into /settings/organization → SSO.
Set the email-domain claim (e.g. acme.com) and the login form will auto-route corporate emails to your IdP — no password prompt.
Issuer + client credentials
Drop in your issuer URL, client id, and client secret. We fetch the/.well-known/openid-configuration on save and surface a green check when the IdP is reachable.
POST /api/orgs/{id}/sso
{
"type": "oidc",
"issuer": "https://idp.acme.com",
"clientId": "...",
"clientSecret": "...",
"domain": "acme.com"
}12
定价与额度
统一单位:100 额度 = $1.00 USD。LLM 按 token 计费,使用各模型的输入/输出单价。媒体模型按张、按秒或按段计费。
大型语言模型
Token × 每百万 token 单价。流式请求按最后一个 usage chunk 计费。
图像
按生成次数 × 实际返回的 n 数量计费。
视频与音频
按秒(多数视频)、按段(Veo、Vidu、Grok)或按次(Fish Audio)计费。
Claude、GPT、Gemini 和部分图像模型(GPT Image 2、Nano Banana)的价格低于官方直连。视频、音频及其他媒体模型按标准价格计费。

