v1 · Stable · Claude / GPT / Gemini under direct

Hypereal API Reference

One ck_-prefixed API key. OpenAI-compatible REST. Drop into Claude Code, Codex CLI, Cursor, the OpenAI SDK, the Anthropic SDK, or call it directly with curl. Chat, images, video, audio, code agents — all behind one base URL.


On this page

Quickstart · Hypereal SDK · Authentication · Chat Completions · Messages (Anthropic) · Responses · Codex CLI · Image generation · Video generation · Audio (TTS / clone / ASR) · Gemini · Errors & rate limits · ComfyUI as API · Gateway features · GPU passthrough · Pricing & credits

01 · Get started in 90s

Quickstart

Mint a key, point your client at hypereal.cloud, ship. Auth and request shapes are OpenAI-compatible — most SDKs work by changing only the base URL.

1. Get a key

Top up at least $2 (200 credits) and create a key at /manage-api-keys. Keys start with ck_.

2. Point your client

Base URL: https://hypereal.cloud/api/v1

3. Send a request

Auth header is Authorization: Bearer ck_.... Same OpenAI request bodies you already know.

curl (bash)
curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Say hi in one word."}]
  }'
Node: OpenAI SDK (ts)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://hypereal.cloud/api/v1',
});

const completion = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Say hi in one word.' }],
});

console.log(completion.choices[0].message.content);

SDK

Hypereal SDK

Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.

Install

Published as hypereal-sdk on npm.

Resources

Use client.images.generate(), chat, responses, jobs and storage.

Landing page

See the full SDK overview at /sdk.

Install (bash)
pnpm add hypereal-sdk
Quickstart (ts)
import { Hypereal } from 'hypereal-sdk';

const client = new Hypereal({
  apiKey: process.env.HYPEREAL_API_KEY!,
});

const image = await client.images.generate({
  model: 'gemini-3-1-flash-t2i',
  prompt: 'A cinematic portrait in neon light',
  aspect_ratio: '16:9',
});

console.log(image);
Storage upload (ts)
const object = await client.storage.uploadFile(file, {
  filename: 'training-image.png',
  contentType: 'image/png',
  kind: 'dataset',
});

const listed = await client.storage.list({ kind: 'dataset' });

02

Authentication

Every request needs a ck_-prefixed key. Three accepted header forms cover all SDKs.

  • Authorization (header, required): Bearer ck_... Used by the OpenAI SDK, Codex CLI, and Cursor.
  • x-api-key (header, required): ck_... Used by the Anthropic SDK and Claude Code on /v1/messages.
  • x-goog-api-key (header, required): ck_... The Google Gemini SDK / native shape, accepted by /v1/gemini. The query parameter ?key=ck_... also works.
Keys are bound to a user. They count toward per-key spending caps you can set in /manage-api-keys. Rate limits are evaluated per user, not per key.
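As a rough sketch of how the three header forms map onto client styles, the helper below returns the header for a given wire format. The `WireFormat` names and the `authHeaders` function are illustrative only and not part of any Hypereal SDK; the header names and the ck_ prefix are the only parts taken from the list above.

```typescript
// Hypothetical helper: picks the auth header that matches the client style.
type WireFormat = 'openai' | 'anthropic' | 'gemini';

function authHeaders(format: WireFormat, apiKey: string): Record<string, string> {
  // Every Hypereal key carries the ck_ prefix, so anything else is a mistake.
  if (!apiKey.startsWith('ck_')) {
    throw new Error('Hypereal keys start with ck_');
  }
  switch (format) {
    case 'openai':
      return { Authorization: `Bearer ${apiKey}` };
    case 'anthropic':
      return { 'x-api-key': apiKey };
    case 'gemini':
      return { 'x-goog-api-key': apiKey };
  }
}

console.log(authHeaders('anthropic', 'ck_example'));
```

Rejecting a key without the prefix locally saves a round trip that would otherwise end in a 401.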

03 · OpenAI-compatible

Chat Completions

The workhorse endpoint. OpenAI Chat Completions wire format. Used for GPT, Gemini, Qwen, DeepSeek, GLM, and every other non-Anthropic LLM.

POST/api/v1/chat/completions

Request body

  • model (string, required): Any non-Anthropic model ID; see the table below. Anthropic models return a 400; use /v1/messages instead.
  • messages (Message[], required): Standard OpenAI message array (role, content).
  • stream (boolean, optional): Defaults to false. SSE stream when true; usage is included in the final chunk.
  • max_tokens (number, optional): Forwarded to upstream. Provider-specific defaults apply.
  • temperature, top_p, tools, … (any, optional): Other OpenAI params pass through unchanged.

Pricing

Billed per token using each model's input/output rate. 100 credits = $1.00. The minimum balance to call the endpoint is 200 credits ($2.00).
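To make the arithmetic concrete, here is a minimal sketch of a per-call estimate built from the per-MTok rates and the 100 credits = $1.00 conversion. The `TokenRate` shape and the `estimateCredits` name are hypothetical, and the gateway's rounding behaviour is not documented here, so the result is left unrounded.

```typescript
// Rates are USD per million tokens, copied from the pricing tables on this page.
interface TokenRate {
  inputPerMTok: number;
  outputPerMTok: number;
}

const CREDITS_PER_USD = 100; // 100 credits = $1.00

// Estimated cost of one call in credits (unrounded; rounding policy is an unknown).
function estimateCredits(rate: TokenRate, inputTokens: number, outputTokens: number): number {
  const usd =
    (inputTokens / 1_000_000) * rate.inputPerMTok +
    (outputTokens / 1_000_000) * rate.outputPerMTok;
  return usd * CREDITS_PER_USD;
}

// gpt-5.5 is listed at $0.250 input / $1.45 output per MTok.
const gpt55: TokenRate = { inputPerMTok: 0.25, outputPerMTok: 1.45 };
console.log(estimateCredits(gpt55, 12_000, 800).toFixed(3));
```

With those rates, 12,000 input tokens and 800 output tokens come to about $0.00416, or roughly 0.416 credits.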

curl: streaming (bash)
curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "Two-line haiku about caches."}
    ],
    "stream": true,
    "max_tokens": 256
  }'
Node: OpenAI SDK streaming (ts)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
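If you call the endpoint without an SDK, the stream has to be parsed by hand. The sketch below collects the text deltas and picks up the usage block from the last data chunk. It assumes the standard OpenAI SSE framing (`data: {...}` lines ending with `data: [DONE]`) and the usual `prompt_tokens` / `completion_tokens` field names; the sample payload is made up for illustration, and `collectStream` is a hypothetical helper.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface StreamResult {
  text: string;
  usage: Usage | null;
}

// Parses a fully buffered SSE body. Lines that are not "data:" lines
// (blank separators, ": ping" comments) are skipped.
function collectStream(sseBody: string): StreamResult {
  let text = '';
  let usage: Usage | null = null;
  for (const rawLine of sseBody.split('\n')) {
    const line = rawLine.trim();
    if (!line.startsWith('data:')) continue;
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') break;
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? '';
    if (chunk.usage) usage = chunk.usage; // expected on the final chunk
  }
  return { text, usage };
}

// Illustrative transcript, not captured from a real response.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: {"choices":[],"usage":{"prompt_tokens":9,"completion_tokens":2,"total_tokens":11}}',
  '',
  'data: [DONE]',
].join('\n');

console.log(collectStream(sample));
```

Real streams arrive in pieces, so a production parser also has to buffer partial lines between reads; the SDK loop above already handles that.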

OpenAI & provider-compatible models

| Model ID | Label | Input / Output |
| --- | --- | --- |
| gpt-5 | GPT-5 · OpenAI | $0.440 / $3.45 per MTok |
| gpt-5.1 | GPT-5.1 · OpenAI | $0.440 / $3.45 per MTok |
| gpt-5.2 | GPT-5.2 · OpenAI | $0.610 / $4.83 per MTok |
| gpt-5.3 | GPT-5.3 · OpenAI | $0.100 / $0.390 per MTok |
| gpt-5.4 | GPT-5.4 · OpenAI | $0.130 / $0.730 per MTok |
| gpt-5.5 | GPT-5.5 · OpenAI | $0.250 / $1.45 per MTok |
| gpt-5.5-instant | GPT-5.5 Instant · OpenAI | $0.250 / $1.45 per MTok |
| gpt-5.5-pro | GPT-5.5 Pro · OpenAI | $1.45 / $8.70 per MTok |
| gpt-5.4-mini | GPT-5.4 Mini · OpenAI | $0.040 / $0.220 per MTok |
| gpt-5.4-nano | GPT-5.4 Nano · OpenAI | $0.010 / $0.070 per MTok |
| gpt-5.4-official | GPT-5.4 (Official) · OpenAI | $2.30 / $13.80 per MTok |
| gpt-5.4-pro-official | GPT-5.4 Pro (Official) · OpenAI | $27.60 / $165.60 per MTok |
| gpt-5.2-official | GPT-5.2 (Official) · OpenAI | $1.61 / $12.88 per MTok |
| gpt-5-pro-official | GPT-5 Pro (Official) · OpenAI | $13.80 / $110.40 per MTok |
| gpt-realtime-1.5-official | GPT Realtime 1.5 (Official) · OpenAI | $3.68 / $14.72 per MTok |
| gpt-audio-1.5-official | GPT Audio 1.5 (Official) · OpenAI | $2.30 / $9.20 per MTok |
| glm-5 | GLM-5 · Zhipu AI | $0.460 / $2.07 per MTok |
| qwen3.5-plus | Qwen 3.5 Plus · Alibaba | $0.460 / $2.76 per MTok |
| qwen3.5-flash | Qwen 3.5 Flash · Alibaba | $0.140 / $1.38 per MTok |
| qwen3-max | Qwen 3 Max · Alibaba | $0.810 / $3.22 per MTok |
| deepseek-v3.2 | DeepSeek V3.2 · DeepSeek | $0.460 / $1.84 per MTok |
| kimi-k2.5 | Kimi K2.5 · Moonshot | $0.460 / $2.42 per MTok |
| MiniMax-M2.5 | MiniMax M2.5 · MiniMax | $0.250 / $0.970 per MTok |
| nano-banana-2 | Nano Banana 2 · Nano Banana | $0.010 / $0.010 per MTok |

04 · Anthropic-compatible

Messages

Anthropic /v1/messages wire format with extended thinking, multi-upstream failover, and 15-second SSE keepalives. Use this for Claude Code, OpenCode, OpenClaw, and the official Anthropic SDK.

POST/api/v1/messages

Request body

  • model (string, required): claude-opus-4-6, claude-sonnet-4-6, or claude-haiku-4-5. Older Anthropic IDs (claude-sonnet-4-5-20250929, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022) auto-alias to the latest equivalents.
  • messages (Message[], required): Anthropic-format messages, including image and tool_use blocks.
  • max_tokens (number, required): Required by the Anthropic spec.
  • thinking ({ type: "enabled" | "adaptive", budget_tokens?: number }, optional): Extended thinking. budget_tokens caps the reasoning trace. The endpoint sends 15s SSE pings to keep proxies from closing long thinking streams.
  • stream, system, tools, … (any, optional): Pass through as in the Anthropic SDK.
On retry to a failover upstream, stale thinking blocks with invalid signatures are filtered automatically — you don't have to handle that.
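A small pre-flight check can catch malformed thinking settings before they cost a round trip. The sketch below applies the limits from Anthropic's published Messages spec (budget_tokens of at least 1024 and strictly less than max_tokens). Whether the gateway enforces these itself or leaves it to the upstream is an assumption, and `checkMessagesRequest` is a hypothetical helper, not an SDK function.

```typescript
interface Thinking {
  type: 'enabled' | 'adaptive';
  budget_tokens?: number;
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  messages: { role: 'user' | 'assistant'; content: string }[];
  thinking?: Thinking;
}

// Returns a list of problems; an empty list means the body looks sendable.
function checkMessagesRequest(req: MessagesRequest): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(req.max_tokens) || req.max_tokens < 1) {
    problems.push('max_tokens is required and must be a positive integer');
  }
  const budget = req.thinking?.budget_tokens;
  if (budget !== undefined) {
    if (budget < 1024) problems.push('budget_tokens must be at least 1024');
    if (budget >= req.max_tokens) problems.push('budget_tokens must be less than max_tokens');
  }
  return problems;
}

console.log(
  checkMessagesRequest({
    model: 'claude-opus-4-6',
    max_tokens: 8192,
    messages: [{ role: 'user', content: 'Plan a 3-step refactor.' }],
    thinking: { type: 'enabled', budget_tokens: 4000 },
  }),
);
```

The thinking budget counts against max_tokens, so leave headroom above budget_tokens for the visible answer.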
curl: extended thinking (bash)
curl https://hypereal.cloud/api/v1/messages \
  -H "x-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 8192,
    "messages": [
      {"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
    ],
    "thinking": {"type": "enabled", "budget_tokens": 4000}
  }'
Node: Anthropic SDK (ts)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://hypereal.cloud/api/v1',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello, Claude.' }],
});

console.log(msg.content);

Anthropic models

| Model ID | Label | Input / Output |
| --- | --- | --- |
| claude-opus-4-6 | Claude Opus 4.6 · Anthropic | $1.73 / $8.63 per MTok |
| claude-sonnet-4-6 | Claude Sonnet 4.6 · Anthropic | $1.04 / $5.18 per MTok |
| claude-haiku-4-5 | Claude Haiku 4.5 · Anthropic | $0.350 / $1.73 per MTok |

05 · OpenAI Responses API

Responses

OpenAI's newer Responses API (used by Codex CLI's `wire_api = responses` mode and the OpenAI Agents SDK). Same auth as chat/completions; the request body uses `input` instead of `messages`.

POST/api/v1/responses

Notes

  • Anthropic models return a 400 — they belong on /v1/messages.
  • Streaming and non-streaming are both billed off response.usage.input_tokens / output_tokens.
  • Some upstreams always emit SSE — the endpoint detects this and streams through transparently even if stream:false.
  • Multi-upstream failover. Set a long client timeout (300s+).
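Because Anthropic models are rejected with a 400 on both /chat/completions and /responses, a client that accepts arbitrary model IDs needs some routing step. The sketch below routes by model-ID prefix. The prefix heuristics (`claude-` for Anthropic, `-codex` for Codex-tuned models) are assumptions drawn from the IDs listed on this page, and `endpointFor` is a hypothetical helper.

```typescript
type Endpoint =
  | '/api/v1/messages'
  | '/api/v1/responses'
  | '/api/v1/chat/completions';

// Picks the endpoint path for a model ID. `preferResponses` opts non-Anthropic
// models into the Responses API (for example when using the Agents SDK).
function endpointFor(model: string, preferResponses = false): Endpoint {
  if (model.startsWith('claude-')) return '/api/v1/messages';
  if (preferResponses || model.includes('-codex')) return '/api/v1/responses';
  return '/api/v1/chat/completions';
}

console.log(endpointFor('claude-sonnet-4-6'));
console.log(endpointFor('gpt-5.3-codex'));
console.log(endpointFor('gpt-5.5'));
```

Remember that the request body differs per endpoint as well: /responses takes `input`, while the other two take `messages`.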
curl (bash)
curl https://hypereal.cloud/api/v1/responses \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1-codex",
    "input": "Write a TypeScript function that debounces a callback.",
    "stream": true
  }'
Node: OpenAI SDK responses.create (ts)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const response = await client.responses.create({
  model: 'gpt-5.3-codex',
  input: 'Refactor this file into smaller modules.',
});

console.log(response.output_text);

Codex-tuned models

| Model ID | Label | Input / Output |
| --- | --- | --- |
| gpt-5-codex | GPT-5 Codex · OpenAI | $0.440 / $3.45 per MTok |
| gpt-5-codex-mini | GPT-5 Codex Mini · OpenAI | $0.090 / $0.690 per MTok |
| gpt-5.1-codex | GPT-5.1 Codex · OpenAI | $0.440 / $3.45 per MTok |
| gpt-5.1-codex-mini | GPT-5.1 Codex Mini · OpenAI | $0.090 / $0.690 per MTok |
| gpt-5.1-codex-max | GPT-5.1 Codex Max · OpenAI | $0.440 / $3.45 per MTok |
| gpt-5.2-codex | GPT-5.2 Codex · OpenAI | $0.610 / $4.83 per MTok |
| gpt-5.3-codex | GPT-5.3 Codex · OpenAI | $0.610 / $4.83 per MTok |
| gpt-5.3-codex-spark | GPT-5.3 Codex Spark · OpenAI | $0.610 / $4.83 per MTok |
| gpt-5.3-codex-official | GPT-5.3 Codex (Official) · OpenAI | $1.61 / $12.88 per MTok |

06 · Codex CLI / Codex Desktop

Codex CLI

Codex points its `wire_api = responses` provider at /api/v1/responses. The CLI appends `/responses` to the base URL, so set base_url to https://hypereal.cloud/api/v1 as shown.

POST/api/v1/responses
~/.codex/config.toml (toml)
# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5.3-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.cloud/api/v1"
wire_api = "responses"
env_key = "HYPEREAL_API_KEY"

Then export your key:
export HYPEREAL_API_KEY=ck_...

Run codex as usual. Anything Codex sends — full reasoning streams, tool calls, file edits — proxies through unchanged. Billing keys off the standard input_tokens / output_tokens usage block.

The same setup works for OpenCode, Claude Code (use /v1/messages), Cursor (use /v1/chat/completions), and the Gemini CLI (use /v1/gemini).

07

Image generation

OpenAI-compatible /images/generations shape. Synchronous — the endpoint returns image URLs (or base64) when the upstream finishes. Billed per image; `n` is clamped to 1–10.

POST/api/v1/images/generations

Request body

  • model (string, required): Image model ID; see the table.
  • prompt (string, required): Text prompt. For edit-capable models, include reference images via the model's native param (e.g. image, reference_images).
  • n (number, optional): Number of images, 1–10 (default 1).
  • size (string, optional): Forwarded as-is, e.g. 1024x1024, 1536x1024. Provider-dependent.
  • quality, style, … (any, optional): Additional params pass through to the upstream.
Tier requirement: image generation needs Starter tier ($19.99+ cumulative top-up). If your balance can't cover the estimated creditsPerGeneration × n, the endpoint returns 402.
curl (bash)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana_pro",
    "prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
    "n": 1,
    "size": "1024x1024"
  }'
Node: fetch (ts)
const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3-pro-image-preview',
    prompt: 'a chrome teapot floating over the ocean at sunset',
    n: 1,
  }),
});

const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model
Model

GPT Image 2 — text-to-image & image-to-image

Use the same /api/v1/images/generations endpoint with "model": "gpt-image-2". Pass an array of public image URLs in reference_images to switch from pure text-to-image to image-conditioned generation (edits, restyles, character consistency).

  • size accepts 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), 2048x2048, 4096x4096. 2K and 4K are square only.
  • Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
  • Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
  • Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.
# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "a chrome teapot floating over the ocean at sunset",
    "size": "1536x1024"
  }'

# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "same character, snowy mountain background, golden hour",
    "size": "1024x1024",
    "reference_images": [
      "https://example.com/source.jpg"
    ]
  }'
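The gpt-image-2 constraints listed above (allowed sizes, at most 4 references, public HTTPS URLs only) are easy to check client-side before spending a request. The following is a minimal sketch; `checkGptImage2` and the `GptImage2Request` shape are hypothetical names, and the gateway may validate more than this.

```typescript
// Sizes documented for gpt-image-2; 2K and 4K are square only.
const GPT_IMAGE_2_SIZES = ['1024x1024', '1536x1024', '1024x1536', '2048x2048', '4096x4096'];
const MAX_REFERENCES = 4;

interface GptImage2Request {
  prompt: string;
  size?: string;
  reference_images?: string[];
}

// Returns a list of problems; an empty list means the request matches the documented limits.
function checkGptImage2(req: GptImage2Request): string[] {
  const problems: string[] = [];
  if (req.size !== undefined && !GPT_IMAGE_2_SIZES.includes(req.size)) {
    problems.push(`unsupported size: ${req.size}`);
  }
  const refs = req.reference_images ?? [];
  if (refs.length > MAX_REFERENCES) {
    problems.push(`too many reference images: ${refs.length}`);
  }
  for (const ref of refs) {
    // base64 data URLs are not accepted by this model, only public HTTPS URLs.
    if (!ref.startsWith('https://')) {
      problems.push(`reference is not an HTTPS URL: ${ref.slice(0, 40)}`);
    }
  }
  return problems;
}

console.log(checkGptImage2({ prompt: 'a chrome teapot', size: '2048x1024' }));
```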
Model

NanoBanana 2 — image-to-image & multimodal inputs

Model id gemini-3-1-flash-t2i (NanoBanana 2). Pass references in image_urls to switch into image-to-image / multi-reference mode. Up to 4 reference images, blended in prompt order. Use the standard aspect_ratio field — landscape, portrait, and square are all supported at every resolution tier.

  • Supported aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9.
  • Supported resolution: 0.5K, 1K, 2K, 4K.
  • Reference images may be public HTTPS URLs or base64 data URLs.
  • Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.
# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-1-flash-t2i",
    "prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "image_urls": [
      "https://example.com/character.png",
      "https://example.com/jacket.png",
      "https://example.com/scene.png"
    ]
  }'
Hosting

Calling the API from a subdomain on shared hosting

No special setup is required. Our API accepts requests from any origin; there are no domain allowlists by default. A few things catch shared-host users out, though:

  • Make API calls from your server, not the browser. Calling the API directly from client-side JavaScript would expose your ck_… key to every visitor. Always proxy through your own backend (PHP, Node, Python — whatever your subdomain runs).
  • Set a generous request timeout. Image and video calls can hold the connection open up to ~120 s (image) or ~300 s (video). Many shared hosts cap PHP/cURL at 30 s by default — raise max_execution_time, CURLOPT_TIMEOUT, and your reverse-proxy / FastCGI read timeout.
  • Lock keys to your subdomain (optional). In the dashboard you can scope an API key to a specific Origin or IP — recommended if your subdomain handles untrusted traffic.
  • Use HTTPS. Some shared-hosting subdomains default to HTTP — outbound HTTPS is required to reach the API.
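<?php
// Minimal PHP server-side proxy (drop into /api/generate.php).
// Forwards the JSON request body to Hypereal so the ck_ key never reaches the browser.
$body = file_get_contents('php://input');
$ch = curl_init('https://hypereal.cloud/api/v1/images/generations');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_TIMEOUT        => 180,    // raise above shared-host default
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => $body,
  CURLOPT_HTTPHEADER     => [
    'Authorization: Bearer ' . getenv('HYPEREAL_API_KEY'),
    'Content-Type: application/json',
  ],
]);
$response = curl_exec($ch);
if ($response === false) {
  // Transport failure (timeout, DNS, TLS). "proxy_error" is this script's own label,
  // not a Hypereal error type.
  http_response_code(502);
  $response = json_encode(['error' => ['type' => 'proxy_error', 'message' => curl_error($ch)]]);
} else {
  // Mirror the upstream status so callers still see 402, 429, and so on.
  http_response_code(curl_getinfo($ch, CURLINFO_RESPONSE_CODE));
}
header('Content-Type: application/json');
echo $response;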

Image models

| Model ID | Label | Price |
| --- | --- | --- |
| gpt-image-2 | GPT Image 2 · OpenAI | $0.030 / image |
| gpt-4o-image | GPT-4o Image · OpenAI | $0.012 / image |
| nano_banana | Nano Banana · Nano Banana | $0.024 / image |
| nano_banana_2 | Nano Banana 2 · Nano Banana | $0.050 / image |
| gemini-3.1-flash-image-preview | Gemini 3.1 Flash Image · Google | $0.050 / image |
| gemini-2.5-flash-image-preview | Gemini 2.5 Flash Image · Google | $0.024 / image |
| flux-kontext-pro | Flux Kontext Pro · Flux | $0.040 / image |
| flux-2-pro | Flux 2 Pro · Flux | $0.050 / image |
| doubao-seedream-4-0 | Doubao Seedream 4.0 · ByteDance | $0.057 / image |
| doubao-seedream-4-5 | Doubao Seedream 4.5 · ByteDance | $0.071 / image |
| doubao-seedream-5-0 | Doubao Seedream 5.0 · ByteDance | $0.063 / image |
| gemini-3.1-flash-image-preview-official | Gemini 3.1 Flash Image (Official) · Google | $0.064 / image |
| flux-kontext-max | Flux Kontext Max · Flux | $0.080 / image |
| gemini-2.5-flash-image-official | Gemini 2.5 Flash Image (Official) · Google | $0.098 / image |
| nano_banana_pro | Nano Banana Pro · Nano Banana | $0.100 / image |
| gemini-3-pro-image-preview | Gemini 3 Pro Image · Google | $0.100 / image |
| flux-2-flex | Flux 2 Flex · Flux | $0.140 / image |
| gemini-3-pro-image-preview-official | Gemini 3 Pro Image (Official) · Google | $0.216 / image |
| gemini-3-pro-image-preview-4K | Gemini 3 Pro Image 4K · Google | $0.190 / image |
| gemini-3.1-fast-imagen | Gemini 3.1 Fast Imagen · Google | $0.020 / image |
| gemini-3.1-thinking-imagen | Gemini 3.1 Thinking Imagen · Google | $0.020 / image |

08 · long-running

Video generation

Synchronous long-poll endpoint — keep the connection open until the clip is ready. Set your HTTP client timeout to 600s. Billing is per second (most models) or per clip (Veo, Vidu, Grok).

POST/api/v1/video/generations

Request body

  • model (string, required): Video model ID; see the table.
  • prompt (string, required): Text prompt describing the clip.
  • duration (number, optional): Seconds, 1–60 (default 5). Only meaningful for per_second models.
  • aspect_ratio (string, optional): e.g. 16:9, 9:16, 1:1. Provider-dependent.
  • image_url (string, optional): First-frame keyframe for image-to-video models. Some models also accept last_image_url or image; see the upstream docs for that model.
Heads-up: this is a single long-running POST. There's no job-id polling; the response body contains the rendered video URL when the upstream is done. Use a server-side runtime (Node, edge with extended duration) — browsers and most CDNs will time out before a 5-second clip finishes rendering.
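Since a clip is billed either per second or per clip, it helps to estimate the cost before holding a connection open for minutes. The sketch below is one way to do that with prices from the video models table. The `VideoPrice` union, the `per_clip` mode name, the clamping of out-of-range durations, and rounding to whole credits are all assumptions; the page only states the 1–60 range, the default of 5, and the 100 credits = $1.00 rate.

```typescript
type VideoPrice =
  | { mode: 'per_second'; usdPerSecond: number }
  | { mode: 'per_clip'; usdPerClip: number };

// Applies the documented default (5 s) and range (1-60 s).
// Clamping rather than rejecting out-of-range values is an assumption.
function clampDuration(duration?: number): number {
  if (duration === undefined) return 5;
  return Math.min(60, Math.max(1, duration));
}

// Estimated cost in credits, rounded to the nearest whole credit.
function estimateVideoCredits(price: VideoPrice, duration?: number): number {
  const usd =
    price.mode === 'per_second'
      ? price.usdPerSecond * clampDuration(duration)
      : price.usdPerClip; // duration is ignored for per-clip models
  return Math.round(usd * 100);
}

// doubao-seedance-2-0 is listed at $0.200 / sec, veo3.1-quality at $1.20 / clip.
console.log(estimateVideoCredits({ mode: 'per_second', usdPerSecond: 0.2 }, 5));
console.log(estimateVideoCredits({ mode: 'per_clip', usdPerClip: 1.2 }));
```

A 5-second doubao-seedance-2-0 clip works out to $1.00, so the account needs at least 100 credits beyond the minimum balance to avoid a 402.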
curl: text+image-to-video (bash)
curl https://hypereal.cloud/api/v1/video/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "doubao-seedance-2-0",
    "prompt": "drone shot flying over a foggy forest at dawn, cinematic",
    "duration": 5,
    "aspect_ratio": "16:9",
    "image_url": "https://example.com/keyframe.jpg"
  }'
Node: fetch (ts)
const res = await fetch('https://hypereal.cloud/api/v1/video/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'kling-v3',
    prompt: 'a cat walking on the moon',
    duration: 5,
    aspect_ratio: '16:9',
  }),
});

// Long-running: connection stays open until the upstream returns the clip.
// Set a generous timeout (300+ seconds).
const data = await res.json();
console.log(data); // contains url(s) to the rendered mp4

Video models

| Model ID | Label | Price |
| --- | --- | --- |
| wan2.6-flash | WAN 2.6 Flash · Alibaba | $0.060 / sec |
| kling-2-6 | Kling 2.6 · Kuaishou | $0.074 / sec |
| MiniMax-Hailuo-02 | MiniMax Hailuo 02 · MiniMax | $0.080 / sec |
| doubao-seedance-1-0-pro-fast | Doubao Seedance Pro Fast · ByteDance | $0.083 / sec |
| MiniMax-Hailuo-2.3 | MiniMax Hailuo 2.3 · MiniMax | $0.098 / sec |
| wan2.6 | WAN 2.6 · Alibaba | $0.100 / sec |
| kling-video-o1 | Kling Video O1 · Kuaishou | $0.134 / sec |
| kling-v3-omni | Kling V3 Omni · Kuaishou | $0.134 / sec |
| kling-v3 | Kling V3 · Kuaishou | $0.134 / sec |
| kling-v3-video | Kling V3 Video · Kuaishou | $0.134 / sec |
| doubao-seedance-1-0-pro-quality | Doubao Seedance Pro Quality · ByteDance | $0.208 / sec |
| doubao-seedance-2-0 | Doubao Seedance 2.0 · ByteDance | $0.200 / sec |
| doubao-seedance-2-0-fast | Doubao Seedance 2.0 Fast · ByteDance | $0.105 / sec |
| doubao-seedance-1-5-pro | Doubao Seedance 1.5 Pro · ByteDance | $0.216 / sec |
| Veo3.1-fast-official | Veo 3.1 Fast · Google | $0.160 / sec |
| Veo3.1-quality-official | Veo 3.1 Quality · Google | $0.320 / sec |
| veo3.1-fast | Veo 3.1 Fast · Google | $0.160 / clip |
| veo3.1-quality | Veo 3.1 Quality · Google | $1.20 / clip |
| vidu-q3-pro | Vidu Q3 Pro · Vidu | $0.020 / clip |
| grok-video-3 | Grok Video 3 · xAI | $0.160 / clip |

09 · Fish Audio

Audio — TTS, voice cloning, ASR

Three model IDs share one endpoint. The shape of the body and response depends on which one you call. The provider is Fish Audio (called directly, not via ToAPI), billed per request.

POST/api/v1/audio/generations
  • model ("audio-tts" | "audio-clone" | "audio-asr", required): Selects the operation.
  • text (string, optional): Required for audio-tts and audio-clone.
  • audio (string, URL, optional): Required for audio-asr (input) and audio-clone (reference voice ≥ 10s).
  • voice_id, format, sample_rate, … (any, optional): Extra Fish Audio params pass through.
Response shape: data: [{ url }] for TTS / clone, text (+ optional segments, duration) for ASR.
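Because the required fields and the response shape both depend on the model ID, a discriminated union is a natural way to type this endpoint. The types below are a sketch inferred from the parameter list and response shape above; `AudioRequest`, `AudioResponse`, and `describeAudioResult` are hypothetical names, and the real payloads may carry more fields.

```typescript
// One variant per model ID, so the compiler enforces the per-model required fields.
type AudioRequest =
  | { model: 'audio-tts'; text: string; voice_id?: string }
  | { model: 'audio-clone'; text: string; audio: string }
  | { model: 'audio-asr'; audio: string };

// TTS and clone return data: [{ url }]; ASR returns text plus optional extras.
type AudioResponse =
  | { data: { url: string }[] }
  | { text: string; segments?: unknown[]; duration?: number };

function describeAudioResult(res: AudioResponse): string {
  if ('text' in res) {
    return `transcript: ${res.text}`;
  }
  return `audio file: ${res.data[0]?.url ?? '(none)'}`;
}

const cloneRequest: AudioRequest = {
  model: 'audio-clone',
  text: 'This is my cloned voice.',
  audio: 'https://example.com/reference-30s.mp3',
};

console.log(cloneRequest.model, describeAudioResult({ data: [{ url: 'https://example.com/out.mp3' }] }));
```

With these types, leaving `audio` off an audio-clone request is a compile-time error rather than a 400 at runtime.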
TTS (bash)
curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-tts",
    "text": "Welcome to Hypereal. One key, every model.",
    "voice_id": "en_male_calm"
  }'
Voice clone (bash)
curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-clone",
    "text": "This is my cloned voice.",
    "audio": "https://example.com/reference-30s.mp3"
  }'
ASR, speech → text (bash)
curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-asr",
    "audio": "https://example.com/recording.mp3"
  }'

Audio models

| Model ID | Label | Price |
| --- | --- | --- |
| audio-tts | Text to Speech · Fish Audio | $0.020 / request |
| audio-clone | Voice Clone · Fish Audio | $0.020 / request |
| audio-asr | Speech Recognition · Fish Audio | $0.010 / request |

10 · Google native shape

Gemini

Accepts both Gemini-native (`contents` / `generationConfig` / `systemInstruction`) and OpenAI shapes on the same endpoint. The endpoint converts to OpenAI internally before forwarding. For most code, /v1/chat/completions with a Gemini model ID is simpler.

POST/api/v1/gemini
  • model (string, required): Any Gemini model ID; see the table.
  • contents (Content[], optional): Gemini-native messages array.
  • systemInstruction (Content, optional): Optional system message in Gemini shape.
  • generationConfig (object, optional): temperature, maxOutputTokens, etc.
  • messages (Message[], optional): OpenAI shape, accepted as an alternative to contents.

Auth header: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.
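To illustrate what converting a Gemini-native body to the OpenAI shape involves, here is a text-only sketch of such a mapping. It is not the gateway's actual conversion, which is not documented on this page; `geminiToOpenAI` and the field mapping (role "model" to "assistant", maxOutputTokens to max_tokens) are assumptions based on the two public wire formats.

```typescript
interface GeminiPart { text?: string }
interface GeminiContent { role?: 'user' | 'model'; parts: GeminiPart[] }
interface GeminiRequest {
  model: string;
  contents: GeminiContent[];
  systemInstruction?: GeminiContent;
  generationConfig?: { temperature?: number; maxOutputTokens?: number };
}

interface OpenAIMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface OpenAIRequest {
  model: string;
  messages: OpenAIMessage[];
  temperature?: number;
  max_tokens?: number;
}

// Concatenates the text parts of one Gemini content entry (non-text parts are ignored).
const joinText = (c: GeminiContent): string => c.parts.map((p) => p.text ?? '').join('');

function geminiToOpenAI(req: GeminiRequest): OpenAIRequest {
  const messages: OpenAIMessage[] = [];
  if (req.systemInstruction) {
    messages.push({ role: 'system', content: joinText(req.systemInstruction) });
  }
  for (const c of req.contents) {
    // Gemini calls the assistant turn "model"; a missing role is treated as "user".
    messages.push({ role: c.role === 'model' ? 'assistant' : 'user', content: joinText(c) });
  }
  const out: OpenAIRequest = { model: req.model, messages };
  const cfg = req.generationConfig;
  if (cfg?.temperature !== undefined) out.temperature = cfg.temperature;
  if (cfg?.maxOutputTokens !== undefined) out.max_tokens = cfg.maxOutputTokens;
  return out;
}

console.log(
  geminiToOpenAI({
    model: 'gemini-3.1-pro',
    contents: [{ role: 'user', parts: [{ text: 'Outline a launch plan.' }] }],
    generationConfig: { temperature: 0.6, maxOutputTokens: 2048 },
  }),
);
```

If your code already speaks the OpenAI shape, sending it straight to /v1/chat/completions skips this translation entirely.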

curl: Gemini-native (bash)
curl "https://hypereal.cloud/api/v1/gemini" \
  -H "x-goog-api-key: ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.1-pro",
    "contents": [
      {"role": "user", "parts": [{"text": "Outline a launch plan."}]}
    ],
    "generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
  }'
Node: fetch (ts)
// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
  method: 'POST',
  headers: {
    'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3.1-fast',
    contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
  }),
});

console.log(await res.json());

Gemini models

| Model ID | Label | Input / Output |
| --- | --- | --- |
| gemini-3-pro-official | Gemini 3 Pro · Google | $1.84 / $11.04 per MTok |
| gemini-3-pro-preview-official | Gemini 3 Pro Preview · Google | $1.84 / $11.04 per MTok |
| gemini-3-flash-official | Gemini 3 Flash · Google | $0.460 / $2.76 per MTok |
| gemini-3-flash-preview-official | Gemini 3 Flash Preview · Google | $0.460 / $2.76 per MTok |
| gemini-3.1-pro | Gemini 3.1 Pro · Google | $0.010 / $0.010 per MTok |
| gemini-3.1-pro-preview-official | Gemini 3.1 Pro Preview · Google | $1.84 / $11.04 per MTok |
| gemini-3.1-fast | Gemini 3.1 Fast · Google | $0.580 / $3.45 per MTok |
| gemini-3.1-thinking | Gemini 3.1 Thinking · Google | $0.580 / $3.45 per MTok |
| gemini-3.1-flash-lite-preview-official | Gemini 3.1 Flash Lite Preview · Google | $0.230 / $1.38 per MTok |
| gemini-2.5-pro-official | Gemini 2.5 Pro · Google | $1.15 / $9.20 per MTok |
| gemini-2.5-flash-official | Gemini 2.5 Flash · Google | $0.280 / $2.30 per MTok |
| gemini-2.5-flash-lite-official | Gemini 2.5 Flash Lite · Google | $0.100 / $0.370 per MTok |
| gemini-2.0-flash-official | Gemini 2.0 Flash · Google | $0.140 / $0.560 per MTok |
| gemini-2.0-flash-lite-official | Gemini 2.0 Flash Lite · Google | $0.070 / $0.280 per MTok |
| gemini-2.0-flash-vip | Gemini 2.0 Flash VIP · Google | $0.050 / $0.210 per MTok |
| gemini-2.5-flash-vip | Gemini 2.5 Flash VIP · Google | $0.110 / $0.870 per MTok |
| gemini-2.5-pro-vip | Gemini 2.5 Pro VIP · Google | $0.440 / $3.45 per MTok |
| gemini-3-flash-preview-vip | Gemini 3 Flash Preview VIP · Google | $0.180 / $1.04 per MTok |

11

Errors & rate limits

All errors are JSON of the form { error: { type, message } }. Rate limits are evaluated per user, not per key — multiple keys share the same quota.

  • 401 authentication_error: Missing, malformed (no ck_ prefix), expired, or inactive key.
  • 402 insufficient_credits: Balance under 200 credits ($2), or the request's estimated cost exceeds your balance.
  • 403 access_denied: Your cumulative top-up tier doesn't unlock that model (image/video/audio require $19.99+; some flagship LLMs require higher tiers).
  • 429 rate_limit_error / spending_limit_error: Per-user hourly cap (1000/h for chat, 500/h for images, 200/h for video and audio) or a per-key spending limit you set. X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers are returned on rate limit responses.
  • 400 invalid_request_error: Missing model, unknown model ID (the response includes available_models), or wrong endpoint for the format (e.g. an Anthropic model on /chat/completions).
  • 502 api_error: All upstreams failed for that model. The message includes the last upstream's error string.
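Only some of these errors are worth retrying. The sketch below shows one possible client-side policy: wait for the reset time on rate_limit_error, back off on 502, and give up on everything else. It assumes X-RateLimit-Reset carries a Unix timestamp in seconds, which this page does not specify; `retryDelayMs` and the backoff constants are hypothetical.

```typescript
interface ApiError {
  status: number;
  type: string; // the error.type field from the JSON body
}

// Exponential backoff: 1 s, 2 s, 4 s, ... capped at 30 s.
function backoff(attempt: number): number {
  return Math.min(30_000, 1000 * 2 ** attempt);
}

// Returns how long to wait before retrying, or null when retrying cannot help.
// resetEpochSeconds is the parsed X-RateLimit-Reset header (ASSUMED to be epoch seconds).
function retryDelayMs(
  err: ApiError,
  attempt: number,
  resetEpochSeconds?: number,
  nowMs: number = Date.now(),
): number | null {
  if (err.status === 429 && err.type === 'rate_limit_error') {
    if (resetEpochSeconds !== undefined) {
      return Math.max(0, resetEpochSeconds * 1000 - nowMs);
    }
    return backoff(attempt);
  }
  if (err.status === 502) return backoff(attempt);
  // 400, 401, 402, 403 and spending_limit_error need a change on the caller's side.
  return null;
}

console.log(retryDelayMs({ status: 502, type: 'api_error' }, 2));
console.log(retryDelayMs({ status: 429, type: 'spending_limit_error' }, 0));
```

Note that a 429 with spending_limit_error is not retried: the per-key cap will not clear on its own until the limit is raised or the period rolls over.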

DEVELOPER

ComfyUI as API

Deploy a ComfyUI container as a Hypereal-managed GPU endpoint. It gets the same per-second billing, auto-scaling, and webhook delivery as any other deployment, and you control the workflow graph and the model weights.

Heads up — flow changed. The legacy /comfy workflow-JSON paster and /v1/comfy/* routes were retired. ComfyUI now ships as a regular Deployment — you bring a Docker image (e.g. runpod/worker-comfyui or your own), we mount it on real GPUs.
POST/v1/gpu/run/{slug}

Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.

Submit a job (bash)
curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a cinematic portrait of an astronaut",
      "seed": 42,
      "workflow_overrides": { "Sampler.steps": 30 }
    }
  }'
Submit response (json)
{
  "job_id": "K3uA7Pq9xLm4",
  "status": "queued",
  "provider_job_id": "..."
}
GET/v1/gpu/jobs/{id}

Poll for status. We live-poll the worker on each request, so you see queued → running → succeeded in near real time. On succeeded, credits settle to the actual GPU-seconds; on failed, we refund the hold. Pin a webhookUrl on the deployment to skip polling.
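A polling loop for this endpoint can be sketched as follows. The HTTP call and the delay are injected as functions so the loop stays testable; `pollUntilDone`, the 2-second interval, and the 150-attempt limit are hypothetical choices, and only the queued, running, succeeded, and failed statuses come from the text above.

```typescript
interface JobStatus {
  job_id: string;
  status: string; // queued | running | succeeded | failed
  output?: unknown;
}

const TERMINAL = ['succeeded', 'failed'];

// Calls getJob until the job reaches a terminal status or the attempt budget runs out.
async function pollUntilDone(
  getJob: () => Promise<JobStatus>,
  sleep: (ms: number) => Promise<void>,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    if (TERMINAL.includes(job.status)) return job;
    await sleep(intervalMs);
  }
  throw new Error(`job still not finished after ${maxAttempts} polls`);
}
```

In real use, `getJob` would wrap a GET to /v1/gpu/jobs/{id} with the Authorization header, and `sleep` would wrap setTimeout in a promise.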

Status response (json)
{
  "job_id": "K3uA7Pq9xLm4",
  "status": "succeeded",
  "output": { "images": ["data:image/png;base64,..."] },
  "executionMs": 18420,
  "creditsCharged": 56
}
See your deployments (bash)
# List
curl https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY"

# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "my-comfy-workflow",
    "name": "My Comfy",
    "dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
    "gpuTypes": "ADA_48_PRO,AMPERE_80"
  }'
Workflow setup

Open /infra/deployments/new: pick a GPU tier, point at your ComfyUI Docker image (custom builds with your weights and custom nodes pre-baked work fine), set min/max workers and idle timeout. Your endpoint goes live in 60s.

Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.

ENTERPRISE

Gateway features

Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.

Cost Dashboard

Spend, by model, in real time

Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:

GET /api/api-usage/export?days=30
Authorization: session cookie

→ hypereal-usage-2026-05-10.csv
Budget Alerts

Per-key monthly cap, with email guardrails

Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.

POST /api/api-keys
{
  "name": "prod-eu",
  "spendingLimit": 50000   // 500 USD / month
}
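The 80% and 100% thresholds can be mirrored client-side, for example to show a banner in an internal dashboard. This is a minimal sketch; `budgetState` is a hypothetical helper, and treating both thresholds as inclusive is an assumption.

```typescript
type BudgetState = 'ok' | 'warning' | 'capped';

// spendingLimit is in credits, as in the request body above (50000 = 500 USD).
function budgetState(spentCredits: number, spendingLimit: number): BudgetState {
  if (spentCredits >= spendingLimit) return 'capped';
  // Integer comparison (spent / limit >= 4 / 5) avoids floating-point error at exactly 80%.
  if (spentCredits * 5 >= spendingLimit * 4) return 'warning';
  return 'ok';
}

console.log(budgetState(41_250, 50_000));
```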
Request Logs

Every call, searchable

Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:

GET /api/api-usage?days=30&limit=1000

{
  logs: [...],
  costByModel: [...],
  topExpensiveRequests: [...]
}
Multi-Provider Failover

Outages don't reach your users

Every supported model has a fallback chain. On 5xx, timeout, or 429 we transparently retry the next provider with exponential backoff. You always get a result or a single, clean error — never a flap.

primary:  seedance-2-0-turbo-t2v   (region us-east)
fallback: seedance-2-0-t2v         (region us-west)
fallback: seedance-2-0             (region eu-central)
retries:  1 per target, exp backoff
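One way to read the chain above is as an ordered attempt plan. The sketch below expands a list of targets into that plan. It interprets "1 retry per target" as one initial attempt plus one retry; the 500 ms base delay and the `attemptPlan` / `shouldFailover` names are hypothetical, and the gateway's real schedule is not documented here.

```typescript
interface Attempt {
  target: string;
  attempt: number; // 0 = first try on this target
  delayBeforeMs: number;
}

// Expands a fallback chain into the ordered list of attempts.
function attemptPlan(targets: string[], retriesPerTarget = 1, baseDelayMs = 500): Attempt[] {
  const plan: Attempt[] = [];
  for (const target of targets) {
    for (let attempt = 0; attempt <= retriesPerTarget; attempt++) {
      // The first try on each target is immediate; retry k waits base * 2^(k-1).
      const delayBeforeMs = attempt === 0 ? 0 : baseDelayMs * 2 ** (attempt - 1);
      plan.push({ target, attempt, delayBeforeMs });
    }
  }
  return plan;
}

// The conditions named above: 5xx, timeout, or 429 move on to the next attempt.
function shouldFailover(outcome: number | 'timeout'): boolean {
  return outcome === 'timeout' || outcome === 429 || (outcome >= 500 && outcome <= 599);
}

console.log(attemptPlan(['seedance-2-0-turbo-t2v', 'seedance-2-0-t2v', 'seedance-2-0']));
```

For the three-target chain shown above, this yields at most six upstream calls before the single error is returned to the client.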
Smart Routing

Pick by intent, we pick the cheapest qualified model

Send intent instead of model and we'll route to the cheapest provider in that capability bucket — without giving up determinism: pin a model whenever you want and we'll honor it exactly.

POST /v1/images/generate
{
  "intent": "text-to-image-fast",   // ← we'll pick the cheapest qualified model
  "prompt": "a quiet sunrise over Mt Fuji"
}

# Or pin explicitly:
{ "model": "nano-banana-t2i", "prompt": "..." }
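If you build these bodies in code, a small helper can keep the two modes apart. This is a sketch under one assumption that the text above does not confirm: that a request carries either `intent` or `model`, never both. The function name is hypothetical.

```python
from typing import Optional


def image_request(
    prompt: str,
    *,
    intent: Optional[str] = None,
    model: Optional[str] = None,
) -> dict:
    """Build a body that either routes by intent or pins an explicit model."""
    # Assumption: exactly one of the two selectors is sent per request.
    if (intent is None) == (model is None):
        raise ValueError("pass exactly one of intent or model")
    body = {"prompt": prompt}
    if model is not None:
        body["model"] = model  # pinned: honored exactly
    else:
        body["intent"] = intent  # routed: cheapest qualified model
    return body
```

`image_request("a quiet sunrise over Mt Fuji", intent="text-to-image-fast")` yields the first body above; passing `model="nano-banana-t2i"` yields the pinned form.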

SERVERLESS

GPU models

Hosted serverless GPU inference at /v1/gpu/{slug}. One API key, credit billing, audit log, and webhooks. Same wallet and dashboard as your LLM calls.

1. Pick a model

Browse the live catalog at /gpu-recommend. Each model lists its slug, per-call or per-second credit cost, and the maximum execution time per call.
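For per-second models, the listed rate and maximum execution time give you a worst-case cost before you call anything, because billing is capped at the model's maxSeconds (see the settlement note at the end of this section). A minimal sketch, with hypothetical function names and illustrative numbers: the rate of 2 credits per second and the 300 s cap are not documented values, the rate is chosen only because it matches the async example later in this section (156000 ms billed as 312 credits). Rounding is not specified, so none is applied.

```python
def billed_seconds(duration_ms: int, max_seconds: float) -> float:
    """Execution time that counts toward billing, capped at the model's maxSeconds."""
    return min(duration_ms / 1000, max_seconds)


def per_second_cost(duration_ms: int, credits_per_second: float, max_seconds: float) -> float:
    """Credits for one completed call on a per-second model."""
    return billed_seconds(duration_ms, max_seconds) * credits_per_second


def worst_case_cost(credits_per_second: float, max_seconds: float) -> float:
    """Upper bound on what a single call can cost."""
    return credits_per_second * max_seconds
```

Comparing `worst_case_cost` against your remaining credits is a cheap guard before submitting a long job.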

2. Sync invocation (small jobs)

Short-running models return the output inline.

curl -X POST https://api.hypereal.cloud/v1/gpu/sdxl \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a tabby cat astronaut"}}'

→ { "id": "...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../out.png"],
    "costCredits": 50,
    "durationMs": 4210 }

3. Async invocation (long jobs)

Long-running models queue and return a job id immediately with a 202. Poll, or wait for our cron + webhook poller to settle the job.

# Submit
POST /v1/gpu/wan-video
{ "input": { "prompt": "drone over Tokyo, neon, rain", "seconds": 5 } }
→ 202 { "id": "abc...", "status": "queued", "pollUrl": "/v1/gpu/jobs/abc..." }

# Poll
GET /v1/gpu/jobs/abc...
→ { "id": "abc...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../clip.mp4"],
    "costCredits": 312,
    "durationMs": 156000 }

Failed and timed-out jobs auto-refund the credit reservation. Per-second billing reconciles on completion using the model's reported execution time, capped at the model's maxSeconds.
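The submit-then-poll flow above can be wrapped in a small loop. This is a sketch, not an official client: it uses the host from the sync curl example, treats `"running"` as an assumed intermediate status alongside the documented `"queued"`, and returns on any other status so that failed and timed-out jobs surface to the caller too. The interval and timeout defaults are arbitrary.

```python
import json
import time
import urllib.request
from typing import Callable, Sequence

BASE_URL = "https://api.hypereal.cloud"  # host taken from the sync curl example


def fetch_job(poll_url: str, api_key: str) -> dict:
    """GET the job document at the pollUrl returned by the 202 response."""
    request = urllib.request.Request(
        BASE_URL + poll_url,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(
    poll_url: str,
    fetch: Callable[[str], dict],
    pending: Sequence[str] = ("queued", "running"),  # "running" is an assumption
    interval: float = 2.0,
    max_interval: float = 30.0,
    timeout: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job leaves a pending status, backing off between polls."""
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch(poll_url)
        if job.get("status") not in pending:
            return job  # succeeded, or any terminal failure status
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job at {poll_url} still pending after {timeout}s")
        sleep(delay)
        delay = min(delay * 1.5, max_interval)
```

In practice you would call it as `wait_for_job(poll_url, lambda url: fetch_job(url, api_key))`, where `poll_url` is the `pollUrl` value from the submit response. Taking `fetch` as a parameter keeps the loop testable without network access.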

ENTERPRISE

Teams, RBAC & SSO

Organizations, five built-in roles, SAML and OIDC single sign-on. Built so security and procurement can sign off without a custom rider.

Organizations

Org-scoped keys, audit log, billing

Every API key, webhook, ComfyUI workflow, and GPU template can belong to an organization instead of an individual. Teammates share one budget, one audit trail, and one invoice. Personal keys keep working alongside.

POST /api/orgs
{
  "name": "Acme Inc"
}
→ { id, slug, role: "owner" }

Five built-in roles

Owner · Admin · Developer · Billing · Viewer

  • Owner — everything, including delete-org
  • Admin — manage members, keys, SSO, webhooks
  • Developer — create/delete API keys, manage workflows + GPUs
  • Billing — view + manage payments and audit log
  • Viewer — read-only access to keys, billing, audit

SAML 2.0

Configure your IdP in 3 steps

  1. Create a SAML app in Okta / Azure AD / Auth0 / Google.
  2. Set ACS URL to https://hypereal.cloud/api/auth/sso/<providerId>
  3. Paste the IdP metadata XML into /settings/organization → SSO.

Set the email-domain claim (e.g. acme.com) and the login form will auto-route corporate emails to your IdP — no password prompt.
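Domain-based routing of this kind usually reduces to a lookup on the part of the address after the `@`. The sketch below illustrates the idea only; it is not Hypereal's login code, and the function names, the mapping shape, and the `okta-acme` provider id are hypothetical.

```python
from typing import Dict, Optional


def email_domain(email: str) -> Optional[str]:
    """Return the lower-cased domain of an address, or None if it is malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def route_login(email: str, sso_domains: Dict[str, str]) -> Optional[str]:
    """Return the SSO provider id for a corporate email, or None for password login."""
    domain = email_domain(email)
    if domain is None:
        return None
    return sso_domains.get(domain)
```

With `{"acme.com": "okta-acme"}` as the mapping, `dana@acme.com` routes to the IdP, while an address on any other domain falls through to the regular login form.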

OIDC

Issuer + client credentials

Drop in your issuer URL, client id, and client secret. We fetch the /.well-known/openid-configuration on save and surface a green check when the IdP is reachable.

POST /api/orgs/{id}/sso
{
  "type": "oidc",
  "issuer": "https://idp.acme.com",
  "clientId": "...",
  "clientSecret": "...",
  "domain": "acme.com"
}
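You can run the same reachability check yourself before saving. The sketch below follows the OpenID Connect Discovery rules (the document lives at the issuer plus `/.well-known/openid-configuration`, and its `issuer` value must match the one you configured); it is a local pre-check under those rules, not a description of what Hypereal validates. Function names are hypothetical.

```python
import json
import urllib.request
from typing import List

# Metadata fields that OpenID Connect Discovery 1.0 marks as REQUIRED.
REQUIRED_FIELDS = (
    "issuer",
    "authorization_endpoint",
    "jwks_uri",
    "response_types_supported",
    "subject_types_supported",
    "id_token_signing_alg_values_supported",
)


def discovery_url(issuer: str) -> str:
    """Build the discovery document URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def validate_discovery(issuer: str, doc: dict) -> List[str]:
    """Return a list of problems; an empty list means the document looks usable."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS if field not in doc]
    if "issuer" in doc and doc["issuer"] != issuer:
        problems.append("issuer mismatch")
    return problems


def fetch_discovery(issuer: str) -> dict:
    """Download and parse the discovery document (network call)."""
    with urllib.request.urlopen(discovery_url(issuer), timeout=10) as response:
        return json.load(response)
```

Running `validate_discovery(issuer, fetch_discovery(issuer))` against your IdP and getting an empty list is a reasonable sign the issuer URL is the right one to paste in.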

12

Pricing & credits

One unit: 100 credits = $1.00 USD. LLMs bill per token using each model's input / output rate. Media models bill per image, per second, or per clip.

LLMs

Tokens × per-MTok rate. Streaming requests are billed off the final usage chunk.
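As a worked example of that formula, the sketch below converts a usage record into credits. It assumes rates are quoted in USD per million tokens and that usage arrives in the OpenAI-style `prompt_tokens` / `completion_tokens` fields; the 3.00 and 15.00 rates in the usage note are made up for illustration and are not Hypereal prices.

```python
CREDITS_PER_USD = 100  # 100 credits = $1.00


def llm_cost_credits(
    input_tokens: int,
    output_tokens: int,
    input_usd_per_mtok: float,
    output_usd_per_mtok: float,
) -> float:
    """Tokens times per-MTok rate, summed over input and output, in credits."""
    usd = (
        input_tokens * input_usd_per_mtok + output_tokens * output_usd_per_mtok
    ) / 1_000_000
    return usd * CREDITS_PER_USD


def cost_from_usage(usage: dict, input_usd_per_mtok: float, output_usd_per_mtok: float) -> float:
    """Price an OpenAI-style usage object, e.g. the one on the final streamed chunk."""
    return llm_cost_credits(
        usage["prompt_tokens"],
        usage["completion_tokens"],
        input_usd_per_mtok,
        output_usd_per_mtok,
    )
```

At the illustrative rates of 3.00 in and 15.00 out, a call with 1,000,000 input tokens and 500,000 output tokens comes to 10.50 USD, or 1050 credits.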

Images

Flat per generation × actual n returned.

Video & audio

Per second (most video), per clip (Veo, Vidu, Grok), or per request (Fish Audio).

Claude, GPT, Gemini, and select image models (GPT Image 2, Nano Banana) are priced below the direct providers' own rates. Video, audio, and other media models are billed at standard rates.

© Copyright 2026. All Rights Reserved.