v1 · Stable · Claude / GPT / Gemini below direct pricing

Hypereal API Reference

One ck_-prefixed API key. OpenAI-compatible REST. Use it as-is with Claude Code, Codex CLI, Cursor, the OpenAI SDK, and the Anthropic SDK, or call it directly with curl. Chat, image, video, audio, and code agents all sit behind a single base URL.

Coding Credits · limited launch

Claude Sonnet 4.6 · GPT-5.5 · Gemini 3.5 — pay as you go, no subscription

Enterprise API uses a separate managed API surface.

This page documents the standard API paths. For managed Enterprise API models, capacity controls, and insurance, use the Enterprise overview and Enterprise API docs.

01 · Get started in 90 seconds

Quick start

Get a key, point your client at hypereal.cloud, and ship. Authentication and the request format are OpenAI-compatible, so most SDKs work after changing only the base URL.

1. Get a key

Top up at least $2 (200 credits) and create a key at /manage-api-keys. Keys start with ck_.

2. Point your client

Base URL: https://hypereal.cloud/api/v1

3. Send a request

The auth header is Authorization: Bearer ck_... and you can reuse the OpenAI request body you already know.

curl (bash)

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Say hi in one word."}]
  }'

Node: OpenAI SDK (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://hypereal.cloud/api/v1',
});

const completion = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Say hi in one word.' }],
});

console.log(completion.choices[0].message.content);

Cache support

For coding agents, start with claude-sonnet-4-6 and use Claude Code or another Anthropic-compatible client that sends cache_control. Hypereal supports cache_control caching and Hypereal Cache. Hypereal Cache is on by default and can sharply reduce token consumption for repeated coding-agent context. You can set hypereal.cache to "auto" explicitly, or omit it for the same default.

SDK

Hypereal SDK

Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.

Install

Published as hypereal-sdk on npm.

Resources

Use client.images.generate(), chat, responses, jobs and storage.

Landing page

See the full SDK overview at /sdk.

Install (bash)

pnpm add hypereal-sdk

Quickstart (ts)

import { Hypereal } from 'hypereal-sdk';

const client = new Hypereal({
  apiKey: process.env.HYPEREAL_API_KEY!,
});

const image = await client.images.generate({
  model: 'gemini-3-1-flash-t2i',
  prompt: 'A cinematic portrait in neon light',
  aspect_ratio: '16:9',
});

console.log(image);

Storage upload (ts)

const object = await client.storage.uploadFile(file, {
  filename: 'training-image.png',
  contentType: 'image/png',
  kind: 'dataset',
});

const listed = await client.storage.list({ kind: 'dataset' });

Authentication

Every request requires a ck_-prefixed key. Three header formats are supported so that every SDK is covered.

Authorization

header

Required. Bearer ck_... Used by the OpenAI SDK, Codex CLI, and Cursor.

x-api-key

header

Required. ck_... Used by the Anthropic SDK and Claude Code (on /v1/messages).

x-goog-api-key

header

Required. ck_... Used by the Google Gemini SDK and native format (on /v1/gemini; ?key=ck_... also works).

Keys are bound to a user. Per-key spending limits can be set at /manage-api-keys. Rate limits are evaluated per user, not per key.
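The three header formats can be wrapped in one small helper. This is a minimal sketch: `WireFormat`, `HEADER_BUILDERS`, and `authHeaders` are hypothetical names chosen for illustration, not part of any Hypereal SDK.

```typescript
type WireFormat = 'openai' | 'anthropic' | 'gemini';

// Header name and value per wire format, following the header list above.
const HEADER_BUILDERS: Record<WireFormat, (key: string) => [string, string]> = {
  openai: (key) => ['Authorization', `Bearer ${key}`],
  anthropic: (key) => ['x-api-key', key],
  gemini: (key) => ['x-goog-api-key', key],
};

// Hypothetical helper: builds request headers for the chosen wire format.
function authHeaders(format: WireFormat, apiKey: string): Record<string, string> {
  if (!apiKey.startsWith('ck_')) {
    throw new Error('expected a key with the ck_ prefix');
  }
  const [name, value] = HEADER_BUILDERS[format](apiKey);
  return { [name]: value, 'Content-Type': 'application/json' };
}
```

Because rate limits are evaluated per user, switching header formats or keys does not change the quota a request counts against.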

03 · OpenAI-compatible

Chat Completions

The primary endpoint. It uses the OpenAI Chat Completions wire format and serves GPT, Gemini, Qwen, DeepSeek, GLM, and every other non-Anthropic LLM.

POST /api/v1/chat/completions

Request body

model

string

Required. Any non-Anthropic model ID; see the table below. Anthropic models return 400; use /v1/messages for them instead.

messages

Message[]

Required. Standard OpenAI message array (role, content).

stream

boolean

Optional. Defaults to false. When true, the response is an SSE stream and usage is included in the final chunk.

max_tokens

number

Optional. Forwarded upstream. Provider-specific defaults apply.

temperature, top_p, tools, …

any

Optional. Other OpenAI parameters pass through unchanged.

Pricing

Billed per token at each model's input/output rate. 100 credits = $1.00. The minimum balance required to call the endpoint is 200 credits ($2.00).
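To make the per-token arithmetic concrete, here is a rough estimator. It is a sketch under assumptions: `TokenPrice`, `estimateCostUsd`, and `usdToCredits` are hypothetical names, and any rounding Hypereal applies when it settles a request is not specified here, so treat the result as an estimate.

```typescript
const CREDITS_PER_USD = 100; // from the text: 100 credits = $1.00

interface TokenPrice {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Estimated USD cost of one request: tokens × per-MTok rate, input plus output.
function estimateCostUsd(inputTokens: number, outputTokens: number, price: TokenPrice): number {
  const inputCost = (inputTokens / 1_000_000) * price.inputPerMTok;
  const outputCost = (outputTokens / 1_000_000) * price.outputPerMTok;
  return inputCost + outputCost;
}

function usdToCredits(usd: number): number {
  return usd * CREDITS_PER_USD;
}

// Example with the gpt-5.5 rates from the model table ($0.420 / $2.52 per MTok).
const gpt55Price: TokenPrice = { inputPerMTok: 0.42, outputPerMTok: 2.52 };
const exampleUsd = estimateCostUsd(1_000_000, 500_000, gpt55Price);
console.log(exampleUsd.toFixed(2), usdToCredits(exampleUsd).toFixed(0));
```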

curl: streaming (bash)

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "Two-line haiku about caches."}
    ],
    "stream": true,
    "max_tokens": 256
  }'

Node: OpenAI SDK streaming (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
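When the stream is read without an SDK, the text and the final usage block have to be pulled out of the raw SSE lines. The sketch below assumes the stream follows the usual OpenAI framing (`data: {json}` lines terminated by `data: [DONE]`) and that the whole body has already been buffered into a string; `summarizeSse` and `StreamSummary` are hypothetical names.

```typescript
interface StreamSummary {
  text: string;   // concatenated delta content
  usage: unknown; // usage object from the last chunk that carried one, else null
}

// Parses a buffered SSE body and collects the streamed text plus usage.
function summarizeSse(raw: string): StreamSummary {
  let text = '';
  let usage: unknown = null;
  for (const line of raw.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank lines and keepalive comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') break;
    const chunk = JSON.parse(payload) as {
      choices?: Array<{ delta?: { content?: string } }>;
      usage?: unknown;
    };
    text += chunk.choices?.[0]?.delta?.content ?? '';
    if (chunk.usage != null) usage = chunk.usage;
  }
  return { text, usage };
}
```

A production client would parse incrementally as bytes arrive; buffering first keeps the sketch short.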

OpenAI and provider-compatible models

Model ID | Label | Input / output (per MTok)
gpt-5.5 | GPT-5.5 · OpenAI | $0.420 / $2.52
gpt-5.5-instant | GPT-5.5 Instant · OpenAI | $0.420 / $2.52
gpt-5.5-pro | GPT-5.5 Pro · OpenAI | $2.07 / $12.42
gpt-5.4 | GPT-5.4 · OpenAI | $0.240 / $1.09
gpt-5.4-mini | GPT-5.4 Mini · OpenAI | $0.040 / $0.250
deepseek-v4-pro | DeepSeek V4 Pro · DeepSeek | $0.490 / $0.990
deepseek-v4-flash | DeepSeek V4 Flash · DeepSeek | $0.160 / $0.330
deepseek-v3.2 | DeepSeek V3.2 · DeepSeek | $0.230 / $0.920
kimi-k2.6 | Kimi K2.6 · Moonshot | $1.07 / $4.44
kimi-k2.5 | Kimi K2.5 · Moonshot | $0.460 / $2.42
glm-5.1 | GLM-5.1 · Zhipu | $0.990 / $3.94
glm-5 | GLM-5 · Zhipu | $0.460 / $2.07
qwen3-max | Qwen 3 Max · Alibaba | $0.810 / $3.22
qwen3.5-plus | Qwen 3.5 Plus · Alibaba | $0.460 / $2.76
qwen3.5-flash | Qwen 3.5 Flash · Alibaba | $0.140 / $1.38
MiniMax-M2.5 | MiniMax M2.5 · MiniMax | $0.240 / $0.970

04 · Anthropic-compatible

Messages

Uses the Anthropic /v1/messages wire format and supports extended thinking, multi-upstream failover, and a 15-second SSE keepalive. Use it with Claude Code, OpenCode, OpenClaw, and the official Anthropic SDK.

POST /api/v1/messages

Request body

model

string

Required. claude-sonnet-4-6, claude-opus-4-6, or claude-haiku-4-5. Older Anthropic IDs (claude-sonnet-4-5-20250929, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022) are mapped automatically to the latest equivalent model.

messages

Message[]

Required. Anthropic-format messages, including image and tool_use blocks.

max_tokens

number

Required. Mandatory in the Anthropic spec.

cache_control

{ type: "ephemeral" }

Optional. Add it to stable system, tools, or text content blocks for Anthropic prompt caching. Hypereal defaults a cache breakpoint when omitted and reports cache usage in response metadata.

hypereal.cache

"auto" | false

Optional. Hypereal Cache is on by default. Use "auto" to make the default explicit for repeated requests, or false to bypass it for a request.

thinking

{ type: "enabled" | "adaptive", budget_tokens?: number }

Optional. Extended thinking. budget_tokens caps the reasoning trace. The endpoint sends an SSE ping every 15 seconds so that proxies do not drop the connection during long thinking streams.

stream, system, tools, …

any

Optional. Passed through exactly as in the Anthropic SDK.

When a request is retried on a failover upstream, stale thinking blocks with invalid signatures are filtered out automatically, so you do not need to handle them yourself.
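As an illustration of the thinking parameter, the sketch below adds a thinking budget to a Messages request body. The types and the `withThinking` helper are hypothetical. The check that budget_tokens stays below max_tokens reflects how Anthropic's own API documents extended thinking; whether Hypereal enforces the same rule is an assumption.

```typescript
interface ThinkingConfig {
  type: 'enabled' | 'adaptive';
  budget_tokens?: number;
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  messages: Array<{ role: 'user' | 'assistant'; content: string }>;
  thinking?: ThinkingConfig;
  hypereal?: { cache: 'auto' | false };
}

// Returns a copy of the request with extended thinking enabled.
function withThinking(req: MessagesRequest, budgetTokens: number): MessagesRequest {
  if (budgetTokens >= req.max_tokens) {
    // Assumption: the budget must leave room for the visible answer.
    throw new Error('budget_tokens should be smaller than max_tokens');
  }
  return { ...req, thinking: { type: 'enabled', budget_tokens: budgetTokens } };
}

const baseRequest: MessagesRequest = {
  model: 'claude-sonnet-4-6',
  max_tokens: 4096,
  messages: [{ role: 'user', content: 'Plan a 3-step refactor of a Next.js app.' }],
};
console.log(JSON.stringify(withThinking(baseRequest, 2048)));
```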

curl: prompt caching (bash)

curl https://api.hypereal.cloud/v1/messages \
  -H "x-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "system": [{
      "type": "text",
      "text": "You are a senior TypeScript refactoring assistant.",
      "cache_control": {"type": "ephemeral"}
    }],
    "messages": [
      {"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
    ],
    "hypereal": {"cache": "auto"}
  }'

Node: Anthropic SDK (ts)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://api.hypereal.cloud',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [{
    type: 'text',
    text: 'You are a senior TypeScript refactoring assistant.',
    cache_control: { type: 'ephemeral' },
  }],
  hypereal: { cache: 'auto' },
  messages: [{ role: 'user', content: 'Hello, Claude.' }],
});

console.log(msg.content);

Anthropic models

Model ID | Label | Input / output (per MTok)
claude-opus-4-7 | Claude Opus 4.7 · Anthropic | $3.40 / $16.96
claude-opus-4-6 | Claude Opus 4.6 · Anthropic | $3.40 / $16.96
claude-sonnet-4-6 | Claude Sonnet 4.6 · Anthropic | $0.680 / $3.40
claude-haiku-4-5 | Claude Haiku 4.5 · Anthropic | $0.130 / $0.650
managed-claude-opus-4-7-max | Claude Opus 4.7 · Hypereal Managed | $5.25 / $26.25
managed-claude-opus-4-6-max | Claude Opus 4.6 (1M) · Hypereal Managed | $5.25 / $26.25
managed-claude-opus-4-5-max | Claude Opus 4.5 · Hypereal Managed | $5.25 / $26.25
managed-claude-sonnet-4-6-max | Claude Sonnet 4.6 · Hypereal Managed | $3.15 / $15.75
managed-claude-sonnet-4-5-max | Claude Sonnet 4.5 · Hypereal Managed | $3.15 / $15.75
managed-claude-haiku-4-5-max | Claude Haiku 4.5 · Hypereal Managed | $1.05 / $5.25

05 · OpenAI Responses API

Responses

OpenAI's newer Responses API (used by Codex CLI's `wire_api = responses` mode and the OpenAI Agents SDK). It uses the same authentication as chat/completions, and the request body takes `input` instead of `messages`.

POST /api/v1/responses

Notes

Anthropic models return 400; they belong on /v1/messages.
Both streaming and non-streaming requests are billed from response.usage.input_tokens / output_tokens.
Some upstreams always emit SSE. The endpoint detects this and streams transparently even when stream is false.
Multi-upstream failover is supported. Set a long client timeout (300 seconds or more).
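Code that already builds a chat `messages` array can be adapted to the Responses body with a small conversion. This sketch assumes the endpoint accepts the same `input` shapes as OpenAI's Responses API (a plain string, or an array of role/content items); `toResponsesBody` and the interfaces are hypothetical names.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ResponsesBody {
  model: string;
  input: string | ChatMessage[];
  stream?: boolean;
}

// Converts a chat-style message list into a Responses request body.
function toResponsesBody(model: string, messages: ChatMessage[]): ResponsesBody {
  const [only] = messages;
  // A lone user message can be sent as a plain string.
  if (messages.length === 1 && only !== undefined && only.role === 'user') {
    return { model, input: only.content };
  }
  return { model, input: messages };
}

console.log(toResponsesBody('gpt-5.3-codex', [{ role: 'user', content: 'Refactor this file.' }]));
```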

curl (bash)

curl https://hypereal.cloud/api/v1/responses \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a TypeScript function that debounces a callback.",
    "stream": true
  }'

Node: OpenAI SDK responses.create (ts)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const response = await client.responses.create({
  model: 'gpt-5.3-codex',
  input: 'Refactor this file into smaller modules.',
});

console.log(response.output_text);

Codex-optimized models

Model ID | Label | Input / output (per MTok)
gpt-5.3-codex | GPT-5.3 Codex · OpenAI | $0.090 / $0.680
gpt-5.3-codex-spark | GPT-5.3 Codex Spark · OpenAI | $0.090 / $0.680

06 · Codex CLI / Codex Desktop

Codex CLI

Codex points its `wire_api = responses` provider at /api/v1/responses. The CLI appends `/responses` to the base URL automatically, so set the base URL exactly as shown.

POST /api/v1/responses

~/.codex/config.toml (toml)

# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5.3-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.cloud/api/v1"
wire_api = "responses"
env_key = "HYPEREAL_API_KEY"

Then export your key:
export HYPEREAL_API_KEY=ck_...

Run codex as usual. Everything Codex sends (the full reasoning stream, tool calls, file edits) is proxied unchanged. Billing is based on the standard input_tokens / output_tokens usage block.

The same setup also works with OpenCode, Claude Code (uses /v1/messages), Cursor (uses /v1/chat/completions), and Gemini CLI (uses /v1/gemini).

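Image generation

OpenAI-compatible /images/generations format. Synchronous: the endpoint returns the image URL (or base64) once the upstream finishes. Billed per image; `n` is clamped to 1–10.

POST /api/v1/images/generations

Request body

model

string

Required. Image model ID; see the table.

prompt

string

Required. The text prompt. For models that support editing, include reference images through the model's native parameters (e.g. image, reference_images).

n

number

Optional. Number of images, 1–10 (default 1).

size

string

Optional. Passed through as-is (e.g. 1024x1024, 1536x1024). Varies by provider.

quality, style, …

any

Optional. Extra parameters are forwarded upstream unchanged.

Tier requirement: image generation requires the Starter tier (cumulative top-ups of $10 or more). If your balance cannot cover the expected creditsPerGeneration × n, the endpoint returns 402.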
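The balance check described above can be approximated client-side before sending a request. This is a sketch, not the server's logic: `clampN`, `expectedImageCredits`, and `canAfford` are hypothetical helpers, and the per-image price is taken from the image model table as an input.

```typescript
const CREDITS_PER_USD = 100; // 100 credits = $1.00

// n is clamped to 1–10, mirroring the documented server behavior.
function clampN(n: number): number {
  return Math.min(10, Math.max(1, Math.floor(n)));
}

// Expected credits for one call: per-image price × clamped n.
function expectedImageCredits(usdPerImage: number, n: number): number {
  return usdPerImage * CREDITS_PER_USD * clampN(n);
}

// False means the call would likely be rejected with 402.
function canAfford(balanceCredits: number, usdPerImage: number, n: number): boolean {
  return balanceCredits >= expectedImageCredits(usdPerImage, n);
}

// nano_banana_pro is listed at $0.100 / image.
console.log(expectedImageCredits(0.1, 3), canAfford(25, 0.1, 3));
```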

Use an image model ID here, not a chat model ID. Valid examples include gpt-image-2, nano_banana_pro, and gemini-3-1-flash-t2i. Use gpt-5.5 only with chat, messages, or responses endpoints.

curl (bash)

curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana_pro",
    "prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
    "n": 1,
    "size": "1024x1024"
  }'

Node: fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'nano_banana_pro',
    prompt: 'a chrome teapot floating over the ocean at sunset',
    n: 1,
  }),
});

const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model

Model

GPT Image 2 — text-to-image & image-to-image

Use the same /api/v1/images/generations endpoint with "model": "gpt-image-2". Pass an array of public image URLs in reference_images to switch from pure text-to-image to image-conditioned generation (edits, restyles, character consistency).

size accepts 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), 2048x2048, 4096x4096. 2K and 4K are square only.
Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.

# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "a chrome teapot floating over the ocean at sunset",
    "size": "1536x1024"
  }'

# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "same character, snowy mountain background, golden hour",
    "size": "1024x1024",
    "reference_images": [
      "https://example.com/source.jpg"
    ]
  }'
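The GPT Image 2 constraints listed above (allowed sizes, up to 4 references, public HTTPS URLs only) are easy to check before sending a request. A minimal sketch, with `validateGptImage2` and `GptImage2Request` as hypothetical names; the server remains the source of truth.

```typescript
const GPT_IMAGE_2_SIZES = ['1024x1024', '1536x1024', '1024x1536', '2048x2048', '4096x4096'] as const;

interface GptImage2Request {
  model: 'gpt-image-2';
  prompt: string;
  size?: string;
  reference_images?: string[];
}

// Returns a list of problems; an empty list means the request passes these checks.
function validateGptImage2(req: GptImage2Request): string[] {
  const problems: string[] = [];
  if (req.size !== undefined && !(GPT_IMAGE_2_SIZES as readonly string[]).includes(req.size)) {
    problems.push(`unsupported size: ${req.size}`);
  }
  const refs = req.reference_images ?? [];
  if (refs.length > 4) {
    problems.push('at most 4 reference images are allowed');
  }
  for (const url of refs) {
    if (!url.startsWith('https://')) {
      problems.push(`reference must be a public HTTPS URL: ${url}`);
    }
  }
  return problems;
}
```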

Model

NanoBanana 2 — image-to-image & multimodal inputs

Model id gemini-3-1-flash-t2i (NanoBanana 2). Pass references in image_urls to switch into image-to-image / multi-reference mode. Up to 4 reference images, blended in prompt order. Use the standard aspect_ratio field — landscape, portrait, and square are all supported at every resolution tier.

Supported aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9.
Supported resolution: 0.5K, 1K, 2K, 4K.
Reference images may be public HTTPS URLs or base64 data URLs.
Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.

# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-1-flash-t2i",
    "prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "image_urls": [
      "https://example.com/character.png",
      "https://example.com/jacket.png",
      "https://example.com/scene.png"
    ]
  }'

Hosting

Calling the API from a subdomain on shared hosting

No special setup is required. Our API accepts requests from any origin — there are no domain allowlists by default. Two things that catch shared-host users out, though:

Make API calls from your server, not the browser. Calling the API directly from client-side JavaScript would expose your ck_… key to every visitor. Always proxy through your own backend (PHP, Node, Python — whatever your subdomain runs).
Set a generous request timeout. Image and video calls can hold the connection open up to ~120 s (image) or ~300 s (video). Many shared hosts cap PHP/cURL at 30 s by default — raise max_execution_time, CURLOPT_TIMEOUT, and your reverse-proxy / FastCGI read timeout.
Lock keys to your subdomain (optional). In the dashboard you can scope an API key to a specific Origin or IP — recommended if your subdomain handles untrusted traffic.
Use HTTPS. Some shared-hosting subdomains default to HTTP — outbound HTTPS is required to reach the API.

<?php
// Minimal PHP server-side proxy (drop into /api/generate.php)
$body = file_get_contents('php://input');
$ch = curl_init('https://hypereal.cloud/api/v1/images/generations');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_TIMEOUT        => 180,    // raise above shared-host default
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => $body,
  CURLOPT_HTTPHEADER     => [
    'Authorization: Bearer ' . getenv('HYPEREAL_API_KEY'),
    'Content-Type: application/json',
  ],
]);
echo curl_exec($ch);

Image models

Model ID | Label | Price
gpt-image-2 | GPT Image 2 · OpenAI | $0.030 / image
gpt-4o-image | GPT-4o Image · OpenAI | $0.012 / image
nano_banana | Nano Banana · Nano Banana | $0.024 / image
nano_banana_2 | Nano Banana 2 · Nano Banana | $0.040 / image
gemini-3.1-flash-image-preview | Gemini 3.1 Flash Image · Google | $0.050 / image
gemini-2.5-flash-image-preview | Gemini 2.5 Flash Image · Google | $0.024 / image
flux-kontext-pro | Flux Kontext Pro · Flux | $0.040 / image
flux-2-pro | Flux 2 Pro · Flux | $0.050 / image
doubao-seedream-4-0 | Doubao Seedream 4.0 · ByteDance | $0.057 / image
doubao-seedream-4-5 | Doubao Seedream 4.5 · ByteDance | $0.071 / image
doubao-seedream-5-0 | Doubao Seedream 5.0 · ByteDance | $0.063 / image
gemini-3.1-flash-image-preview-official | Gemini 3.1 Flash Image (Official) · Google | $0.064 / image
flux-kontext-max | Flux Kontext Max · Flux | $0.080 / image
gemini-2.5-flash-image-official | Gemini 2.5 Flash Image (Official) · Google | $0.098 / image
nano_banana_pro | Nano Banana Pro · Nano Banana | $0.100 / image
gemini-3-pro-image-preview | Gemini 3 Pro Image · Google | $0.100 / image
flux-2-flex | Flux 2 Flex · Flux | $0.140 / image
gemini-3-pro-image-preview-official | Gemini 3 Pro Image (Official) · Google | $0.216 / image
gemini-3-pro-image-preview-4K | Gemini 3 Pro Image 4K · Google | $0.190 / image
gemini-3.1-fast-imagen | Gemini 3.1 Fast Imagen · Google | $0.020 / image
gemini-3.1-thinking-imagen | Gemini 3.1 Thinking Imagen · Google | $0.020 / image

08 · Long-running

Video generation

A synchronous long-poll endpoint: keep the connection open until the clip is ready. Set your HTTP client timeout to 600 seconds. Billing is per second (most models) or per clip (Veo, Vidu, Grok).

POST /api/v1/videos/generate

Request body

model

string

Required. Video model ID; see the table.

prompt

string

Required. Text prompt describing the clip.

duration

number

Optional. In seconds, 1–60 (default 5). Only meaningful for per_second models.

aspect_ratio

string

Optional. E.g. 16:9, 9:16, 1:1. Varies by provider. Gemini Omni Flash accepts 16:9 or 9:16.

resolution

string

Optional. Forwarded when the selected model supports resolution. Gemini Omni Flash currently accepts 720P.

image_urls

string[]

Optional. For Gemini Omni Flash, pass 1-3 uploaded or public image URLs as visual references. Upload local images first and send the returned URL; direct base64 image payloads are not supported.

image_url

string

Optional. First-frame keyframe for image-to-video models. Some models also accept last_image_url or image; refer to the upstream documentation for those models.

Note: this is a single long-running POST. There is no job-ID polling; the response body contains the rendered video URL once the upstream completes. Use a server-side runtime (Node, or edge with extended duration). Browsers and most CDNs time out before even a 5-second clip finishes rendering.
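Since video is billed either per second or per clip, a rough pre-flight estimate looks like the sketch below. `VideoBilling` and `estimateVideoUsd` are hypothetical names, and the sketch assumes the billed duration equals the requested duration, which the text does not state.

```typescript
type VideoBilling =
  | { kind: 'per_second'; usdPerSecond: number }
  | { kind: 'per_clip'; usdPerClip: number };

// Rough USD estimate for one clip. duration only matters for per-second models.
function estimateVideoUsd(billing: VideoBilling, durationSeconds = 5): number {
  if (billing.kind === 'per_clip') {
    return billing.usdPerClip;
  }
  // duration is documented as 1–60 seconds with a default of 5.
  const clamped = Math.min(60, Math.max(1, durationSeconds));
  return billing.usdPerSecond * clamped;
}

// kling-2-6 is listed at $0.074 / sec; veo3.1-quality at $1.20 / clip.
console.log(estimateVideoUsd({ kind: 'per_second', usdPerSecond: 0.074 }, 6).toFixed(3));
console.log(estimateVideoUsd({ kind: 'per_clip', usdPerClip: 1.2 }, 6).toFixed(2));
```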

curl: text + image-to-video (bash)

curl https://hypereal.cloud/api/v1/videos/generate \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini_omni_flash",
    "prompt": "a white cube rotating on a black background, clean product demo",
    "duration": 6,
    "aspect_ratio": "16:9",
    "resolution": "720P",
    "image_urls": [
      "https://example.com/product-reference.png"
    ]
  }'

Node: fetch (ts)

const res = await fetch('https://hypereal.cloud/api/v1/videos/generate', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini_omni_flash',
    prompt: 'a cat walking on the moon, cinematic, no text',
    duration: 6,
    aspect_ratio: '16:9',
    resolution: '720P',
    image_urls: ['https://example.com/cat-reference.png'],
  }),
});

const data = await res.json();
console.log(data); // the response body carries the rendered video URL; there is no job ID to poll

Video models

Model ID | Label | Price
happyhorse-1.0 | HappyHorse 1.0 · Alibaba | $0.110 / 720p / second · $0.190 / 1080p / second
gemini_omni_flash | Gemini Omni Flash · Google | $0.180 / clip
wan2.6-flash | WAN 2.6 Flash · Alibaba | $0.060 / sec
kling-2-6 | Kling 2.6 · Kuaishou | $0.074 / sec
MiniMax-Hailuo-02 | MiniMax Hailuo 02 · MiniMax | $0.080 / sec
doubao-seedance-1-0-pro-fast | Doubao Seedance Pro Fast · ByteDance | $0.083 / sec
MiniMax-Hailuo-2.3 | MiniMax Hailuo 2.3 · MiniMax | $0.098 / sec
wan2.6 | WAN 2.6 · Alibaba | $0.100 / sec
kling-video-o1 | Kling Video O1 · Kuaishou | $0.134 / sec
kling-v3-omni | Kling V3 Omni · Kuaishou | $0.134 / sec
kling-v3 | Kling V3 · Kuaishou | $0.134 / sec
kling-v3-video | Kling V3 Video · Kuaishou | $0.134 / sec
doubao-seedance-1-0-pro-quality | Doubao Seedance Pro Quality · ByteDance | $0.208 / sec
doubao-seedance-2-0 | Doubao Seedance 2.0 · ByteDance | $0.200 / sec
doubao-seedance-2-0-fast | Doubao Seedance 2.0 Fast · ByteDance | $0.105 / sec
doubao-seedance-1-5-pro | Doubao Seedance 1.5 Pro · ByteDance | $0.216 / sec
Veo3.1-fast-official | Veo 3.1 Fast · Google | $0.160 / sec
Veo3.1-quality-official | Veo 3.1 Quality · Google | $0.320 / sec
veo3.1-fast | Veo 3.1 Fast · Google | $0.160 / clip
veo3.1-quality | Veo 3.1 Quality · Google | $1.20 / clip
vidu-q3-pro | Vidu Q3 Pro · Vidu | $0.020 / clip
grok-video-3 | Grok Video 3 · xAI | $0.160 / clip

09 · Fish Audio

Audio: TTS, voice cloning, ASR

Three model IDs share one endpoint. The request body and the response shape depend on the model you call. The provider is Fish Audio (called directly, not via ToAPI), and billing is per request.

POST /api/v1/audio/generations

model

"audio-tts" | "audio-clone" | "audio-asr"

Required. Selects the operation to perform.

text

string

Optional. Required for audio-tts and audio-clone.

audio

string (URL)

Optional. Required for audio-asr (the input) and audio-clone (the reference voice, 10 seconds or longer).

voice_id, format, sample_rate, …

any

Optional. Extra Fish Audio parameters are passed through unchanged.

Response shape: data: [{ url }] for TTS and cloning; text (plus optional segments and duration) for ASR.
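Because the required fields change with the model ID, a small pre-flight check can catch a missing `text` or `audio` before the request is sent. This is a sketch; `AudioRequest` and `missingAudioFields` are hypothetical names that encode only the rules stated above.

```typescript
type AudioModel = 'audio-tts' | 'audio-clone' | 'audio-asr';

interface AudioRequest {
  model: AudioModel;
  text?: string;
  audio?: string; // URL of the input recording or reference voice
}

// Lists the fields a request still needs for its model.
function missingAudioFields(req: AudioRequest): string[] {
  const needsText = req.model === 'audio-tts' || req.model === 'audio-clone';
  const needsAudio = req.model === 'audio-asr' || req.model === 'audio-clone';
  const missing: string[] = [];
  if (needsText && !req.text) missing.push('text');
  if (needsAudio && !req.audio) missing.push('audio');
  return missing;
}

console.log(missingAudioFields({ model: 'audio-clone', text: 'This is my cloned voice.' }));
```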

TTS (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-tts",
    "text": "Welcome to Hypereal. One key, every model.",
    "voice_id": "en_male_calm"
  }'

Voice cloning (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-clone",
    "text": "This is my cloned voice.",
    "audio": "https://example.com/reference-30s.mp3"
  }'

ASR, speech to text (bash)

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-asr",
    "audio": "https://example.com/recording.mp3"
  }'

Audio models

Model ID | Label | Price
audio-tts | Text to Speech · Fish Audio | $0.020 / request
audio-clone | Voice Clone · Fish Audio | $0.020 / request
audio-asr | Speech Recognition · Fish Audio | $0.010 / request

10 · Google native format

Gemini

The same endpoint accepts both the Gemini native shape (`contents` / `generationConfig` / `systemInstruction`) and the OpenAI shape. Internally the endpoint converts requests to the OpenAI format before forwarding them. For most code, using /v1/chat/completions with a Gemini model ID is simpler.

POST /api/v1/gemini

model

string

Required. Any Gemini model ID; see the table.

contents

Content[]

Optional. Gemini-native message array.

systemInstruction

Content

Optional. System message in Gemini format.

generationConfig

object

Optional. temperature, maxOutputTokens, and so on.

messages

Message[]

Optional. OpenAI format; an alternative to contents.

Auth: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.
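For code that holds OpenAI-style messages but wants to send the Gemini native shape, the mapping can be sketched as follows. The role mapping (assistant to model, system messages lifted into systemInstruction) follows Google's native format; `toGeminiBody` and the interfaces are hypothetical names. Since the endpoint also accepts `messages` directly, this conversion is optional.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface GeminiContent {
  role?: 'user' | 'model';
  parts: Array<{ text: string }>;
}

interface GeminiBody {
  model: string;
  contents: GeminiContent[];
  systemInstruction?: GeminiContent;
}

// Converts OpenAI-style messages into a Gemini-native request body.
function toGeminiBody(model: string, messages: ChatMessage[]): GeminiBody {
  const systemText = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const contents = messages
    .filter((m) => m.role !== 'system')
    .map((m): GeminiContent => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));
  const body: GeminiBody = { model, contents };
  if (systemText.length > 0) {
    body.systemInstruction = { parts: [{ text: systemText }] };
  }
  return body;
}
```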

curl: Gemini native (bash)

curl "https://hypereal.cloud/api/v1/gemini" \
  -H "x-goog-api-key: ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-thinking",
    "contents": [
      {"role": "user", "parts": [{"text": "Outline a launch plan."}]}
    ],
    "generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
  }'

Node: fetch (ts)

// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
  method: 'POST',
  headers: {
    'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3.5-fast',
    contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
  }),
});

console.log(await res.json());

Gemini models

Model ID | Label | Input / output (per MTok)
gemini-3.5-thinking | Gemini 3.5 Thinking · Google | $0.900 / $5.40
gemini-3.5-fast | Gemini 3.5 Fast · Google | $0.900 / $5.40
gemini-3.1-pro-preview | Gemini 3.1 Pro Preview · Google | $0.390 / $1.74
gemini-3-pro-preview | Gemini 3 Pro Preview · Google | $0.390 / $1.74
gemini-3-flash-preview | Gemini 3 Flash Preview · Google | $0.050 / $0.290

Errors and rate limits

Every error is JSON of the form { error: { type, message } }. Rate limits are evaluated per user, not per key, so multiple keys share the same quota.

401 authentication_error

JSON

The key is missing, malformed (no ck_ prefix), expired, or disabled.

402 insufficient_credits

JSON

The balance is below 200 credits ($2), or the request's estimated cost exceeds the balance.

403 access_denied

JSON

Your cumulative top-up tier has not unlocked the model (image, video, and audio require $10 or more; some flagship LLMs require a higher tier).

429 rate_limit_error / spending_limit_error

JSON

You exceeded the per-user hourly limit (chat 1000/h, image 500/h, video and audio 200/h) or a per-key spending limit you configured. Rate-limit responses include the X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers.

400 invalid_request_error

JSON

model is missing, the model ID is unknown (the response includes available_models), or the endpoint does not match the format (e.g. an Anthropic model sent to /chat/completions).

502 api_error

JSON

Every upstream for the model failed. The message includes the last upstream's error string.
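One way a client might act on these codes is to retry only the errors that can clear on their own. The sketch below is a judgment call rather than documented guidance: `shouldRetry` is a hypothetical helper, and treating 502 and rate-limit 429s as retryable (but not spending-limit 429s) is an assumption drawn from the descriptions above.

```typescript
interface HyperealErrorBody {
  error: { type: string; message: string };
}

// Decides whether a failed request is worth retrying later.
function shouldRetry(status: number, body: HyperealErrorBody): boolean {
  if (status === 502) {
    return true; // every upstream failed; a later attempt may succeed
  }
  if (status === 429) {
    // An hourly rate limit resets over time; a spending cap does not clear by waiting.
    return body.error.type === 'rate_limit_error';
  }
  return false; // 400, 401, 402, and 403 need a change on the caller's side
}

console.log(shouldRetry(429, { error: { type: 'rate_limit_error', message: 'slow down' } }));
```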

DEVELOPER

ComfyUI as API

Deploy a ComfyUI container as a Hypereal-managed GPU endpoint. Same per-second billing, auto-scaling, webhook delivery as any other deployment — you control the workflow graph and the model weights.

Heads up — flow changed. The legacy /comfy workflow-JSON paster and /v1/comfy/* routes were retired. ComfyUI now ships as a regular Deployment — you bring a Docker image (e.g. runpod/worker-comfyui or your own), we mount it on real GPUs.

POST /v1/gpu/run/{slug}

Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.

Submit a job (bash)

curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a cinematic portrait of an astronaut",
      "seed": 42,
      "workflow_overrides": { "Sampler.steps": 30 }
    }
  }'

Submit response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "queued",
  "provider_job_id": "..."
}

GET /v1/gpu/jobs/{id}

Poll for status. We live-poll the worker on each request so you see queued → running → succeeded in near real time. On succeeded credits settle to the actual GPU-seconds; on failed we refund the hold. Pin a webhookUrl on the deployment to skip polling.

Status response (json)

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "succeeded",
  "output": { "images": ["data:image/png;base64,..."] },
  "executionMs": 18420,
  "creditsCharged": 56
}

See your deployments (bash)

# List
curl https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY"

# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "my-comfy-workflow",
    "name": "My Comfy",
    "dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
    "gpuTypes": "ADA_48_PRO,AMPERE_80"
  }'

Workflow setup

Open /infra/deployments/new: pick a GPU tier, point at your ComfyUI Docker image (custom builds with your weights and custom nodes pre-baked work fine), set min/max workers and idle timeout. Your endpoint goes live in 60s.

Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.

ENTERPRISE

Gateway features

Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.

Cost Dashboard

Spend, by model, in real time

Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:

GET /api/api-usage/export?days=30
Authorization: session cookie

→ hypereal-usage-2026-05-10.csv

Budget Alerts

Per-key monthly cap, with email guardrails

Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.

POST /api/api-keys
{
  "name": "prod-eu",
  "spendingLimit": 50000   // 500 USD / month
}
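The 80% and 100% thresholds can be mirrored client-side, for example to show a warning in an internal dashboard. A minimal sketch: `budgetState` is a hypothetical helper, and both arguments are assumed to be in credits, matching the spendingLimit example above.

```typescript
type BudgetState = 'ok' | 'warning' | 'capped';

// Classifies spend against a per-key limit using the documented 80% / 100% thresholds.
function budgetState(spentCredits: number, spendingLimit: number): BudgetState {
  if (spentCredits >= spendingLimit) {
    return 'capped';
  }
  // Integer comparison avoids floating-point surprises around exactly 80%.
  if (spentCredits * 100 >= spendingLimit * 80) {
    return 'warning';
  }
  return 'ok';
}

console.log(budgetState(40000, 50000)); // 80% of a 50000-credit limit
```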

Request Logs

Every call, searchable

Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:

GET /api/api-usage?days=30&limit=1000

{
  logs: [...],
  costByModel: [...],
  topExpensiveRequests: [...]
}

Multi-Provider Failover

Outages don't reach your users

Every supported model has a fallback chain. On 5xx, timeout, or 429 we transparently retry the next provider with exponential backoff. You always get a result or a single, clean error — never a flap.

primary:  seedance-2-0-turbo-t2v   (region us-east)
fallback: seedance-2-0-t2v         (region us-west)
fallback: seedance-2-0             (region eu-central)
retries:  1 per target, exp backoff

Smart Routing

Pick by intent, we pick the cheapest qualified model

Send intent instead of model and we'll route to the cheapest provider in that capability bucket — without giving up determinism: pin a model whenever you want and we'll honor it exactly.

POST /v1/images/generate
{
  "intent": "text-to-image-fast",   // ← we'll pick the cheapest qualified model
  "prompt": "a quiet sunrise over Mt Fuji"
}

# Or pin explicitly:
{ "model": "nano-banana-t2i", "prompt": "..." }

SERVERLESS

GPU models

Hosted serverless GPU inference at /v1/gpu/{slug}. One API key, credit billing, audit log, and webhooks. Same wallet and dashboard as your LLM calls.

1. Pick a model

Browse the live catalog at /gpu-recommend. Each model lists its slug, per-call or per-second credit cost, and the maximum execution time per call.

2. Sync invocation (small jobs)

Short-running models return the output inline.

curl -X POST https://api.hypereal.cloud/v1/gpu/sdxl \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "a tabby cat astronaut"}}'

→ { "id": "...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../out.png"],
    "costCredits": 50,
    "durationMs": 4210 }

3. Async invocation (long jobs)

Long-running models queue and return a job id immediately with a 202. Poll, or wait for our cron + webhook poller to settle the job.

# Submit
POST /v1/gpu/wan-video
{ "input": { "prompt": "drone over Tokyo, neon, rain", "seconds": 5 } }
→ 202 { "id": "abc...", "status": "queued", "pollUrl": "/v1/gpu/jobs/abc..." }

# Poll
GET /v1/gpu/jobs/abc...
→ { "id": "abc...",
    "status": "succeeded",
    "outputs": ["https://cdn.hypereal.cloud/gpu/.../clip.mp4"],
    "costCredits": 312,
    "durationMs": 156000 }

Failed and timed-out jobs auto-refund the credit reservation. Per-second billing reconciles on completion using the model's reported execution time, capped at the model's maxSeconds.
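The reconciliation rule can be sketched as a small function. This is an illustration under assumptions: `settleCredits` is a hypothetical name, the per-second rate is an input, and rounding up to a whole credit is a guess, since the text does not say how fractional credits are handled.

```typescript
type JobStatus = 'succeeded' | 'failed' | 'timed_out';

// Credits charged at settlement for a per-second model.
function settleCredits(
  status: JobStatus,
  durationMs: number,
  creditsPerSecond: number,
  maxSeconds: number,
): number {
  if (status !== 'succeeded') {
    return 0; // failed and timed-out jobs refund the reservation
  }
  // Bill the reported execution time, capped at the model's maxSeconds.
  const billedSeconds = Math.min(durationMs / 1000, maxSeconds);
  return Math.ceil(billedSeconds * creditsPerSecond); // rounding up is an assumption
}

console.log(settleCredits('succeeded', 156000, 2, 300));
```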

ENTERPRISE

Teams, RBAC & SSO

Organizations, five built-in roles, SAML and OIDC single sign-on. Built so security and procurement can sign off without a custom rider.

Organizations

Org-scoped keys, audit log, billing

Every API key, webhook, ComfyUI workflow, and GPU template can belong to an organization instead of an individual. Teammates share one budget, one audit trail, and one invoice. Personal keys keep working alongside.

POST /api/orgs
{
  "name": "Acme Inc"
}
→ { id, slug, role: "owner" }

Five built-in roles

Owner · Admin · Developer · Billing · Viewer

Owner — everything, including delete-org
Admin — manage members, keys, SSO, webhooks
Developer — create/delete API keys, manage workflows + GPUs
Billing — view + manage payments and audit log
Viewer — read-only access to keys, billing, audit

SAML 2.0

Configure your IdP in 3 steps

Create a SAML app in Okta / Azure AD / Auth0 / Google.
Set ACS URL to https://hypereal.cloud/api/auth/sso/<providerId>
Paste the IdP metadata XML into /settings/organization → SSO.

Set the email-domain claim (e.g. acme.com) and the login form will auto-route corporate emails to your IdP — no password prompt.

OIDC

Issuer + client credentials

Drop in your issuer URL, client id, and client secret. We fetch the /.well-known/openid-configuration on save and surface a green check when the IdP is reachable.

POST /api/orgs/{id}/sso
{
  "type": "oidc",
  "issuer": "https://idp.acme.com",
  "clientId": "...",
  "clientSecret": "...",
  "domain": "acme.com"
}

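Pricing and credits

One unit: 100 credits = $1.00 USD. LLMs are billed per token at each model's input / output rate. Media models are billed per image, per second, or per clip.

LLMs

Tokens × the per-MTok rate. Streaming requests are billed from the final usage chunk.

Images

Flat rate per generation × the number of images (n) actually returned.

Video and audio

Billed per second (most video models), per clip (Veo, Vidu, Grok), or per request (Fish Audio).

Claude, GPT, Gemini, and some image models (GPT Image 2, Nano Banana) are offered below the providers' direct prices. Video, audio, and other media models are billed at standard rates.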
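As a rough illustration of the arithmetic above, the sketch below converts between credits and USD and estimates an LLM call from token counts. The helper names (`creditsToUsd`, `usdToCredits`, `llmCostUsd`, `TokenPricing`) are hypothetical; only the 100 credits = $1.00 rate and the per-MTok pricing model come from this page.

```typescript
// From the text: 100 credits = $1.00 USD.
const CREDITS_PER_USD = 100;

function usdToCredits(usd: number): number {
  return usd * CREDITS_PER_USD;
}

function creditsToUsd(credits: number): number {
  return credits / CREDITS_PER_USD;
}

// USD per million tokens (MTok), as listed in the model tables.
interface TokenPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Estimated cost of one call: tokens × per-MTok rate, input and output summed.
function llmCostUsd(inputTokens: number, outputTokens: number, p: TokenPricing): number {
  return (inputTokens / 1e6) * p.inputPerMTok + (outputTokens / 1e6) * p.outputPerMTok;
}
```

For example, at the rates listed for claude-sonnet-4-6 ($0.680 / $3.40 per MTok), one million input tokens plus one million output tokens comes to about $4.08, or roughly 408 credits. Treat the result as an estimate; the billed amount is whatever the usage data reports.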


SDK

Hypereal SDK

Install hypereal-sdk for typed access to chat, responses, image generation, video generation, audio, jobs and storage from Node.js 18+.

Install

Published as hypereal-sdk on npm.

Resources

Use client.images.generate(), chat, responses, jobs and storage.

Landing page

See the full SDK overview at /sdk.

Install · bash

pnpm add hypereal-sdk

Quickstart · ts

import { Hypereal } from 'hypereal-sdk';

const client = new Hypereal({
  apiKey: process.env.HYPEREAL_API_KEY!,
});

const image = await client.images.generate({
  model: 'gemini-3-1-flash-t2i',
  prompt: 'A cinematic portrait in neon light',
  aspect_ratio: '16:9',
});

console.log(image);

Storage upload · ts

const object = await client.storage.uploadFile(file, {
  filename: 'training-image.png',
  contentType: 'image/png',
  kind: 'dataset',
});

const listed = await client.storage.list({ kind: 'dataset' });

Authentication

Every request needs a ck_-prefixed key. Three header formats are supported so that every SDK is covered.

Authorization

header

Required. Bearer ck_... Used by the OpenAI SDK, Codex CLI, and Cursor.

x-api-key

header

Required. ck_... Used by the Anthropic SDK and Claude Code on /v1/messages.

x-goog-api-key

header

Required. ck_... Used by the Google Gemini SDK and native format on /v1/gemini. ?key=ck_... also works.
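The three header formats can be captured in one small helper. This is a sketch, not part of any Hypereal SDK: `authHeaders` and `ClientStyle` are hypothetical names, and the prefix check simply mirrors the rule that keys start with ck_.

```typescript
// Which client convention the request follows (hypothetical type).
type ClientStyle = "openai" | "anthropic" | "gemini";

// One builder per header format described above.
const HEADER_BUILDERS: Record<ClientStyle, (key: string) => Record<string, string>> = {
  openai: (key) => ({ Authorization: `Bearer ${key}` }),
  anthropic: (key) => ({ "x-api-key": key }),
  gemini: (key) => ({ "x-goog-api-key": key }),
};

// Returns the auth header for the given client style, rejecting keys that
// lack the ck_ prefix before any request is sent.
function authHeaders(style: ClientStyle, apiKey: string): Record<string, string> {
  if (!apiKey.startsWith("ck_")) {
    throw new Error("API keys must start with ck_");
  }
  return HEADER_BUILDERS[style](apiKey);
}
```

Spread the result into the `headers` of a `fetch` call alongside `Content-Type: application/json`.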

03 · OpenAI-compatible

Chat Completions

The main endpoint. It uses the OpenAI Chat Completions wire format and serves GPT, Gemini, Qwen, DeepSeek, GLM, and every other non-Anthropic LLM.

POST /api/v1/chat/completions

Request body

model

string

Required. Any non-Anthropic model ID; see the table below. Anthropic models return 400; use /v1/messages instead.

messages

Message[]

Required. A standard OpenAI message array (role, content).

stream

boolean

Optional. Defaults to false. When true, the response is an SSE stream; usage is included in the final chunk.

max_tokens

number

Optional. Forwarded to the upstream. Provider-specific defaults apply.

temperature, top_p, tools, …

any

Optional. Other OpenAI parameters pass through unchanged.

Pricing

Billed per token at each model's input/output rate. 100 credits = $1.00. The minimum balance to call the endpoint is 200 credits ($2.00).

curl, streaming · bash

curl https://hypereal.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a terse assistant."},
      {"role": "user", "content": "Two-line haiku about caches."}
    ],
    "stream": true,
    "max_tokens": 256
  }'

Node, OpenAI SDK streaming · ts

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
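Because usage arrives only in the final chunk, a client that needs both the text and the token counts has to keep reading until the stream ends. The sketch below shows that accumulation over chunks that have already been received. `collectStream`, `ChatChunk`, and `ChatUsage` are hypothetical names, and the chunk shape is assumed to follow the OpenAI streaming format, where the usage-bearing chunk may have an empty `choices` array.

```typescript
// Assumed subset of an OpenAI-style streaming chunk.
interface ChatUsage {
  prompt_tokens: number;
  completion_tokens: number;
}

interface ChatChunk {
  choices: { delta?: { content?: string | null } }[];
  usage?: ChatUsage | null;
}

interface CollectedStream {
  text: string;
  usage: ChatUsage | null;
}

// Concatenates delta content and keeps the last usage object seen.
function collectStream(chunks: readonly ChatChunk[]): CollectedStream {
  let text = "";
  let usage: ChatUsage | null = null;
  for (const chunk of chunks) {
    // The usage chunk may carry no choices, so every access is optional.
    text += chunk.choices[0]?.delta?.content ?? "";
    if (chunk.usage) usage = chunk.usage;
  }
  return { text, usage };
}
```

In a live stream the same loop body would run inside `for await (const chunk of stream)`.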

OpenAI and provider-compatible models

Model ID · Label · Provider · Input / Output

gpt-5.5 · GPT-5.5 · OpenAI · $0.420 / $2.52 per MTok
gpt-5.5-instant · GPT-5.5 Instant · OpenAI · $0.420 / $2.52 per MTok
gpt-5.5-pro · GPT-5.5 Pro · OpenAI · $2.07 / $12.42 per MTok
gpt-5.4 · GPT-5.4 · OpenAI · $0.240 / $1.09 per MTok
gpt-5.4-mini · GPT-5.4 Mini · OpenAI · $0.040 / $0.250 per MTok
deepseek-v4-pro · DeepSeek V4 Pro · DeepSeek · $0.490 / $0.990 per MTok
deepseek-v4-flash · DeepSeek V4 Flash · DeepSeek · $0.160 / $0.330 per MTok
deepseek-v3.2 · DeepSeek V3.2 · DeepSeek · $0.230 / $0.920 per MTok
kimi-k2.6 · Kimi K2.6 · Moonshot · $1.07 / $4.44 per MTok
kimi-k2.5 · Kimi K2.5 · Moonshot · $0.460 / $2.42 per MTok
glm-5.1 · GLM-5.1 · Zhipu · $0.990 / $3.94 per MTok
glm-5 · GLM-5 · Zhipu · $0.460 / $2.07 per MTok
qwen3-max · Qwen 3 Max · Alibaba · $0.810 / $3.22 per MTok
qwen3.5-plus · Qwen 3.5 Plus · Alibaba · $0.460 / $2.76 per MTok
qwen3.5-flash · Qwen 3.5 Flash · Alibaba · $0.140 / $1.38 per MTok
MiniMax-M2.5 · MiniMax M2.5 · MiniMax · $0.240 / $0.970 per MTok

04 · Anthropic-compatible

Messages

POST /api/v1/messages

Request body

model

string

messages

Message[]

Required. Anthropic-format messages, including image and tool_use blocks.

max_tokens

number

Required. Mandatory in the Anthropic spec.

cache_control

{ type: "ephemeral" }

Optional. Add it to stable system, tools, or text content blocks for Anthropic prompt caching. Hypereal sets a default cache breakpoint when it is omitted and reports cache usage in response metadata.

hypereal.cache

"auto" | false

Optional. Hypereal Cache is on by default. Use "auto" to make the default explicit for repeated requests, or false to bypass it for a request.

thinking

{ type: "enabled" | "adaptive", budget_tokens?: number }

stream, system, tools, …

any

Optional. Passed through exactly as in the Anthropic SDK.

When a request is retried on a failover upstream, stale thinking blocks with invalid signatures are filtered out automatically, so you do not need to handle them yourself.

curl, extended thinking · bash

curl https://api.hypereal.cloud/v1/messages \
  -H "x-api-key: ck_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "system": [{
      "type": "text",
      "text": "You are a senior TypeScript refactoring assistant.",
      "cache_control": {"type": "ephemeral"}
    }],
    "messages": [
      {"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
    ],
    "hypereal": {"cache": "auto"}
  }'

Node, Anthropic SDK · ts

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.HYPEREAL_API_KEY, // ck_...
  baseURL: 'https://api.hypereal.cloud',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: [{
    type: 'text',
    text: 'You are a senior TypeScript refactoring assistant.',
    cache_control: { type: 'ephemeral' },
  }],
  // @ts-ignore: hypereal is a Hypereal-specific field that the SDK types do not declare
  hypereal: { cache: 'auto' },
  messages: [{ role: 'user', content: 'Hello, Claude.' }],
});

console.log(msg.content);

Anthropic models

Model ID · Label · Provider · Input / Output

claude-opus-4-7 · Claude Opus 4.7 · Anthropic · $3.40 / $16.96 per MTok
claude-opus-4-6 · Claude Opus 4.6 · Anthropic · $3.40 / $16.96 per MTok
claude-sonnet-4-6 · Claude Sonnet 4.6 · Anthropic · $0.680 / $3.40 per MTok
claude-haiku-4-5 · Claude Haiku 4.5 · Anthropic · $0.130 / $0.650 per MTok
managed-claude-opus-4-7-max · Claude Opus 4.7 · Hypereal Managed · $5.25 / $26.25 per MTok
managed-claude-opus-4-6-max · Claude Opus 4.6 (1M) · Hypereal Managed · $5.25 / $26.25 per MTok
managed-claude-opus-4-5-max · Claude Opus 4.5 · Hypereal Managed · $5.25 / $26.25 per MTok
managed-claude-sonnet-4-6-max · Claude Sonnet 4.6 · Hypereal Managed · $3.15 / $15.75 per MTok
managed-claude-sonnet-4-5-max · Claude Sonnet 4.5 · Hypereal Managed · $3.15 / $15.75 per MTok
managed-claude-haiku-4-5-max · Claude Haiku 4.5 · Hypereal Managed · $1.05 / $5.25 per MTok

05 · OpenAI Responses API

Responses

POST /api/v1/responses

Notes

Anthropic models return 400; they belong on /v1/messages.
Both streaming and non-streaming requests are billed from response.usage.input_tokens / output_tokens.
Some upstreams always emit SSE. The endpoint detects this and streams transparently even when stream:false is set.
Multi-upstream failover is supported. Set a long client timeout (300 seconds or more).

curl · bash

curl https://hypereal.cloud/api/v1/responses \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a TypeScript function that debounces a callback.",
    "stream": true
  }'

Node, OpenAI SDK responses.create · ts

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.HYPEREAL_API_KEY,
  baseURL: 'https://hypereal.cloud/api/v1',
});

const response = await client.responses.create({
  model: 'gpt-5.3-codex',
  input: 'Refactor this file into smaller modules.',
});

console.log(response.output_text);

Codex-optimized models

Model ID · Label · Provider · Input / Output

gpt-5.3-codex · GPT-5.3 Codex · OpenAI · $0.090 / $0.680 per MTok
gpt-5.3-codex-spark · GPT-5.3 Codex Spark · OpenAI · $0.090 / $0.680 per MTok

06 · Codex CLI / Codex Desktop

Codex CLI

POST /api/v1/responses

~/.codex/config.toml · toml

# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5.3-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.cloud/api/v1"
wire_api = "responses"
env_key = "HYPEREAL_API_KEY"

Then export your key:
export HYPEREAL_API_KEY=ck_...

The same setup also works with OpenCode, Claude Code (use /v1/messages), Cursor (use /v1/chat/completions), and Gemini CLI (use /v1/gemini).

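Image generation

POST /api/v1/images/generations

Request body

model

string

Required. An image model ID; see the table.

prompt

string

Required. The text prompt. For models that can edit, include reference images through the model's native parameters (e.g. image, reference_images).

n

number

Optional. Number of images, 1–10 (default 1).

size

string

Optional. Passed through as-is (e.g. 1024x1024, 1536x1024). Varies by provider.

quality, style, …

any

Optional. Extra parameters are passed through to the upstream unchanged.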

Use an image model ID here, not a chat model ID. Valid examples include gpt-image-2, nano_banana_pro, and gemini-3-1-flash-t2i. Use gpt-5.5 only with chat, messages, or responses endpoints.

curl · bash

curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana_pro",
    "prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
    "n": 1,
    "size": "1024x1024"
  }'

Node, fetch · ts

const res = await fetch('https://hypereal.cloud/api/v1/images/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'nano_banana_pro',
    prompt: 'a chrome teapot floating over the ocean at sunset',
    n: 1,
  }),
});

const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model

Model

GPT Image 2 — text-to-image & image-to-image

size accepts 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), 2048x2048, 4096x4096. 2K and 4K are square only.
Reference images must be public HTTPS URLs (base64 is not accepted by this model). Up to 4 references per request.
Pricing is per-tier: 1K, 2K, and 4K each have their own credit cost — see the model table below.
Synchronous response: the call returns the final image URL (no polling needed). Allow up to ~120 s.

# Text-to-image (1K landscape)
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "a chrome teapot floating over the ocean at sunset",
    "size": "1536x1024"
  }'

# Image-to-image / edit
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2",
    "prompt": "same character, snowy mountain background, golden hour",
    "size": "1024x1024",
    "reference_images": [
      "https://example.com/source.jpg"
    ]
  }'
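The GPT Image 2 constraints listed above (allowed sizes, HTTPS-only references, at most 4 references) are easy to check before a request is sent. The following is a client-side sketch only: `validateGptImage2` and `GptImage2Request` are hypothetical names, and the server remains the authority on what it accepts.

```typescript
// Sizes listed for gpt-image-2 in the notes above.
const GPT_IMAGE_2_SIZES: readonly string[] = [
  "1024x1024",
  "1536x1024",
  "1024x1536",
  "2048x2048",
  "4096x4096",
];

const MAX_REFERENCE_IMAGES = 4;

// Hypothetical request shape, limited to the fields used in the examples.
interface GptImage2Request {
  prompt: string;
  size?: string;
  reference_images?: string[];
}

// Returns a list of problems; an empty list means the request looks valid.
function validateGptImage2(req: GptImage2Request): string[] {
  const problems: string[] = [];
  if (req.size !== undefined && GPT_IMAGE_2_SIZES.indexOf(req.size) === -1) {
    problems.push(`unsupported size: ${req.size}`);
  }
  const refs = req.reference_images ?? [];
  if (refs.length > MAX_REFERENCE_IMAGES) {
    problems.push(`too many reference images: ${refs.length} (max ${MAX_REFERENCE_IMAGES})`);
  }
  for (const ref of refs) {
    // base64 / data URLs are not accepted by this model, only HTTPS URLs.
    if (!ref.startsWith("https://")) {
      problems.push("reference images must be public HTTPS URLs");
      break;
    }
  }
  return problems;
}
```

Running the check first avoids spending a round trip of up to ~120 s on a request that would be rejected.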

Model

NanoBanana 2 — image-to-image & multimodal inputs

Supported aspect_ratio: 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, 21:9.
Supported resolution: 0.5K, 1K, 2K, 4K.
Reference images may be public HTTPS URLs or base64 data URLs.
Multi-reference works with a text prompt — combine, e.g., a character + outfit + scene reference and describe the final composition in the prompt.

# Multimodal: text + multiple reference images
curl https://hypereal.cloud/api/v1/images/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-1-flash-t2i",
    "prompt": "Place the character (img 1) wearing the jacket (img 2) into the scene from img 3, cinematic light",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "image_urls": [
      "https://example.com/character.png",
      "https://example.com/jacket.png",
      "https://example.com/scene.png"
    ]
  }'

Hosting

Calling the API from a subdomain on shared hosting

No special setup is required. Our API accepts requests from any origin, and there are no domain allowlists by default. A few things catch shared-host users out, though:

Make API calls from your server, not the browser. Calling the API directly from client-side JavaScript would expose your ck_… key to every visitor. Always proxy through your own backend (PHP, Node, Python — whatever your subdomain runs).
Set a generous request timeout. Image and video calls can hold the connection open up to ~120 s (image) or ~300 s (video). Many shared hosts cap PHP/cURL at 30 s by default — raise max_execution_time, CURLOPT_TIMEOUT, and your reverse-proxy / FastCGI read timeout.
Lock keys to your subdomain (optional). In the dashboard you can scope an API key to a specific Origin or IP — recommended if your subdomain handles untrusted traffic.
Use HTTPS. Some shared-hosting subdomains default to HTTP — outbound HTTPS is required to reach the API.

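<?php
// Minimal PHP server-side proxy (drop into /api/generate.php)
$body = file_get_contents('php://input');
$ch = curl_init('https://hypereal.cloud/api/v1/images/generations');
curl_setopt_array($ch, [
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_TIMEOUT        => 180,    // raise above shared-host default
  CURLOPT_POST           => true,
  CURLOPT_POSTFIELDS     => $body,
  CURLOPT_HTTPHEADER     => [
    'Authorization: Bearer ' . getenv('HYPEREAL_API_KEY'),
    'Content-Type: application/json',
  ],
]);
$response = curl_exec($ch);
if ($response === false) {
  // The upstream was unreachable or timed out: return a JSON error, not an empty body
  http_response_code(502);
  $response = json_encode(['error' => ['type' => 'api_error', 'message' => curl_error($ch)]]);
}
header('Content-Type: application/json');
echo $response;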

Image models

Model ID · Label · Provider · Price

gpt-image-2 · GPT Image 2 · OpenAI · $0.030 / image
gpt-4o-image · GPT-4o Image · OpenAI · $0.012 / image
nano_banana · Nano Banana · Nano Banana · $0.024 / image
nano_banana_2 · Nano Banana 2 · Nano Banana · $0.040 / image
gemini-3.1-flash-image-preview · Gemini 3.1 Flash Image · Google · $0.050 / image
gemini-2.5-flash-image-preview · Gemini 2.5 Flash Image · Google · $0.024 / image
flux-kontext-pro · Flux Kontext Pro · Flux · $0.040 / image
flux-2-pro · Flux 2 Pro · Flux · $0.050 / image
doubao-seedream-4-0 · Doubao Seedream 4.0 · ByteDance · $0.057 / image
doubao-seedream-4-5 · Doubao Seedream 4.5 · ByteDance · $0.071 / image
doubao-seedream-5-0 · Doubao Seedream 5.0 · ByteDance · $0.063 / image
gemini-3.1-flash-image-preview-official · Gemini 3.1 Flash Image (Official) · Google · $0.064 / image
flux-kontext-max · Flux Kontext Max · Flux · $0.080 / image
gemini-2.5-flash-image-official · Gemini 2.5 Flash Image (Official) · Google · $0.098 / image
nano_banana_pro · Nano Banana Pro · Nano Banana · $0.100 / image
gemini-3-pro-image-preview · Gemini 3 Pro Image · Google · $0.100 / image
flux-2-flex · Flux 2 Flex · Flux · $0.140 / image
gemini-3-pro-image-preview-official · Gemini 3 Pro Image (Official) · Google · $0.216 / image
gemini-3-pro-image-preview-4K · Gemini 3 Pro Image 4K · Google · $0.190 / image
gemini-3.1-fast-imagen · Gemini 3.1 Fast Imagen · Google · $0.020 / image
gemini-3.1-thinking-imagen · Gemini 3.1 Thinking Imagen · Google · $0.020 / image

08 · Long-running

Video generation

POST /api/v1/videos/generate

Request body

model

string

Required. A video model ID; see the table.

prompt

string

Required. A text prompt describing the clip.

duration

number

Optional. In seconds, 1–60 (default 5). Only meaningful for per_second models.

aspect_ratio

string

Optional. E.g. 16:9, 9:16, 1:1. Varies by provider. Gemini Omni Flash accepts 16:9 or 9:16.

resolution

string

Optional. Forwarded when the selected model supports resolution. Gemini Omni Flash currently accepts 720P.

image_urls

string[]

Optional. For Gemini Omni Flash, pass 1-3 uploaded or public image URLs as visual references. Upload local images first and send the returned URL; direct base64 image payloads are not supported.

image_url

string

curl, text+image-to-video · bash

curl https://hypereal.cloud/api/v1/videos/generate \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini_omni_flash",
    "prompt": "a white cube rotating on a black background, clean product demo",
    "duration": 6,
    "aspect_ratio": "16:9",
    "resolution": "720P",
    "image_urls": [
      "https://example.com/product-reference.png"
    ]
  }'

Node, fetch · ts

const res = await fetch('https://hypereal.cloud/api/v1/videos/generate', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.HYPEREAL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini_omni_flash',
    prompt: 'a cat walking on the moon, cinematic, no text',
    duration: 6,
    aspect_ratio: '16:9',
    resolution: '720P',
    image_urls: ['https://example.com/cat-reference.png'],
  }),
});

const data = await res.json();
console.log(data.jobId, data.pollUrl); // poll /v1/jobs/{id} for the mp4

Video models

Model ID · Label · Provider · Price

happyhorse-1.0 · HappyHorse 1.0 · Alibaba · $0.110 / second (720p), $0.190 / second (1080p)
gemini_omni_flash · Gemini Omni Flash · Google · $0.180 / clip
wan2.6-flash · WAN 2.6 Flash · Alibaba · $0.060 / sec
kling-2-6 · Kling 2.6 · Kuaishou · $0.074 / sec
MiniMax-Hailuo-02 · MiniMax Hailuo 02 · MiniMax · $0.080 / sec
doubao-seedance-1-0-pro-fast · Doubao Seedance Pro Fast · ByteDance · $0.083 / sec
MiniMax-Hailuo-2.3 · MiniMax Hailuo 2.3 · MiniMax · $0.098 / sec
wan2.6 · WAN 2.6 · Alibaba · $0.100 / sec
kling-video-o1 · Kling Video O1 · Kuaishou · $0.134 / sec
kling-v3-omni · Kling V3 Omni · Kuaishou · $0.134 / sec
kling-v3 · Kling V3 · Kuaishou · $0.134 / sec
kling-v3-video · Kling V3 Video · Kuaishou · $0.134 / sec
doubao-seedance-1-0-pro-quality · Doubao Seedance Pro Quality · ByteDance · $0.208 / sec
doubao-seedance-2-0 · Doubao Seedance 2.0 · ByteDance · $0.200 / sec
doubao-seedance-2-0-fast · Doubao Seedance 2.0 Fast · ByteDance · $0.105 / sec
doubao-seedance-1-5-pro · Doubao Seedance 1.5 Pro · ByteDance · $0.216 / sec
Veo3.1-fast-official · Veo 3.1 Fast · Google · $0.160 / sec
Veo3.1-quality-official · Veo 3.1 Quality · Google · $0.320 / sec
veo3.1-fast · Veo 3.1 Fast · Google · $0.160 / clip
veo3.1-quality · Veo 3.1 Quality · Google · $1.20 / clip
vidu-q3-pro · Vidu Q3 Pro · Vidu · $0.020 / clip
grok-video-3 · Grok Video 3 · xAI · $0.160 / clip
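The table mixes two billing units, so the cost of a clip depends on whether the model is priced per second or per clip. A minimal sketch of that distinction follows; `VideoPricing` and `videoCostUsd` are hypothetical names, and the prices used in the note are the ones listed above.

```typescript
// A video model is priced either per second of output or per clip.
type VideoPricing =
  | { unit: "second"; usd: number }
  | { unit: "clip"; usd: number };

// For per-second models the duration matters; for per-clip models it does not.
function videoCostUsd(pricing: VideoPricing, durationSeconds: number): number {
  return pricing.unit === "second" ? pricing.usd * durationSeconds : pricing.usd;
}
```

A 5-second clip on wan2.6-flash ($0.060 / sec) comes to about $0.30, while gemini_omni_flash costs $0.180 per clip regardless of the requested duration.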

09 · Fish Audio

Audio: TTS, voice cloning, ASR

POST /api/v1/audio/generations

model

"audio-tts" | "audio-clone" | "audio-asr"

Required. Selects the task to run.

text

string

Optional. Required for audio-tts and audio-clone.

audio

string (URL)

Optional. Required for audio-asr (the input) and audio-clone (the reference voice, 10 seconds or longer).

voice_id, format, sample_rate, …

any

Optional. Extra Fish Audio parameters are passed through unchanged.

Response shape: data: [{ url }] for TTS and cloning; text (plus optional segments, duration) for ASR.
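Since one endpoint returns two different shapes, client code has to branch on the response. The sketch below uses a type guard for that. The interface and function names are hypothetical, and the element type of `segments` is left as `unknown` because this page does not describe it.

```typescript
// TTS and voice cloning: a list of generated audio URLs.
interface SpeechResult {
  data: { url: string }[];
}

// ASR: the transcript, with optional extras.
interface TranscriptResult {
  text: string;
  segments?: unknown[];
  duration?: number;
}

type AudioResult = SpeechResult | TranscriptResult;

// True when the response carries a transcript rather than audio URLs.
function isTranscript(result: AudioResult): result is TranscriptResult {
  return typeof (result as TranscriptResult).text === "string";
}

// Reduces either shape to a printable string.
function summarizeAudioResult(result: AudioResult): string {
  return isTranscript(result)
    ? result.text
    : result.data.map((item) => item.url).join("\n");
}
```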

TTS · bash

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-tts",
    "text": "Welcome to Hypereal. One key, every model.",
    "voice_id": "en_male_calm"
  }'

Voice cloning · bash

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-clone",
    "text": "This is my cloned voice.",
    "audio": "https://example.com/reference-30s.mp3"
  }'

ASR (speech → text) · bash

curl https://hypereal.cloud/api/v1/audio/generations \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "audio-asr",
    "audio": "https://example.com/recording.mp3"
  }'

Audio models

Model ID · Label · Provider · Price

audio-tts · Text to Speech · Fish Audio · $0.020 / request
audio-clone · Voice Clone · Fish Audio · $0.020 / request
audio-asr · Speech Recognition · Fish Audio · $0.010 / request

10 · Google native format

Gemini

POST /api/v1/gemini

model

string

Required. Any Gemini model ID; see the table.

contents

Content[]

Optional. The Gemini-native message array.

systemInstruction

Content

Optional. A system message in Gemini format.

generationConfig

object

Optional. temperature, maxOutputTokens, and so on.

messages

Message[]

Optional. OpenAI format; an alternative you can use in place of contents.

Auth headers: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.

curl, Gemini native · bash

curl "https://hypereal.cloud/api/v1/gemini" \
  -H "x-goog-api-key: ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-thinking",
    "contents": [
      {"role": "user", "parts": [{"text": "Outline a launch plan."}]}
    ],
    "generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
  }'

Node, fetch · ts

// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.cloud/api/v1/gemini', {
  method: 'POST',
  headers: {
    'x-goog-api-key': process.env.HYPEREAL_API_KEY!,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gemini-3.5-fast',
    contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
  }),
});

console.log(await res.json());

Gemini models

Model ID · Label · Provider · Input / Output

gemini-3.5-thinking · Gemini 3.5 Thinking · Google · $0.900 / $5.40 per MTok
gemini-3.5-fast · Gemini 3.5 Fast · Google · $0.900 / $5.40 per MTok
gemini-3.1-pro-preview · Gemini 3.1 Pro Preview · Google · $0.390 / $1.74 per MTok
gemini-3-pro-preview · Gemini 3 Pro Preview · Google · $0.390 / $1.74 per MTok
gemini-3-flash-preview · Gemini 3 Flash Preview · Google · $0.050 / $0.290 per MTok

Errors and rate limits

Every error is JSON in the form '{ error: { type, message } }'. Rate limits are evaluated per user, not per key, so multiple keys share one quota.

401 authentication_error

JSON

The key is missing, malformed (no ck_ prefix), expired, or disabled.

402 insufficient_credits

JSON

The balance is below 200 credits ($2), or the request's estimated cost exceeds the balance.

403 access_denied

JSON

Your cumulative top-up tier has not unlocked the model (image/video/audio require $10 or more; some flagship LLMs require a higher tier).

429 rate_limit_error / spending_limit_error

JSON

400 invalid_request_error

JSON

502 api_error

JSON

Every upstream for the model failed. The message includes the last upstream's error string.
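A client can use the shared error envelope to decide whether a retry is worthwhile. The sketch below parses the envelope and applies one possible policy. `parseError` and `isRetryable` are hypothetical names, and the policy itself is an assumption rather than documented behavior: it retries 502 and rate-limit 429s, and treats spending-limit 429s and all other listed codes as final.

```typescript
interface ApiError {
  type: string;
  message: string;
}

// Extracts { type, message } from an error body, or null if the body
// does not match the documented envelope.
function parseError(body: unknown): ApiError | null {
  if (typeof body !== "object" || body === null) return null;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return null;
  const { type, message } = err as { type?: unknown; message?: unknown };
  return typeof type === "string" && typeof message === "string"
    ? { type, message }
    : null;
}

// Assumed retry policy: transient failures only.
function isRetryable(status: number, type?: string): boolean {
  if (status === 502) return true; // every upstream failed; a later attempt may succeed
  if (status === 429) return type !== "spending_limit_error"; // a spending cap will not clear by waiting
  return false; // 400, 401, 402, 403 need a change on the caller's side
}
```

Pair it with a backoff delay between attempts, since rate limits are shared across all keys of a user.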

DEVELOPER

ComfyUI as API

POST /v1/gpu/run/{slug}

Submits a job to your ComfyUI deployment. Async by default; pass "sync": true to wait inline up to 240s.

Submit a job · bash

curl -X POST https://hypereal.cloud/v1/gpu/run/my-comfy-workflow \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "a cinematic portrait of an astronaut",
      "seed": 42,
      "workflow_overrides": { "Sampler.steps": 30 }
    }
  }'

Submit response · json

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "queued",
  "provider_job_id": "..."
}

GET /v1/gpu/jobs/{id}

Status response · json

{
  "job_id": "K3uA7Pq9xLm4",
  "status": "succeeded",
  "output": { "images": ["data:image/png;base64,..."] },
  "executionMs": 18420,
  "creditsCharged": 56
}
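For async jobs, the submit and status calls above are usually wrapped in a polling loop. A minimal sketch follows. `waitForJob` and its options are hypothetical, the status lookup is injected so that the loop stays independent of any HTTP client, and treating `queued` and `running` as the only non-terminal states is an assumption (this page shows only `queued` and `succeeded`).

```typescript
// Subset of the status response shown above.
interface JobStatus {
  job_id: string;
  status: string;
  output?: unknown;
}

// Polls until the job leaves the queued/running states or the timeout passes.
async function waitForJob(
  getStatus: () => Promise<JobStatus>,
  opts: { intervalMs: number; timeoutMs: number },
): Promise<JobStatus> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    const job = await getStatus();
    // Anything other than queued/running is treated as terminal (assumption).
    if (job.status !== "queued" && job.status !== "running") return job;
    if (Date.now() + opts.intervalMs > deadline) {
      throw new Error(`job ${job.job_id} still ${job.status} after ${opts.timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

In practice `getStatus` would wrap a `fetch` of GET /v1/gpu/jobs/{id} with the same Authorization header used to submit the job.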

See your deployments · bash

# List
curl https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY"

# Create (point at any ComfyUI worker image)
curl -X POST https://hypereal.cloud/v1/deployments \
  -H "Authorization: Bearer $HYPEREAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "my-comfy-workflow",
    "name": "My Comfy",
    "dockerImage": "runpod/worker-comfyui:dev-cuda12.1.1",
    "gpuTypes": "ADA_48_PRO,AMPERE_80"
  }'

Workflow setup

Full Infrastructure docs: /docs/infra — handler spec, pricing, webhook protocol, R2 storage for weights.

ENTERPRISE

Gateway features

Cost visibility, budget guardrails, request logs, multi-provider failover, and smart routing — all built into the same API key. No extra setup, no separate dashboard tier.

Cost Dashboard

Spend, by model, in real time

Per-model pie, daily cost trend, top-10 most expensive requests. Available on every account at /usage. Export the underlying logs to CSV at any time:

GET /api/api-usage/export?days=30
Authorization: session cookie

→ hypereal-usage-2026-05-10.csv

Budget Alerts

Per-key monthly cap, with email guardrails

Set spendingLimit on any API key. We email at 80% (heads up) and 100% (hard cap). Optional: auto-disable the key on overshoot so a runaway loop never costs you a four-figure invoice.

POST /api/api-keys
{
  "name": "prod-eu",
  "spendingLimit": 50000   // 500 USD / month
}
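The 80% and 100% thresholds can be expressed as a small state function, for example to mirror the alerts in your own monitoring. This is a sketch: `budgetState` and the state names are hypothetical, and it assumes both values are in credits, as `spendingLimit` is in the example above.

```typescript
type BudgetState = "ok" | "warning" | "capped";

// Maps month-to-date spend against a key's spendingLimit (both in credits).
function budgetState(spentCredits: number, spendingLimit: number): BudgetState {
  if (spentCredits >= spendingLimit) return "capped"; // 100%: hard cap
  // 80% check written with integers to avoid floating-point edge cases.
  if (spentCredits * 5 >= spendingLimit * 4) return "warning";
  return "ok";
}
```

With the limit of 50000 credits from the example, the warning state starts at 40000 credits ($400) and the cap applies at 50000 credits ($500).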

Request Logs

Every call, searchable

Every API call is indexed by endpoint, model, status code, latency, and cost. Filter and search at /usage, or pull the JSON directly:

GET /api/api-usage?days=30&limit=1000

{
  logs: [...],
  costByModel: [...],
  topExpensiveRequests: [...]
}

Multi-Provider Failover

Outages don't reach your users

primary:  seedance-2-0-turbo-t2v   (region us-east)
fallback: seedance-2-0-t2v         (region us-west)
fallback: seedance-2-0             (region eu-central)
retries:  1 per target, exp backoff
