01 · Start in 90 seconds
Quickstart
Mint a key, point your client at hypereal.build, ship. Auth and request shapes are OpenAI-compatible — most SDKs work by changing only the base URL.
Top up at least $2 (200 credits) and create a key at /manage-api-keys. Keys start with ck_.
Base URL: https://hypereal.build/api/v1
The auth header is Authorization: Bearer ck_.... Request bodies are the same OpenAI shapes you already know.
curl https://hypereal.build/api/v1/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [{"role": "user", "content": "Say hi in one word."}]
}'

import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.NEOCLOUD_API_KEY, // ck_...
baseURL: 'https://hypereal.build/api/v1',
});
const completion = await client.chat.completions.create({
model: 'gpt-5.5',
messages: [{ role: 'user', content: 'Say hi in one word.' }],
});
console.log(completion.choices[0].message.content);

02
Authentication
Every request needs a ck_-prefixed key. Three accepted header forms cover all SDKs.
Authorization: Bearer ck_... — used by the OpenAI SDK, Codex CLI and Cursor.
x-api-key: ck_... — used by the Anthropic SDK and Claude Code on /v1/messages.
x-goog-api-key: ck_... — Google Gemini SDK / native shape, accepted on /v1/gemini. ?key=ck_... also works.
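All three forms carry the same ck_ key. A minimal sketch with plain fetch, reusing the endpoints and header names documented above (model IDs come from the lists later in this page):

// Three equivalent ways to authenticate, per the list above.
const key = process.env.NEOCLOUD_API_KEY!; // ck_...

// OpenAI-style bearer token (chat/completions, responses):
await fetch('https://hypereal.build/api/v1/chat/completions', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'gpt-5.5', messages: [{ role: 'user', content: 'hi' }] }),
});

// Anthropic-style x-api-key (messages):
await fetch('https://hypereal.build/api/v1/messages', {
  method: 'POST',
  headers: { 'x-api-key': key, 'anthropic-version': '2023-06-01', 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'claude-haiku-4-5', max_tokens: 64, messages: [{ role: 'user', content: 'hi' }] }),
});

// Google-style x-goog-api-key (gemini):
await fetch('https://hypereal.build/api/v1/gemini', {
  method: 'POST',
  headers: { 'x-goog-api-key': key, 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'gemini-3.1-fast', contents: [{ role: 'user', parts: [{ text: 'hi' }] }] }),
});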
03 · OpenAI-compatible
Chat Completions
The workhorse endpoint. OpenAI Chat Completions wire format. Use it for GPT, Gemini, Qwen, DeepSeek, GLM and every non-Anthropic LLM.
/api/v1/chat/completions

Request body
model — any non-Anthropic model ID; for Anthropic models use /v1/messages.
messages — OpenAI-style message objects (role, content).
stream — defaults to false. SSE stream when true; usage is included in the final chunk.
Pricing

Billed per token at each model's input/output rate. 100 credits = $1.00. Minimum balance to call the endpoint is 200 credits ($2.00).
curl https://hypereal.build/api/v1/chat/completions \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.5",
"messages": [
{"role": "system", "content": "You are a terse assistant."},
{"role": "user", "content": "Two-line haiku about caches."}
],
"stream": true,
"max_tokens": 256
}'

import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.NEOCLOUD_API_KEY,
baseURL: 'https://hypereal.build/api/v1',
});
const stream = await client.chat.completions.create({
model: 'gpt-5.5',
stream: true,
messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
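The streaming loop above discards the usage block; per the request-body notes, the final SSE chunk carries it. A minimal sketch that captures it (field names follow the standard OpenAI usage object; treat the exact shape as an assumption here):

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.NEOCLOUD_API_KEY,
  baseURL: 'https://hypereal.build/api/v1',
});

// Stream a completion and capture the usage block from the final chunk.
const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  stream: true,
  messages: [{ role: 'user', content: 'Stream me a haiku.' }],
});

let usage: { prompt_tokens?: number; completion_tokens?: number } | null = null;
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
  if (chunk.usage) usage = chunk.usage; // arrives on the final chunk, per the notes above
}
console.log('\nbilled:', usage?.prompt_tokens, 'in /', usage?.completion_tokens, 'out');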
OpenAI and provider-compatible models

gpt-5, gpt-5.1, gpt-5.2, gpt-5.3, gpt-5.4, gpt-5.5, gpt-5.5-instant, gpt-5.5-pro, gpt-5.4-mini, gpt-5.4-nano, gpt-5.4-official, gpt-5.4-pro-official, gpt-5.2-official, gpt-5-pro-official, gpt-realtime-1.5-official, gpt-audio-1.5-official, glm-5, qwen3.5-plus, qwen3.5-flash, qwen3-max, deepseek-v3.2, kimi-k2.5, MiniMax-M2.5, nano-banana-2

04 · Anthropic-compatible
Messages
Anthropic /v1/messages wire format with extended thinking, multi-upstream failover and 15-second SSE keepalives. Use it for Claude Code, OpenCode, OpenClaw and the official Anthropic SDK.
/api/v1/messages

Request body
model — claude-opus-4-6, claude-sonnet-4-6, or claude-haiku-4-5. Older Anthropic IDs (claude-sonnet-4-5-20250929, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022) auto-alias to the latest equivalents.
thinking — budget_tokens caps the reasoning trace. The endpoint sends 15s SSE pings to keep long thinking streams from being closed.

curl https://hypereal.build/api/v1/messages \
-H "x-api-key: ck_..." \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-opus-4-6",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Plan a 3-step refactor of a Next.js app."}
],
"thinking": {"type": "enabled", "budget_tokens": 4000}
}'

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
apiKey: process.env.NEOCLOUD_API_KEY, // ck_...
baseURL: 'https://hypereal.build/api/v1',
});
const msg = await client.messages.create({
model: 'claude-sonnet-4-6',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello, Claude.' }],
});
console.log(msg.content);

Anthropic models
claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5

05 · OpenAI Responses API
Responses
OpenAI's newer Responses API (used by Codex CLI's `wire_api = responses` mode and the OpenAI Agents SDK). Auth is the same as chat/completions; the request body uses `input` instead of `messages`.
/api/v1/responses

Notes
- Anthropic models return 400 — they belong on /v1/messages.
- Both streaming and non-streaming report usage in response.usage.input_tokens/output_tokens.
- Some upstreams always emit SSE — the endpoint detects this and streams transparently even when stream: false.
- Multi-upstream failover. Set a long client timeout (300s+).
curl https://hypereal.build/api/v1/responses \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-codex",
"input": "Write a TypeScript function that debounces a callback.",
"stream": true
}'

import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.NEOCLOUD_API_KEY,
baseURL: 'https://hypereal.build/api/v1',
});
const response = await client.responses.create({
model: 'gpt-5-codex',
input: 'Refactor this file into smaller modules.',
});
console.log(response.output_text);
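The notes above call for a long client timeout so multi-upstream failover can finish. With the OpenAI SDK that is a constructor option; a minimal sketch, assuming only what the notes state:

import OpenAI from 'openai';

// Per the notes above: a generous timeout for long-running responses.
const client = new OpenAI({
  apiKey: process.env.NEOCLOUD_API_KEY,
  baseURL: 'https://hypereal.build/api/v1',
  timeout: 300_000, // ms; satisfies the 300s+ guidance
});

const response = await client.responses.create({
  model: 'gpt-5.1-codex',
  input: 'Summarize the tradeoffs of debounce vs throttle.',
});
console.log(response.output_text);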
Codex-tuned models

gpt-5-codex, gpt-5-codex-mini, gpt-5.1-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5.2-codex, gpt-5.3-codex, gpt-5.3-codex-spark, gpt-5.3-codex-official

06 · Codex CLI / Codex Desktop
Codex CLI
Codex points its `wire_api = responses` provider at /api/v1/codex/responses. The CLI appends `/responses` to the base URL, so configure the base URL as shown below.
/api/v1/codex/responses

# ~/.codex/config.toml
model_provider = "hypereal"
model = "gpt-5-codex"

[model_providers.hypereal]
name = "Hypereal"
base_url = "https://hypereal.build/api/v1/codex"
wire_api = "responses"
env_key = "NEOCLOUD_API_KEY"
Then export your key:

export NEOCLOUD_API_KEY=ck_...
Run codex as usual. Whatever Codex sends — full reasoning streams, tool calls, file edits — is proxied unchanged. Billing is on the standard input_tokens / output_tokens usage block.
The same setup works for OpenCode, Claude Code (use /v1/messages), Cursor (use /v1/chat/completions) and Gemini CLI (use /v1/gemini).
07
Image generation
OpenAI-compatible /images/generations shape. Synchronous — the endpoint returns image URLs (or base64) once the upstream completes. Billed per image; `n` is clamped to 1–10.
/api/v1/images/generations

Request body
prompt — required; some models also take image inputs (image, reference_images).
size — e.g. 1024x1024, 1536x1024. Provider-dependent.
If your balance is below creditsPerGeneration × n, the endpoint returns 402.

curl https://hypereal.build/api/v1/images/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "nano_banana_pro",
"prompt": "isometric studio shot of a tiny cyberpunk apartment, neon rim light",
"n": 1,
"size": "1024x1024"
}'

const res = await fetch('https://hypereal.build/api/v1/images/generations', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.NEOCLOUD_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gemini-3-pro-image-preview',
prompt: 'a chrome teapot floating over the ocean at sunset',
n: 1,
}),
});
const { data } = await res.json();
console.log(data[0].url); // or data[0].b64_json depending on the model

Image models
gpt-image-2, gpt-4o-image, nano_banana, nano_banana_2, gemini-3.1-flash-image-preview, gemini-2.5-flash-image-preview, flux-kontext-pro, flux-2-pro, doubao-seedream-4-0, doubao-seedream-4-5, doubao-seedream-5-0, gemini-3.1-flash-image-preview-official, flux-kontext-max, gemini-2.5-flash-image-official, nano_banana_pro, gemini-3-pro-image-preview, flux-2-flex, gemini-3-pro-image-preview-official, gemini-3-pro-image-preview-4K, gemini-3.1-fast-imagen, gemini-3.1-thinking-imagen

08 · long-running
Video generation
Synchronous long-poll endpoint — keep the connection open until the clip is ready. Set your HTTP client timeout to 600s. Billed per second (most models) or per clip (Veo, Vidu, Grok).
/api/v1/video/generations

Request body
duration — seconds; billed for per_second models.
aspect_ratio — 16:9, 9:16, 1:1. Provider-dependent.
Keyframes via image_url, last_image_url or image — see that model's upstream docs.

curl https://hypereal.build/api/v1/video/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "doubao-seedance-2-0",
"prompt": "drone shot flying over a foggy forest at dawn, cinematic",
"duration": 5,
"aspect_ratio": "16:9",
"image_url": "https://example.com/keyframe.jpg"
}'

const res = await fetch('https://hypereal.build/api/v1/video/generations', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.NEOCLOUD_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'kling-v3',
prompt: 'a cat walking on the moon',
duration: 5,
aspect_ratio: '16:9',
}),
});
// Long-running: connection stays open until the upstream returns the clip.
// Set a generous timeout (300+ seconds).
const data = await res.json();
console.log(data); // contains url(s) to the rendered mp4
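Neither example above enforces the 600s budget. One way to do it with plain fetch is AbortSignal.timeout (available in Node 18+); a minimal sketch:

// Long-poll with an explicit 600s cap, per the guidance above.
const res = await fetch('https://hypereal.build/api/v1/video/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.NEOCLOUD_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ model: 'kling-v3', prompt: 'a cat walking on the moon', duration: 5 }),
  signal: AbortSignal.timeout(600_000), // abort if the clip is not ready in 600s
});
console.log(await res.json());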
Video models

wan2.6-flash, kling-2-6, MiniMax-Hailuo-02, doubao-seedance-1-0-pro-fast, MiniMax-Hailuo-2.3, wan2.6, kling-video-o1, kling-v3-omni, kling-v3, kling-v3-video, doubao-seedance-1-0-pro-quality, doubao-seedance-2-0, doubao-seedance-2-0-fast, doubao-seedance-1-5-pro, Veo3.1-fast-official, Veo3.1-quality-official, veo3.1-fast, veo3.1-quality, vidu-q3-pro, grok-video-3

09 · Fish Audio
Audio — TTS, voice cloning, ASR
Three model IDs share one endpoint. The body and response shape depend on which one you call. The provider is Fish Audio (called directly, not via ToAPI), billed per request.
/api/v1/audio/generations

text — required for audio-tts and audio-clone.
audio — required for audio-asr (input) and audio-clone (reference voice ≥ 10s).
Response: data: [{ url }] for TTS / clone; text (+ optional segments, duration) for ASR.

curl https://hypereal.build/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-tts",
"text": "Welcome to Hypereal. One key, every model.",
"voice_id": "en_male_calm"
}'

curl https://hypereal.build/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-clone",
"text": "This is my cloned voice.",
"audio": "https://example.com/reference-30s.mp3"
}'

curl https://hypereal.build/api/v1/audio/generations \
-H "Authorization: Bearer ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "audio-asr",
"audio": "https://example.com/recording.mp3"
}'
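The curl calls above have a plain-fetch TypeScript equivalent. A sketch of the TTS call, reusing the voice_id placeholder from the first example and the data: [{ url }] response shape from the notes:

// Text-to-speech via fetch; the response carries a URL to the rendered audio.
const res = await fetch('https://hypereal.build/api/v1/audio/generations', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.NEOCLOUD_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'audio-tts',
    text: 'Welcome to Hypereal. One key, every model.',
    voice_id: 'en_male_calm',
  }),
});
const { data } = await res.json();
console.log(data[0].url); // link to the generated audio file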
Audio models

audio-tts, audio-clone, audio-asr

10 · Google native shape
Gemini
Accepts both Gemini-native (`contents` / `generationConfig` / `systemInstruction`) and OpenAI shapes on one endpoint. The endpoint converts to OpenAI internally before forwarding. For most code, /v1/chat/completions with a Gemini model ID is simpler.
/api/v1/gemini

generationConfig — temperature, maxOutputTokens, etc. Messages go in contents.
Auth header: x-goog-api-key: ck_..., ?key=ck_..., or Authorization: Bearer ck_... all work.
curl "https://hypereal.build/api/v1/gemini" \
-H "x-goog-api-key: ck_..." \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3.1-pro",
"contents": [
{"role": "user", "parts": [{"text": "Outline a launch plan."}]}
],
"generationConfig": {"temperature": 0.6, "maxOutputTokens": 2048}
}'

// The /v1/gemini endpoint accepts both Gemini-native and OpenAI shapes.
// For SDK use, the OpenAI client + /v1/chat/completions is simpler.
const res = await fetch('https://hypereal.build/api/v1/gemini', {
method: 'POST',
headers: {
'x-goog-api-key': process.env.NEOCLOUD_API_KEY!,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gemini-3.1-fast',
contents: [{ role: 'user', parts: [{ text: 'Hi' }] }],
}),
});
console.log(await res.json());

Gemini models
gemini-3-pro-official, gemini-3-pro-preview-official, gemini-3-flash-official, gemini-3-flash-preview-official, gemini-3.1-pro, gemini-3.1-pro-preview-official, gemini-3.1-fast, gemini-3.1-thinking, gemini-3.1-flash-lite-preview-official, gemini-2.5-pro-official, gemini-2.5-flash-official, gemini-2.5-flash-lite-official, gemini-2.0-flash-official, gemini-2.0-flash-lite-official, gemini-2.0-flash-vip, gemini-2.5-flash-vip, gemini-2.5-pro-vip, gemini-3-flash-preview-vip

11
Errors and rate limits
All errors are JSON of the form { error: { type, message } }. Rate limits are evaluated per user, not per key — multiple keys share the same quota.
Invalid key — malformed (no ck_ prefix), expired, or inactive.
Rate limited — X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers are returned on rate-limit responses.
Bad request — missing model, unknown model ID (the response includes available_models), or the wrong endpoint for the format (e.g. an Anthropic model on /chat/completions).
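A minimal sketch of consuming that envelope and the rate-limit headers from TypeScript; the handling logic is illustrative, not prescribed by the API:

// Inspect the { error: { type, message } } envelope and rate-limit headers.
const res = await fetch('https://hypereal.build/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.NEOCLOUD_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-5.5',
    messages: [{ role: 'user', content: 'hi' }],
  }),
});

if (!res.ok) {
  const { error } = await res.json(); // { error: { type, message } }
  console.error(`${res.status} ${error.type}: ${error.message}`);
  // Rate-limit responses include these headers, per the list above:
  console.error('limit:', res.headers.get('X-RateLimit-Limit'));
  console.error('remaining:', res.headers.get('X-RateLimit-Remaining'));
  console.error('resets at:', res.headers.get('X-RateLimit-Reset'));
}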
12
Pricing and credits
One unit: 100 credits = $1.00 USD. LLMs bill per token at each model's input / output rate. Media models bill per image, per second or per clip.
LLMs
Tokens × per-MTok rate. Streaming requests are billed from the final usage chunk (worked sketch after this list).
Images
Flat rate per generation × the actual n returned.
Video and audio
Per second (most video), per clip (Veo, Vidu, Grok), or per request (Fish Audio).
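A worked sketch of the LLM arithmetic. The per-MTok rates below are invented placeholders, not published prices:

// Hypothetical rates for illustration only; real rates are per model.
const INPUT_CREDITS_PER_MTOK = 300;   // placeholder: 300 credits per 1M input tokens
const OUTPUT_CREDITS_PER_MTOK = 900;  // placeholder: 900 credits per 1M output tokens

// usage block as returned in the response / final streaming chunk:
const usage = { prompt_tokens: 12_000, completion_tokens: 4_000 };

const credits =
  (usage.prompt_tokens / 1_000_000) * INPUT_CREDITS_PER_MTOK +
  (usage.completion_tokens / 1_000_000) * OUTPUT_CREDITS_PER_MTOK;

console.log(credits.toFixed(2), 'credits');    // 7.20 credits
console.log('$' + (credits / 100).toFixed(4)); // $0.0720 (100 credits = $1.00)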
Claude, GPT, Gemini and selected image models (GPT Image 2, Nano Banana) are priced below the direct providers. Video, audio and the remaining media models bill at standard rates.

