Enterprise AI Gateway + Compute Platform
One key. Every model, every GPU, every ComfyUI workflow. With the cost controls, audit trail, and reliability your CFO and CTO already asked about.
Used in production by teams shipping image, video, voice, and chat features to millions of end users.
Know what you spend before the invoice arrives.
Every gateway request is priced, attributed, and logged in real time. Per-model dashboards, monthly forecasts, and budget guardrails — out of the box, no Datadog dashboard required.
Cost Dashboard
Daily spend trend, per-model breakdown, top-10 most expensive requests. The view your CFO actually asked for.
Spend Forecast
Trailing burn rate projected to month-end so you can see overruns weeks before they hit your card.
Budget Alerts
Per-key monthly cap. Emails at 80% and 100% with a cooldown so you don't get spammed. Optional auto-pause kills runaway loops dead.
Signed Webhooks
HMAC-signed events for spend thresholds, key created, key revoked, generation failed. Wire them into PagerDuty, Slack, or your own ledger.
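A minimal verification sketch for those webhooks. The header name, hex encoding, and SHA-256 scheme are assumptions for illustration; check your dashboard for the exact signing details.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch: verify an HMAC-SHA256 webhook signature against the raw body.
// The hex encoding and SHA-256 choice are assumptions, not documented values.
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so guard first.
  if (received.length !== expected.length) return false;
  // Constant-time compare prevents timing side channels.
  return timingSafeEqual(received, expected);
}
```

Always verify against the raw request body, before any JSON parsing, since re-serialization can change byte order and break the signature.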
Outages happen. Your users shouldn't notice.
Multi-provider failover, regional fallback, and intent-aware routing turn a fragile single-vendor dependency into a redundant, self-healing layer.
Multi-Provider Failover
Configurable per-key timeouts and retry policy. On 5xx or timeout, traffic transparently rolls to the next provider in the chain.
POST /v1/chat/completions
├── primary    → openai/gpt-4.1-mini     [503 in 8s]    ✗
├── fallback 1 → google/gemini-2.5-flash [200 in 612ms] ✓
└── fallback 2 → anthropic/claude-haiku  (skipped)
served 200 OK · upstream: gemini · total 8.6s
Smart Routing
Tell us the intent — fast chat, deep reasoning, image edit, long-form summarization — and we pick the cheapest qualified provider. Pin an exact model when you need to.
Regional Fallback
If a provider's US-East region is degraded, we try US-West, then EU, before failing the request. Region-stickiness is configurable per key.
Per-key controls that satisfy a security review.
Scoped keys, granular rate limits, IP allowlists, immutable audit log, and CSV export. Designed for the questions your CTO and your auditor will both ask.
API Key Scoping
Per-key allow/deny on models, IP allowlist, daily and hourly spend caps. Rotate without redeploying.
Per-Key, Per-Model Rate Limits
RPM and TPM limits scoped to the key and the model. A staging key can't accidentally drain prod's quota.
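When a key does hit its limit, the gateway answers 429 like any HTTP API, so a small client-side backoff keeps batch jobs polite. A sketch, assuming the standard Retry-After semantics (whether the gateway sets that header is an assumption):

```typescript
// Sketch: retry a request that was rate-limited (HTTP 429), honoring a
// server-provided delay when available, else exponential backoff.
type DoRequest = () => Promise<{ status: number; retryAfterMs?: number }>;

async function withRateLimitRetry(doRequest: DoRequest, maxRetries = 3) {
  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Prefer the server's hint; otherwise back off 250ms, 500ms, 1s, ...
    const delayMs = res.retryAfterMs ?? 250 * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

Because limits are scoped per key and per model, a 429 on one model does not mean another model on the same key is throttled.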
Immutable Audit Log
Every key creation, scope change, budget move, and revocation is recorded with actor, IP, and timestamp. SOC 2-baseline by default.
Searchable Logs + CSV Export
Filter request logs by endpoint, model, status, latency, key. One-click CSV for finance, compliance, or post-mortem.
Compliance posture
- TLS 1.2+ end-to-end. Keys hashed at rest, never logged in plaintext.
- Per-tenant key + budget isolation. No cross-tenant data leakage.
- Configurable log retention. Drop request bodies on demand for high-sensitivity workloads.
- EU and US routing available on request for residency-sensitive deployments.
- SOC 2 controls in scope for 2026. Reach out if you need a current letter from our auditor.
Drop-in for the OpenAI SDK. Swap one base URL.
Hypereal speaks OpenAI Chat Completions, Images, Responses, and Anthropic Messages. Keep your SDK, your prompts, your tool definitions, your retries — change the base URL and the API key, ship.
curl https://api.hypereal.cloud/v1/chat/completions \
-H "Authorization: Bearer $HYPEREAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4.1-mini",
"messages": [{ "role": "user", "content": "hi" }]
  }'

import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.HYPEREAL_API_KEY,
baseURL: "https://api.hypereal.cloud/v1",
});
const res = await client.chat.completions.create({
model: "gpt-4.1-mini",
messages: [{ role: "user", content: "hi" }],
});

Supported endpoints
- POST /v1/chat/completions — OpenAI-compatible
- POST /v1/messages — Anthropic-compatible
- POST /v1/responses — OpenAI Responses API
- POST /v1/images/generations — OpenAI-compatible
- POST /v1/videos/generate — Hypereal video API
- POST /v1/comfy/{slug} — ComfyUI workflow as API
- POST /v1/gpu/{slug} — Serverless GPU passthrough
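The Anthropic-compatible route works the same way as the OpenAI ones: same base URL, same key. A sketch of building that request; the body follows the public Anthropic Messages schema, and the model slug here is illustrative.

```typescript
// Sketch: build a request for the Anthropic-compatible /v1/messages route.
// Returning { url, init } keeps construction separate from the network call.
function buildMessagesRequest(apiKey: string, prompt: string) {
  return {
    url: "https://api.hypereal.cloud/v1/messages",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // Anthropic Messages schema: model, max_tokens, messages[].
      body: JSON.stringify({
        model: "claude-haiku", // illustrative slug
        max_tokens: 256,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Usage:
// const { url, init } = buildMessagesRequest(process.env.HYPEREAL_API_KEY!, "hi");
// const res = await fetch(url, init);
```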
Beyond models: compute as a first-class API.
Every team eventually needs more than chat completions — a custom ComfyUI graph, a fine-tune, a one-off GPU job. Hypereal exposes those behind the same key, the same logs, the same budgets.
Serverless GPU Passthrough
Bring your own RunPod handler and call it as POST /v1/gpu/{slug}. We handle auth, metering, retries, and the bill. You write the handler.
ComfyUI Workflow as API
Upload any ComfyUI workflow JSON. We give you a versioned HTTP endpoint with typed inputs and outputs, billed per run. No more pasting graphs in Slack.
ComfyUI Library
A growing catalog of pre-built ComfyUI workflows — face restore, product shot, cinematic upscale — call them like any other model.
LoRA & Asset Repo
Private, versioned storage for LoRAs, checkpoints, embeddings, and reference images. Reference them by handle from any workflow or generation.
POST /v1/comfy/cinematic-upscale
{
"inputs": { "image_url": "https://...", "strength": 0.8 },
"version": "v3"
}
POST /v1/gpu/my-handler
{
"input": { "prompt": "a cat", "steps": 28 }
}

Numbers we publish. Not screenshots in a sales deck.
Live status page, transparent latency, and an incident history you can read without asking us first.
Transparent latency
Rolling p50 and p95 for every gateway endpoint, by region.
Uptime history
Trailing 30/90-day uptime, no marketing math. The number is the number.
Stop running 8 vendor dashboards.
One API key, one bill, one place to see what's happening. Get up and running in under five minutes.

