One control plane for GPU endpoints, pods, clusters, and storage.
A complete GPU cloud under the Hypereal brand: deploy serverless workers, rent dedicated pods, plan Mercury clusters, attach persistent volumes, browse public endpoints, and operate everything with one API key and one bill.
RunPod-class capabilities, Hypereal control plane.
Every major GPU cloud workflow has a dashboard entry, API surface, docs entry, and billing model.
Serverless endpoints
Deploy Docker images as autoscaling GPU endpoints with scale-to-zero, sync or async jobs, logs, health, public/private routing, and per-second billing.
Dedicated GPU pods
Launch long-running GPU instances with SSH, public IP, exposed ports, persistent disk, network volumes, live logs, stop, resume, and terminate controls.
Mercury GPU clusters
Create multi-node H100, H200, B200, A100, and L40S clusters with topology planning, placement, scheduler policy, NCCL hints, runtime runbooks, and capacity gates.
Network volumes
Create persistent GPU-side storage, attach it to pods and serverless workers, keep data after compute is deleted, and inspect mount paths in one place.
Public endpoints
Browse ready-to-call model endpoints for image, video, audio, text, and workflow APIs. Use one key and one bill across curated and user-published endpoints.
Jobs and observability
Track every request with status, latency, cost, endpoint, logs, output payloads, retries, and exportable history for finance and incident review.
Reusable templates
Standardize Docker images, ports, environment variables, storage settings, GPU selections, and deployment recipes for repeated launches.
API keys and usage
Use scoped API keys for SDK and CLI workflows, inspect request history, set usage guardrails, and route all infrastructure through one account.
Start small, promote to production, scale to clusters.
Users can move through the full lifecycle they expect from a GPU cloud without leaving Hypereal.
Prototype on a pod
1. Pick a GPU and image.
2. Attach a network volume.
3. SSH into the pod.
4. Promote the image to serverless when traffic is predictable.
Ship a serverless model
1. Package a handler.
2. Set workers, idle timeout, and GPU types.
3. Call the stable endpoint URL.
4. Watch jobs, logs, health, and spend.
Scale to a Mercury cluster
1. Quote topology before creation.
2. Choose network, orchestrator, and model size.
3. Inspect placement, NCCL, torchrun, and runbook.
4. Launch, monitor events, then terminate when done.
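To make the "inspect placement, NCCL, torchrun" step more concrete, here is a rough sketch of the kind of launch line a multi-node runbook usually contains. It is not Hypereal's actual runbook output: the helper names `nodes_for_gpus` and `torchrun_cmd`, the head-node address, the port `29500`, the rendezvous ID, and `train.py` are all placeholders chosen for illustration. The `torchrun` flags and the `NCCL_DEBUG` variable are standard PyTorch and NCCL options.

```shell
# Hypothetical helpers; nothing here is generated by or specific to Hypereal.

# Number of nodes needed for a GPU count, rounding up (assumes equal GPUs per node).
nodes_for_gpus() {
  local total="$1" per_node="$2"
  echo $(( (total + per_node - 1) / per_node ))
}

# Print (do not run) the torchrun line each node would execute.
# head_addr and script are placeholders supplied by the caller.
torchrun_cmd() {
  local nnodes="$1" gpus_per_node="$2" head_addr="$3" script="${4:-train.py}"
  printf 'NCCL_DEBUG=INFO torchrun --nnodes=%s --nproc_per_node=%s --rdzv_backend=c10d --rdzv_endpoint=%s:29500 --rdzv_id=mercury-job %s\n' \
    "$nnodes" "$gpus_per_node" "$head_addr" "$script"
}
```

For a 64-GPU cluster built from 8-GPU nodes, `nodes_for_gpus 64 8` gives the node count to pass as the first argument of `torchrun_cmd`. Printing the command instead of running it keeps the sketch safe to try on a laptop before any cluster exists.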
The complete GPU cloud map.
This page links each user-facing capability to the live console, API, or sales-assisted path.
| Capability | Status | Entry point |
|---|---|---|
| Autoscaling serverless GPU endpoints | Available | /infra/deployments |
| Dedicated hourly GPU pods | Available | /infra/pods |
| Multi-node instant clusters | Available through Mercury | /infra/clusters |
| Reserved large clusters | Sales-assisted | mailto:sales@hypereal.cloud |
| Network volumes and persistent storage | Available | /infra/storage |
| Public model endpoints | Available | /infra/explore |
| Custom templates and reusable configs | Available | /templates |
| Job logs, status, and usage tracking | Available | /infra/jobs |
| API key and dashboard auth | Available | /manage-api-keys |
| Docs and API examples | Available | /docs/infra |
Everything visible in the console is callable by API.
Dashboard users and API users share the same resource model: deployments, pods, clusters, volumes, jobs, keys, logs, and usage.
curl -X POST https://hypereal.cloud/api/v1/gpu/clusters/quote \
  -H "Authorization: Bearer ck_..." \
  -H "Content-Type: application/json" \
  -d '{
    "gpuCount": 64,
    "gpuSku": "h100-sxm-80gb",
    "network": "nvlink-ib400",
    "workload": "training"
  }'
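For scripted use, the quote request above could be wrapped in a small shell helper. This is only a sketch: the function names `build_quote_payload` and `quote_cluster` and the `HYPEREAL_API_KEY` environment variable are assumptions made for illustration, and the response format is not shown because the page does not document it. The URL, headers, and field names are taken from the request above.

```shell
# Build the JSON body for the quote request from four positional arguments:
# gpuCount, gpuSku, network, workload (field names as in the request above).
build_quote_payload() {
  local gpu_count="$1" gpu_sku="$2" network="$3" workload="$4"
  # Reject an empty or non-numeric GPU count before it reaches the API.
  case "$gpu_count" in
    ''|*[!0-9]*) echo "gpuCount must be a whole number" >&2; return 1 ;;
  esac
  printf '{"gpuCount": %s, "gpuSku": "%s", "network": "%s", "workload": "%s"}\n' \
    "$gpu_count" "$gpu_sku" "$network" "$workload"
}

# Send the quote request. HYPEREAL_API_KEY is an assumed variable name,
# used here so the key never appears in shell history.
quote_cluster() {
  local payload
  if [ -z "${HYPEREAL_API_KEY:-}" ]; then
    echo "set HYPEREAL_API_KEY first" >&2
    return 1
  fi
  payload="$(build_quote_payload "$@")" || return 1
  curl --fail --silent --show-error -X POST \
    "https://hypereal.cloud/api/v1/gpu/clusters/quote" \
    -H "Authorization: Bearer ${HYPEREAL_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$payload"
}
```

A call would then look like `quote_cluster 64 h100-sxm-80gb nvlink-ib400 training`. Splitting payload construction from the network call makes the payload easy to inspect or test without sending a request.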