API Relay for OpenAI, Claude, Gemini, DeepSeek

One endpoint for model routing and revenue control.

Launch a branded LLM API relay with usage metering, upstream failover, client keys, routing policy, and request logs. The top of the page sells the product; the lower half shows operators exactly how the system behaves.

Start for free · See live console

Requests Today: 1.28M · Across 37 client keys and 4 upstream pools

Success Rate: 99.42% · Auto failover and retry orchestration enabled

Gross Margin: 31.6% · Usage markup and route-level cost optimization

relaycloud.io / production

Auto / Balanced healthy

Primary route for chat completions

GPT-4.1 stable

Low hallucination lane

Claude Sonnet warm

Long-context fallback

DeepSeek degraded

SG pool removed from primary rotation

Smoke Test Playground

latency 1.84s

Auto Route GPT-4.1 Claude Gemini

Summarize a SaaS refund policy in concise Chinese and return a JSON object with title, bullets, and tone.

Decision Trace live

policy=balanced · region=sg · json=true · margin floor=26%

Run Cost $0.0018

472 tokens · schema pass · no fallback triggered

Capabilities

Developer product on top, operator tools underneath.

The point is not just to proxy requests. A viable relay needs margin visibility, route controls, per-key limits, and enough logs to debug upstream instability without forcing your users to understand your provider stack.

Unified Endpoint

Expose one base URL to your customers while swapping models, regions, and upstream vendors behind policy rules.

Routing Policies

Choose balanced, lowest-cost, premium, or client-specific routing with fallback chains for partial outages and quota pressure.
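
As a rough illustration of what a fallback chain could look like, here is a minimal sketch. The names `Route`, `POLICY_CHAINS`, and `pick_route` are hypothetical, the health labels are borrowed from the status badges above (stable, warm, degraded), and the chain ordering is an assumption rather than the product's actual routing logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    name: str
    health: str  # "stable", "warm", or "degraded" (labels taken from the dashboard)


# Hypothetical upstream pool mirroring the status badges shown on the page.
ROUTES: Dict[str, Route] = {
    "gpt-4.1": Route("gpt-4.1", "stable"),
    "claude-sonnet": Route("claude-sonnet", "warm"),
    "deepseek": Route("deepseek", "degraded"),
}

# Assumed chain order per policy; a real deployment would define its own.
POLICY_CHAINS: Dict[str, List[str]] = {
    "balanced": ["gpt-4.1", "claude-sonnet", "deepseek"],
    "lowest-cost": ["deepseek", "gpt-4.1", "claude-sonnet"],
}


def pick_route(policy: str) -> Optional[Route]:
    """Walk the policy's chain and return the first route that is not degraded."""
    for name in POLICY_CHAINS[policy]:
        route = ROUTES.get(name)
        if route is not None and route.health != "degraded":
            return route
    return None  # every upstream in the chain is degraded
```

With the pool above, a `lowest-cost` request would skip the degraded DeepSeek lane and fall through to the next entry in its chain.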

Billing Control

Price above token cost, watch margin by route, and isolate high-volume clients before they eat your cheap lane.
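
The margin arithmetic behind this can be sketched in a few lines. This assumes gross margin is defined as (revenue - cost) / revenue and reuses the 26% margin floor from the decision trace above as a default; the function names are hypothetical and not part of any documented API.

```python
def gross_margin(revenue_usd: float, cost_usd: float) -> float:
    """Gross margin as a fraction of revenue, e.g. 0.316 for 31.6%."""
    if revenue_usd <= 0:
        raise ValueError("revenue must be positive")
    return (revenue_usd - cost_usd) / revenue_usd


def price_for_margin(cost_usd: float, target_margin: float) -> float:
    """Price needed so that the given upstream cost yields the target margin."""
    if not 0 <= target_margin < 1:
        raise ValueError("target margin must be in [0, 1)")
    return cost_usd / (1 - target_margin)


def meets_margin_floor(revenue_usd: float, cost_usd: float, floor: float = 0.26) -> bool:
    """True when a route's margin is at or above the floor (26% assumed here)."""
    return gross_margin(revenue_usd, cost_usd) >= floor
```

For example, `price_for_margin(2.82, 0.316)` gives the per-1M-token price that would turn the GPT-4.1 cost shown in the route metrics into a 31.6% margin.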

Console

Homepage promise backed by real operator surfaces.

This lower section drops into the control plane. It keeps the information density of an infrastructure dashboard while still living inside the same branded landing page.

Request Playground

test request

Auto Route JSON Mode Vision Streaming

Create a concise support response explaining subscription cancellation terms, then produce a JSON payload for frontend rendering.

{
  "route": "gpt-4.1",
  "status": "ok",
  "latency_ms": 1840,
  "tokens": 472,
  "cost_usd": 0.0018,
  "fallback_used": false
}

Client Keys

3 examples

rk_live_web_01 active

Quota 600k req/day · last used 12:31

rk_enterprise_cn allowlist

Dedicated billing lane · CN/HK routing

rk_test_playground sandbox

No image or batch access · low quota
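
A per-key check along these lines could enforce both the feature restrictions and the daily quota. This is only a sketch: `ClientKey`, `authorize`, the feature names, and the sandbox quota value are assumptions made for illustration, and a real relay would keep counters in shared storage rather than in memory.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass
class ClientKey:
    key_id: str
    daily_quota: int  # maximum requests per day
    allowed_features: FrozenSet[str] = frozenset({"chat"})
    used_today: int = 0  # in-memory counter, for illustration only


def authorize(key: ClientKey, feature: str) -> Tuple[bool, str]:
    """Reject disallowed features first, then enforce the daily quota."""
    if feature not in key.allowed_features:
        return (False, "feature_not_allowed")
    if key.used_today >= key.daily_quota:
        return (False, "quota_exceeded")
    key.used_today += 1
    return (True, "ok")


# Hypothetical sandbox key: chat only, with a deliberately tiny quota.
sandbox = ClientKey("rk_test_playground", daily_quota=2)
```

Checking the feature before the quota means a rejected image or batch call does not consume any of the key's daily allowance.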

Route Metrics

p95 / success / cost

GPT-4.1 1.8s · 99.8%

$2.82 / 1M tok

Claude Sonnet 2.9s · 99.1%

Warm lane due to temporary throttle

Gemini 1.4s · 99.6%

Used for regional enterprise fallback

Recent Traffic

last 5 requests

Time      Client         Model            Status        Latency  Tokens  Cost
12:31:08  playground     gpt-4.1          200 ok        1.84s    472     $0.0018
12:30:41  web-app        auto -> claude   200 ok        2.91s    1,204   $0.0063
12:30:15  sdk-node       deepseek         502 upstream  0.62s    0       $0.0000
12:29:58  enterprise-cn  gemini           200 ok        1.42s    388     $0.0015
12:29:14  web-app        auto -> gpt-4.1  429 retry     2.27s    519     $0.0020

Pricing

Sell the gateway without hiding the mechanics.

The homepage can still close the sale: simple plan framing, usage-based language, and a clear path from docs to console. It should feel commercial, not like an internal admin dump.

Starter

For evaluation, single route setup, and lightweight testing.

1 relay endpoint

3 client keys

7-day request logs

Growth

$99

For shipping a paid relay with routing logic and client segmentation.

Route policies and fallback chains

Per-key quotas and rate limits

Margin and token cost views

Enterprise

Custom

For multi-region traffic, private deployment, and dedicated billing lanes.

Tenant isolation

Audit logging and exports

Regional traffic controls

Quick API Shape

single endpoint

POST /v1/chat/completions
Authorization: Bearer rk_live_web_01
Content-Type: application/json

{
  "model": "auto",
  "messages": [
    { "role": "user", "content": "Explain our refund policy briefly." }
  ],
  "response_format": { "type": "json_object" }
}
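
The same request could be assembled from Python's standard library roughly as follows. The base URL `https://relay.example.com` is a placeholder assumption, since the page does not state the real host, and `build_chat_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

# Placeholder host: an assumption, replace with your relay's actual base URL.
BASE_URL = "https://relay.example.com"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST /v1/chat/completions request shown above, without sending it."""
    payload = {
        "model": "auto",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request("rk_live_web_01", "Explain our refund policy briefly.")
```

Building the request separately from sending it keeps the payload easy to inspect or log before any traffic reaches the relay.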

Why this direction

homepage + console