R
RelayCloud LLM API Gateway for routing, observability, and billing control
API Relay for OpenAI, Claude, Gemini, DeepSeek

One endpoint for model routing and revenue control.

Launch a branded LLM API relay with usage metering, upstream failover, client keys, routing policy, and request logs. The first screen sells the product. The second half shows operators exactly how the system behaves.

Today Requests 1.28M Across 37 client keys and 4 upstream pools
Success Rate 99.42% Auto failover and retry orchestration enabled
Gross Margin 31.6% Usage markup and route-level cost optimization
relaycloud.io / production

Smoke Test Playground

latency 1.84s
Auto Route GPT-4.1 Claude Gemini

Summarize a SaaS refund policy in concise Chinese and return a JSON object with title, bullets, and tone.

Decision Trace live
policy=balanced · region=sg · json=true · margin floor=26%
Run Cost $0.0018
472 tokens · schema pass · no fallback triggered
Capabilities

Developer product on top, operator tools underneath.

The point is not just to proxy requests. A viable relay needs margin visibility, route controls, per-key limits, and enough logs to debug upstream instability without forcing your users to understand your provider stack.

Unified Endpoint

Expose one base URL to your customers while swapping models, regions, and upstream vendors behind policy rules.

Routing Policies

Choose balanced, lowest-cost, premium, or client-specific routing with fallback chains for partial outages and quota pressure.

Billing Control

Price above token cost, watch margin by route, and isolate high-volume clients before they eat your cheap lane.

Console

Homepage promise backed by real operator surfaces.

This lower section drops into the control plane. It keeps the information density of an infrastructure dashboard while still living inside the same branded landing page.

Request Playground

test request
Auto Route JSON Mode Vision Streaming

Create a concise support response explaining subscription cancellation terms, then produce a JSON payload for frontend rendering.

{
  "route": "gpt-4.1",
  "status": "ok",
  "latency_ms": 1840,
  "tokens": 472,
  "cost_usd": 0.0018,
  "fallback_used": false
}

Client Keys

3 examples
rk_live_web_01 active
Quota 600k req/day · last used 12:31
rk_enterprise_cn allowlist
Dedicated billing lane · CN/HK routing
rk_test_playground sandbox
No image or batch access · low quota

Route Metrics

p95 / success / cost
GPT-4.1 1.8s · 99.8%
$2.82 / 1M tok
Claude Sonnet 2.9s · 99.1%
Warm lane due to temporary throttle
Gemini 1.4s · 99.6%
Used for regional enterprise fallback

Recent Traffic

last 5 requests
Time Client Model Status Latency Tokens Cost
12:31:08 playground gpt-4.1 200 ok 1.84s 472 $0.0018
12:30:41 web-app auto -> claude 200 ok 2.91s 1,204 $0.0063
12:30:15 sdk-node deepseek 502 upstream 0.62s 0 $0.0000
12:29:58 enterprise-cn gemini 200 ok 1.42s 388 $0.0015
12:29:14 web-app auto -> gpt-4.1 429 retry 2.27s 519 $0.0020
Pricing

Sell the gateway without hiding the mechanics.

The homepage can still close the user: simple plan framing, usage-based language, and a clear path from docs to console. It should feel commercial, not like an internal admin dump.

Starter

$0

For evaluation, single route setup, and lightweight testing.

1 relay endpoint
3 client keys
7-day request logs

Growth

$99

For shipping a paid relay with routing logic and client segmentation.

Route policies and fallback chains
Per-key quotas and rate limits
Margin and token cost views

Enterprise

Custom

For multi-region traffic, private deployment, and dedicated billing lanes.

Tenant isolation
Audit logging and exports
Regional traffic controls