v1 · routing 4 providers · 30+ models

One API for every frontier model.

Sylica is an OpenAI-compatible gateway to OpenAI, Anthropic, xAI, and Google. Prepaid credits or BYOK, streaming by default, fallbacks when an upstream fails.

POST /v1/chat/completions · request pipeline
Your app (openai.chat…) → 01 Auth · API key → 02 Rate limit · token bucket → 03 Router · score & pick → 04 Adapter (OpenAI · Anthropic · xAI · Google) → 05 Credits · meter & debit → Upstream providers
Server-Sent Events stream back to the caller.
OpenAI
GPT-5, 4.1, o3, o4-mini
Anthropic
Claude Opus 4.5, Sonnet 4.5, Haiku 4.5
xAI
Grok 4, 4 Fast, Code Fast
Google
Gemini 2.5 Pro, Flash
01 · SDK

If you speak OpenAI, you speak Sylica.

One base URL. Same request and response shapes, including streaming and tool use.

Drop-in SDK compatibility

Keep your existing OpenAI-compatible client code and switch only your base URL and API key. Sylica preserves request/response semantics so migration takes hours, not weeks.

No SDK rewrite
OpenAI shape in/out
Streaming-first
SSE tokens as they arrive
Provider portable
Switch models without refactor

Integration flow

  1. Set baseURL to https://api.sylicaai.com/v1
  2. Use your existing chat.completions request shape
  3. Turn on stream=true for low-latency UX
  4. Read x-sylica-request-id for debugging and support

POST /v1/chat/completions
typescript
import OpenAI from "openai";

const sylica = new OpenAI({
  baseURL: "https://api.sylicaai.com/v1",
  apiKey: process.env.SYLICA_API_KEY,
});

const stream = await sylica.chat.completions.create({
  model: "sylica/auto",
  messages: [{ role: "user", content: "Write me a haiku about routing." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
bash
curl https://api.sylicaai.com/v1/chat/completions \
  -H "Authorization: Bearer $SYLICA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.5",
    "stream": true,
    "messages": [{ "role": "user", "content": "hello" }]
  }'
Realtime response preview
~/apps/worker · node 22.22
$ node index.mjs
connecting api.sylicaai.com …
key sk-sylica-…4c1a (prod)
routed: openai/gpt-5 (p95 420ms)

Tokens fall from the sky,
each request a quiet path --
Sylica decides.

done · 2.4s · $0.0021
02 · Pipeline

Five stages between your SDK and the model.

Auth, rate limit, route, adapter, and credits -- with p50 routing overhead under 100 ms.

01 · schema

Unified schema

OpenAI chat completions in, OpenAI chat completions out. Adapters normalize every provider.

02 · stream

SSE, end to end

No buffering. Tokens hit your socket the moment the upstream emits them.

03 · router

Meta-models

Ask for sylica/auto, sylica/cheap, or sylica/fast. The router picks a concrete model per request.

04 · credits

Per-token billing

Debits are atomic against your org's credit balance. 429 when empty -- never a surprise invoice.
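An atomic debit of that kind can be sketched in a few lines of TypeScript. This is purely illustrative -- the class name and in-memory balance are made up, and a real gateway would run the compare-and-debit atomically inside the database -- but it shows the invariant: a request either has the credits or gets a 429.

```typescript
// Illustrative in-memory credit meter. A real gateway would do this
// atomically in storage (e.g. UPDATE ... WHERE balance >= cost).
type DebitResult =
  | { ok: true; remaining: number }
  | { ok: false; status: 429 };

class CreditMeter {
  constructor(private balance: number) {}

  debit(cost: number): DebitResult {
    if (this.balance < cost) return { ok: false, status: 429 }; // never a negative balance
    this.balance -= cost;
    return { ok: true, remaining: this.balance };
  }
}
```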

05 · auth

Rate limits + keys

Token-bucket per key, per model. Scoped keys with deny-lists for production safety.
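A token bucket of that shape can be sketched as follows. The capacity and refill numbers here are placeholders, and the injectable clock exists only to make the sketch testable; the limiter's real parameters and storage are not published.

```typescript
// Illustrative token bucket, one per (key, model). Capacity is the
// burst size; refillPerSec is the sustained rate.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    private now: () => number = () => Date.now() / 1000,
  ) {
    this.tokens = capacity;
    this.last = this.now();
  }

  take(n = 1): boolean {
    const t = this.now();
    // Refill in proportion to elapsed time, capped at capacity.
    this.tokens = Math.min(this.capacity, this.tokens + (t - this.last) * this.refillPerSec);
    this.last = t;
    if (this.tokens < n) return false; // caller answers 429
    this.tokens -= n;
    return true;
  }
}
```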

06 · otel

OpenTelemetry

Every request emits spans with provider, TTFB, and cost. Export to your own backend.

03 · Routing

The router is a small, legible scoring function.

Every eligible model gets a score from cost, latency, and live health. The top score wins. If it fails before the first byte, the next score wins -- automatically, on the same stream.
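A minimal sketch of such a scoring function, using the published cost 0.4 / latency 0.4 / health 0.2 weights. The candidate names and 0-100 ratings below are made up, and the real router's inputs and normalization are not public; the point is only that the function is small and legible.

```typescript
// Weighted scoring: each factor is a hypothetical 0-100 rating,
// higher is better (cheaper, faster, healthier).
interface Candidate {
  model: string;
  cost: number;
  latency: number;
  health: number;
}

const WEIGHTS = { cost: 0.4, latency: 0.4, health: 0.2 };

function score(c: Candidate): number {
  return WEIGHTS.cost * c.cost + WEIGHTS.latency * c.latency + WEIGHTS.health * c.health;
}

// Sort descending: the top score is tried first; if it fails before
// the first byte, the next entry is tried on the same stream.
function rank(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort((a, b) => score(b) - score(a));
}
```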

POST · model: sylica/auto
weights · cost 0.4 · lat 0.4 · health 0.2

Model                       Cost  Latency  Health  Score
openai/gpt-5                  82       74     100     86  · PICKED
anthropic/claude-opus-4.5     70       88     100     84
xai/grok-4                    76       80     100     82
google/gemini-2.5-pro         90       72      80     78
Timeline · single streaming request
0.0 s – 2.4 s · POST /v1/chat/completions → route decided → upstream open → first token → stream done
Routing
32 ms
TTFB
180 ms
Total
2.4 s
04 · Observability

Every token is accounted for.

Per-key usage, per-model breakdowns, live p50/p95 latency, and spend -- as a dashboard, an API, or OTLP spans.

dashboard.sylicaai.com · Overview · Playground · Keys · Models · Billing
Requests 12,430 · Input 8.4M · Output 2.1M · Spend $38.22
Top models, last 7 days: openai/gpt-5 42% · anthropic/claude-opus-4.5 31% · google/gemini-2.5-pro 18%
  • Request log
    Every call is stored for 30 days with model, provider, tokens, cost, and latency.
  • Usage API
    GET /v1/usage returns JSON aggregates. Stream to your data warehouse on a cron.
  • OpenTelemetry
    Enable OTEL_EXPORTER_OTLP_ENDPOINT and Sylica will emit traces and metrics alongside your own.
  • Alerting
    Optional webhooks on spend thresholds, error spikes, or per-key rate-limit hits.
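As a sketch of the Usage API item above: only the GET /v1/usage path comes from the list; the from/to query parameters, and the JSON shape of the response, are assumptions here.

```typescript
// Build a GET /v1/usage request. The path is documented above;
// the `from`/`to` date parameters are illustrative assumptions.
function buildUsageRequest(apiKey: string, from: string, to: string) {
  const url = new URL("https://api.sylicaai.com/v1/usage");
  url.searchParams.set("from", from);
  url.searchParams.set("to", to);
  return {
    url: url.toString(),
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Usage (response shape is an assumption):
// const { url, headers } = buildUsageRequest(process.env.SYLICA_API_KEY!, "2025-01-01", "2025-01-07");
// const usage = await fetch(url, { headers }).then((r) => r.json());
```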
05 · Security

BYOK, with real encryption -- not just a claim.

Provider keys are sealed with AES-256-GCM using a master key held outside Postgres, decrypted only on the request path, and never logged.

Dashboard: sk-ant-… (● ● ● ● ● ●) → AES-256-GCM · per-row IV · sealed → byok_provider_keys · Postgres · encrypted at rest (0x9f1d…ea0c)
1 · user pastes key
2 · encrypted with BYOK_MASTER_KEY
3 · decrypted only per-request
At rest
AES-256-GCM · per-row IV
In transit
TLS 1.3 only · HSTS
Isolation
Org-scoped keys · zero cross-tenant reads
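The seal/unseal path can be sketched with Node's built-in crypto module. The function names and row shape below are illustrative, not Sylica's schema; what carries over is the pattern -- AES-256-GCM, a fresh IV per row, and an auth tag that makes tampering detectable.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// In production the master key is held outside Postgres (e.g. an
// env secret or KMS); random here only so the sketch is runnable.
const MASTER_KEY = randomBytes(32);

function seal(plaintext: string): { iv: Buffer; tag: Buffer; data: Buffer } {
  const iv = randomBytes(12); // fresh IV per row -- never reuse with GCM
  const cipher = createCipheriv("aes-256-gcm", MASTER_KEY, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

function open(row: { iv: Buffer; tag: Buffer; data: Buffer }): string {
  const decipher = createDecipheriv("aes-256-gcm", MASTER_KEY, row.iv);
  decipher.setAuthTag(row.tag); // GCM authenticates as well as encrypts
  return Buffer.concat([decipher.update(row.data), decipher.final()]).toString("utf8");
}
```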
06 · Catalog

Ten of the 30+ models we serve today.

Prices are per 1M tokens. The full list -- including context windows, tool-use, and vision flags -- lives in the dashboard.

Model                        Context     Input / 1M  Output / 1M  Traits
openai/gpt-5                 400k        $2.50       $10.00       reasoning · tools · vision
openai/gpt-4.1               1,047,576   $2.00       $8.00        tools · vision
openai/o3                    200k        $2.00       $8.00        reasoning · tools · vision
anthropic/claude-opus-4.5    200k        $5.00       $25.00       reasoning · tools · vision
anthropic/claude-sonnet-4.5  200k        $3.00       $15.00       reasoning · tools · vision
anthropic/claude-haiku-4.5   200k        $1.00       $5.00        tools · vision
xai/grok-4                   256k        $3.00       $15.00       reasoning · tools · vision
xai/grok-4-fast              2,000k      $0.20       $0.50        tools · vision
google/gemini-2.5-pro        1,048,576   $1.25       $10.00       reasoning · tools · vision
google/gemini-2.5-flash      1,048,576   $0.30       $2.50        reasoning · tools · vision
07 · Pricing

Two ways to pay. Neither involves a sales call.

Prepaid credits settle at published per-token rates. BYOK passes through at 0%.
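Settlement at published rates is simple arithmetic: tokens divided by one million, times the per-1M price, summed over input and output. A sketch -- the prices in the example are the catalog's openai/gpt-5 rates, and the token counts are made up:

```typescript
// Cost of one request at per-1M-token rates.
function requestCost(
  inputTokens: number,
  outputTokens: number,
  inputPer1M: number,
  outputPer1M: number,
): number {
  return (inputTokens / 1_000_000) * inputPer1M + (outputTokens / 1_000_000) * outputPer1M;
}

// e.g. 600 input + 60 output tokens at $2.50 / $10.00 per 1M ≈ $0.0021
```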

Credits

Prepay any amount. Spend it across every model. Top up when empty.

  • Per-token metered billing
  • Auto-topup thresholds
  • Receipts + Stripe portal
BYOK
Zero markup

Bring your OpenAI/Anthropic/xAI/Google keys. Pay the providers directly.

  • 0% markup from Sylica
  • AES-256-GCM at rest
  • Mix with credits per-request

Start with a single curl.

Keys are issued instantly. No card required for the first $5 of credits.