If you speak OpenAI, you speak Sylica.
One base URL. Same request and response shapes, including streaming and tool use.
Drop-in SDK compatibility
Keep your existing OpenAI-compatible client code and switch only your base URL and API key. Sylica preserves request/response semantics so migration takes hours, not weeks.
Integration flow
1. Set baseURL to https://api.sylicaai.com/v1
2. Use your existing chat.completions request shape
3. Turn on stream=true for low-latency UX
4. Read the x-sylica-request-id header for debugging and support
```ts
import OpenAI from "openai";

const sylica = new OpenAI({
  baseURL: "https://api.sylicaai.com/v1",
  apiKey: process.env.SYLICA_API_KEY,
});

const stream = await sylica.chat.completions.create({
  model: "sylica/auto",
  messages: [{ role: "user", content: "Write me a haiku about routing." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```

```bash
curl https://api.sylicaai.com/v1/chat/completions \
  -H "Authorization: Bearer $SYLICA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.5",
    "stream": true,
    "messages": [{ "role": "user", "content": "hello" }]
  }'
```

Five stages between your SDK and the model.
Auth, rate limiting, routing, adapters, and credits -- with p50 routing overhead under 100 ms.
Unified schema
OpenAI chat completions in, OpenAI chat completions out. Adapters normalize every provider.
SSE, end to end
No buffering. Tokens hit your socket the moment the upstream emits them.
Meta-models
Ask for sylica/auto, sylica/cheap, or sylica/fast. The router picks a concrete model per request.
Per-token billing
Debits are atomic against your org's credit balance. 429 when empty -- never a surprise invoice.
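The debit rule can be sketched in a few lines. This is an illustrative in-memory version with hypothetical names; in production the same check-and-subtract would be a single conditional database update so concurrent requests can never drive a balance negative:

```ts
// Sketch of atomic per-token debiting against an org's credit balance.
// All names here are illustrative, not Sylica internals.
const balances = new Map<string, number>(); // orgId -> remaining credits

function debit(orgId: string, amount: number): { ok: true } | { ok: false; status: 429 } {
  const balance = balances.get(orgId) ?? 0;
  if (balance < amount) return { ok: false, status: 429 }; // empty -> 429, never overdraft
  balances.set(orgId, balance - amount);
  return { ok: true };
}
```

The key property is that a request either debits in full or is rejected with 429; there is no path that serves tokens on an empty balance.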
Rate limits + keys
Token-bucket per key, per model. Scoped keys with deny-lists for production safety.
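The per-key, per-model limiter described above is a classic token bucket. A minimal sketch, with illustrative class and parameter names rather than Sylica's actual internals:

```ts
// Token bucket: a fixed-capacity bucket refills at a steady rate;
// each request removes a token, and an empty bucket means 429.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // burst size
    private refillPerSec: number, // sustained rate
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request is admitted, false if it should be rejected.
  tryRemove(cost = 1, now: number = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// One bucket per (key, model) pair.
const buckets = new Map<string, TokenBucket>();
function admit(apiKey: string, model: string): boolean {
  const id = `${apiKey}:${model}`;
  let b = buckets.get(id);
  if (!b) buckets.set(id, (b = new TokenBucket(10, 2)));
  return b.tryRemove();
}
```

Scoping one bucket to each (key, model) pair means a burst against one model cannot starve a key's traffic to every other model.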
OpenTelemetry
Every request emits spans with provider, TTFB, and cost. Export to your own backend.
The router is a small, legible scoring function.
Every eligible model gets a score from cost, latency, and live health. The top score wins. If it fails before the first byte, the next score wins -- automatically, on the same stream.
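The scoring idea might look like this in TypeScript. The weights and the stats shape are assumptions for illustration, not the production function:

```ts
// Hypothetical per-model stats fed into the router.
interface ModelStats {
  name: string;
  costPer1M: number;   // blended $ per 1M tokens
  p50LatencyMs: number;
  healthy: boolean;    // live health-check result
}

// Cheaper and faster score higher; the weights here are illustrative.
function score(m: ModelStats): number {
  return 1 / (m.costPer1M * 0.5 + m.p50LatencyMs * 0.01);
}

// Returns eligible models best-first, so a failure before the first
// byte can fall through to the next candidate on the same stream.
function rank(models: ModelStats[]): ModelStats[] {
  return models
    .filter((m) => m.healthy)
    .sort((a, b) => score(b) - score(a));
}
```

Returning the full ranked list rather than a single winner is what makes the automatic failover cheap: the fallback order is already decided before the first attempt is made.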
- Routing: 32 ms
- TTFB: 180 ms
- Total: 2.4 s
Every token is accounted for.
Per-key usage, per-model breakdowns, live p50/p95 latency, and spend -- as a dashboard, an API, or OTLP spans.
- Request log: Every call is stored for 30 days with model, provider, tokens, cost, and latency.
- Usage API: GET /v1/usage returns JSON aggregates. Stream to your data warehouse on a cron.
- OpenTelemetry: Enable OTEL_EXPORTER_OTLP_ENDPOINT and Sylica will emit traces and metrics alongside your own.
- Alerting: Optional webhooks on spend thresholds, error spikes, or per-key rate-limit hits.
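A cron job against the usage API might look like the sketch below. The row shape (model, tokens, costUsd) is an assumption for illustration, not the documented schema:

```ts
// Hypothetical shape of one row from GET /v1/usage.
interface UsageRow {
  model: string;
  tokens: number;
  costUsd: number;
}

// Roll rows up into spend per model before shipping them to a warehouse.
function spendByModel(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of rows) {
    totals.set(r.model, (totals.get(r.model) ?? 0) + r.costUsd);
  }
  return totals;
}

// Pull aggregates on a schedule; uses the global fetch in Node 18+.
async function pullUsage(apiKey: string): Promise<UsageRow[]> {
  const res = await fetch("https://api.sylicaai.com/v1/usage", {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  return (await res.json()) as UsageRow[];
}
```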
BYOK, with real encryption -- not just a claim.
Provider keys are sealed with AES-256-GCM using a master key held outside Postgres, decrypted only on the request path, and never logged.
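The sealing scheme can be sketched with Node's built-in crypto module. Key handling is simplified here: the real master key would come from a KMS or environment outside the database, not be generated inline, and this layout (iv | tag | ciphertext) is one common convention, not necessarily Sylica's:

```ts
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// 256-bit master key -- generated inline only for this sketch.
const masterKey = randomBytes(32);

// Encrypt a provider key with AES-256-GCM; GCM authenticates as well
// as encrypts, so tampering with the stored blob is detected on open.
function seal(plaintext: string): Buffer {
  const iv = randomBytes(12); // unique nonce per encryption
  const cipher = createCipheriv("aes-256-gcm", masterKey, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ct]); // iv | tag | ciphertext
}

// Decrypt on the request path only; the plaintext is never persisted or logged.
function open(sealed: Buffer): string {
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const ct = sealed.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", masterKey, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}
```

Because the master key never touches Postgres, a database dump alone yields only opaque blobs.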
Ten of the 30+ models we serve today.
Prices are per 1M tokens. The full list -- including context windows, tool-use, and vision flags -- lives in the dashboard.
| Model | Context | Input / 1M | Output / 1M | Traits |
|---|---|---|---|---|
| openai/gpt-5 | 400k | $2.50 | $10.00 | reasoning, tools, vision |
| openai/gpt-4.1 | 1,047,576 | $2.00 | $8.00 | tools, vision |
| openai/o3 | 200k | $2.00 | $8.00 | reasoning, tools, vision |
| anthropic/claude-opus-4.5 | 200k | $5.00 | $25.00 | reasoning, tools, vision |
| anthropic/claude-sonnet-4.5 | 200k | $3.00 | $15.00 | reasoning, tools, vision |
| anthropic/claude-haiku-4.5 | 200k | $1.00 | $5.00 | tools, vision |
| xai/grok-4 | 256k | $3.00 | $15.00 | reasoning, tools, vision |
| xai/grok-4-fast | 2,000k | $0.20 | $0.50 | tools, vision |
| google/gemini-2.5-pro | 1,048,576 | $1.25 | $10.00 | reasoning, tools, vision |
| google/gemini-2.5-flash | 1,048,576 | $0.30 | $2.50 | reasoning, tools, vision |
Two ways to pay. Neither involves a sales call.
Prepaid credits settle at published per-token rates. BYOK passes through at 0%.