architecture, reliability

How Fallback Routing Works in Sylica

A practical walkthrough of how Sylica handles retryable failures and preserves streaming semantics.

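Fallbacks are only safe before any data has been emitted to the caller. Once the first token has been streamed, swapping providers would break response continuity, because a different provider's output would not pick up where the already-sent tokens left off.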
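To make the commit point concrete, here is a minimal Python sketch of the rule above. It is not Sylica's actual code: `RetryableError`, `stream_with_fallback`, and the idea of modelling each provider as a zero-argument callable that returns a token stream are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class RetryableError(Exception):
    """Hypothetical marker for failures that are safe to retry elsewhere."""


def stream_with_fallback(
    providers: List[Callable[[], Iterable[str]]],
) -> Iterator[str]:
    """Yield tokens from the first provider that produces a first token.

    Fallback is attempted only while nothing has been sent to the caller.
    """
    last_error: Optional[Exception] = None
    for open_stream in providers:
        try:
            stream = iter(open_stream())
            first = next(stream)  # nothing has reached the caller yet
        except RetryableError as exc:
            last_error = exc
            continue  # still safe: try the next provider
        except StopIteration:
            return  # provider succeeded with an empty response
        # Commit point: after this yield, the caller has seen data,
        # so any later error propagates instead of triggering a fallback.
        yield first
        yield from stream
        return
    raise last_error or RuntimeError("no providers configured")
```

The key design choice in this sketch is that only the connection and the first token sit inside the `try` block. A failure after the first `yield` surfaces to the caller as an error rather than silently splicing in output from another provider.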

Sylica decides whether a failure is retryable from the upstream HTTP status and the error class. Timeouts, 429 responses, and most 5xx failures are treated as retryable.
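A classifier along these lines could look like the sketch below. The function name, the `"timeout"` error-class string, and especially the set of excluded 5xx codes are assumptions: the post says only that "most" 5xx failures are retryable, so 501 and 505 are used here purely as placeholders.

```python
from typing import Optional

# Assumption: which 5xx codes are NOT retryable is not stated in the post.
# 501 (Not Implemented) and 505 (HTTP Version Not Supported) are placeholders.
NON_RETRYABLE_5XX = frozenset({501, 505})


def is_retryable(status: Optional[int], error_class: Optional[str] = None) -> bool:
    """Return True if a failed upstream attempt may be retried elsewhere."""
    if error_class == "timeout":
        return True  # timeouts are retryable regardless of status
    if status is None:
        return False  # unknown failure with no status: do not retry
    if status == 429:
        return True  # provider is throttling
    if 500 <= status <= 599:
        return status not in NON_RETRYABLE_5XX
    return False  # other 4xx and any non-error status
```

For example, `is_retryable(503)` and `is_retryable(None, "timeout")` both return `True`, while `is_retryable(400)` returns `False`.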

Routing attempts are precomputed from the model policy and provider constraints, so the candidate order is fixed before the first attempt is made. The first healthy candidate is selected, and the remaining candidates stay warm for quick failover.
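As a rough illustration of precomputing the attempt list, the sketch below filters a policy-ordered list of providers by health and by one example constraint. The `Candidate` fields, the `plan_attempts` function, and the streaming constraint are hypothetical. How Sylica tracks health or keeps standby candidates warm is not described here, so the sketch does not model it.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view of one provider for a given model."""
    provider: str
    healthy: bool
    supports_streaming: bool


def plan_attempts(
    policy_order: Sequence[str],
    candidates: Sequence[Candidate],
    need_streaming: bool,
) -> List[Candidate]:
    """Return eligible candidates in policy order, computed up front."""
    by_name = {c.provider: c for c in candidates}
    plan: List[Candidate] = []
    for name in policy_order:
        candidate = by_name.get(name)
        if candidate is None or not candidate.healthy:
            continue  # unknown or unhealthy providers are skipped
        if need_streaming and not candidate.supports_streaming:
            continue  # example provider constraint
        plan.append(candidate)
    return plan
```

With a plan in hand, `plan[0]` would be the selected candidate and `plan[1:]` the failover order. Because the list is built once per request, the order of attempts does not change partway through a failure.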

This approach keeps behavior predictable while still improving reliability under provider throttling or transient outages.