This section is based on public statements from Synthetic staff and community observations. Details may change as Synthetic adds hardware or changes providers.
Synthetic runs on a mix of B200 (Blackwell) and H200 (Hopper) GPUs from multiple providers. As of April 2026, they have onboarded a 4th GPU provider after a series of outages.
A single B200 node costs roughly $400k. GLM-5/5.1 requires 2×B200s per replica (4 GPUs at tp4, i.e. tensor parallelism of 4). Break-even on a B200 pair is roughly 400 subscribers at $30/mo.
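The break-even figure above can be turned into a rough monthly number. The sketch below only restates the arithmetic from the text (400 subscribers at $30/mo); the function names are hypothetical, and it does not model rental rates, utilization, or any other cost that Synthetic has not published.

```python
import math


def implied_monthly_cost(subscribers: int, price_per_sub_usd: float) -> float:
    """Monthly revenue at break-even, i.e. the monthly cost it has to cover."""
    return subscribers * price_per_sub_usd


def break_even_subscribers(monthly_cost_usd: float, price_per_sub_usd: float) -> int:
    """Smallest whole number of subscribers whose fees cover the monthly cost."""
    if price_per_sub_usd <= 0:
        raise ValueError("price_per_sub_usd must be positive")
    return math.ceil(monthly_cost_usd / price_per_sub_usd)


# Figures from the text: ~400 subscribers at $30/mo per B200 pair.
pair_monthly_cost = implied_monthly_cost(400, 30.0)
print(f"Implied monthly cost per B200 pair: ${pair_monthly_cost:,.0f}")

# Hypothetical what-if: the same cost at a different subscription price.
print(break_even_subscribers(pair_monthly_cost, 20.0))
```

Read this way, the quoted break-even implies about $12,000 per month per B200 pair, whatever cost basis is behind that number.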
Synthetic uses both SGLang and vLLM depending on the model:
Both engines have bugs that Synthetic regularly patches. The main differentiators for Synthetic are:
Synthetic’s approach is to run standard NVFP4/FP8 quants on standard inference stacks with targeted patches, rather than the more aggressive cost-cutting tricks that other inference providers use. As matt put it: “Other companies try too-fancy stuff to cut costs.”
Synthetic uses Eagle3 speculative decoding for Kimi K2.5:
Synthetic uses a mix of self-hosted and proxied models:
General rule: Synthetic self-hosts newer/frontier models and proxies older models. When a newer model replaces an older one, the older model is typically moved to a proxy. How long a proxy stays available depends on load: users usually switch to the newer model quickly, so load on the older one is low and the proxy can stay around for a while.
Proxied models pass through the price Synthetic pays the underlying inference provider, which may differ from self-hosted pricing.
As of April 2026, Synthetic uses 4 GPU providers. Previously, all models were on reserved GPUs from TogetherAI. The provider landscape has been unstable: in mid-April 2026, all 3 original providers experienced simultaneous outages.
Synthetic has experimented with sharding models across multiple nodes:
This is particularly relevant for fine-grained MoE models where KV cache is the bottleneck.
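To illustrate why spreading a model over more GPUs can relieve a KV-cache bottleneck, here is a back-of-the-envelope sketch. It assumes standard multi-head or grouped-query attention, where each token stores one key and one value vector per KV head per layer. Models that use compressed attention variants (such as MLA) store less per token, so the formula does not apply to them directly. All model and hardware numbers below are hypothetical placeholders, not Synthetic's actual configuration.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int,
                       head_dim: int, bytes_per_elem: int) -> int:
    """KV-cache bytes per token for standard MHA/GQA (factor 2 = key + value)."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


def kv_token_capacity(num_gpus: int, hbm_bytes_per_gpu: int,
                      weight_bytes_total: int, per_token_bytes: int) -> int:
    """Tokens of KV cache that fit once the weights are loaded.

    Simplification: ignores activations, fragmentation, and runtime overhead.
    """
    free_bytes = num_gpus * hbm_bytes_per_gpu - weight_bytes_total
    if free_bytes <= 0:
        return 0
    return free_bytes // per_token_bytes


GB = 10**9
# Hypothetical model: 60 layers, 8 KV heads, head_dim 128, 2-byte KV entries.
per_token = kv_bytes_per_token(60, 8, 128, 2)
# Hypothetical hardware: 180 GB usable per GPU, 400 GB of weights in total.
for gpus in (4, 8):
    tokens = kv_token_capacity(gpus, 180 * GB, 400 * GB, per_token)
    print(f"{gpus} GPUs -> room for about {tokens:,} cached tokens")
```

Because the weights are a fixed cost, doubling the GPU count more than doubles the memory left over for KV cache in this toy model, which is the intuition behind sharding across nodes.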
Synbad is Synthetic’s internal testing tool that runs payloads against inference engines to detect bugs. It doubles as a repository of test payloads for validating new SGLang/vLLM releases, since many bugs are payload-specific. Users can also run Synbad Proxy to intercept and capture failing payloads for debugging.
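The general idea of replaying saved payloads against an engine can be sketched as follows. This is not Synbad's code or API: the function names, the checks, and the assumption that responses follow the OpenAI-style chat completion shape (`choices[0].message`) are all illustrative. The `send` callable is injected so the same harness could target any engine or release.

```python
import json
from typing import Callable, Dict, List

Payload = dict
Response = dict


def check_response(resp: Response) -> List[str]:
    """Return a list of problems found in an OpenAI-style chat response."""
    choices = resp.get("choices")
    if not isinstance(choices, list) or not choices:
        return ["missing or empty 'choices'"]
    problems = []
    msg = choices[0].get("message") or {}
    if not msg.get("content") and not msg.get("tool_calls"):
        problems.append("empty message: no content and no tool_calls")
    for call in msg.get("tool_calls") or []:
        args = call.get("function", {}).get("arguments", "")
        try:
            json.loads(args)  # tool-call arguments should be a JSON string
        except (TypeError, ValueError):
            problems.append("tool call arguments are not valid JSON")
    return problems


def run_payloads(payloads: Dict[str, Payload],
                 send: Callable[[Payload], Response]) -> Dict[str, List[str]]:
    """Replay each named payload and collect only the ones that fail."""
    failures = {}
    for name, payload in payloads.items():
        try:
            problems = check_response(send(payload))
        except Exception as exc:  # transport or engine error
            problems = [f"request failed: {exc}"]
        if problems:
            failures[name] = problems
    return failures
```

Keeping the payloads keyed by name makes it easy to rerun the same set after an engine upgrade and compare which ones regressed.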