Differences
This shows you the differences between two versions of the page.
| limits [2026/04/10 01:06] – xenolandscapes | limits [2026/04/19 22:16] (current) – Added rate-limit evolution and v3 experiment details gwyntel | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ===== Synthetic | + | ===== Synthetic Limits and Pricing ===== |
| ==== Subscription Pricing ==== | ==== Subscription Pricing ==== | ||
| Line 7: | Line 7: | ||
| **A base subscription is $30/mo**, and corresponds to one " | **A base subscription is $30/mo**, and corresponds to one " | ||
| - | This subscription provides a fix set of **per-hour requests** | + | This subscription provides a fixed set of **5-hour requests**, **weekly token limits**, and **concurrency**. |
| - | **If you add a '' | + | **If you add a '' |
| **You can do this for up to 5 packs in total.** | **You can do this for up to 5 packs in total.** | ||
| + | |||
| + | === Founder' | ||
| + | |||
| + | <WRAP center round info> | ||
| + | **Founder' | ||
| + | </ | ||
| ==== Subscription Limits ==== | ==== Subscription Limits ==== | ||
| Line 20: | Line 26: | ||
| | **5-Hour Requests** | 500 requests (weighted by output token cost) | 5% every 3 minutes | Prevents hardware/ | | **5-Hour Requests** | 500 requests (weighted by output token cost) | 5% every 3 minutes | Prevents hardware/ | ||
| | **Weekly Tokens** | $24.00 worth of compute (uses [[https:// | | **Weekly Tokens** | $24.00 worth of compute (uses [[https:// | ||
| + | | **Concurrency** | 1 at a time (2 for "small models: Nemotron 3 Super & GLM 4.7 Flash" | ||
| | **Price** | $30.00 / month | N/A | Subsidizes "power users" | | | **Price** | $30.00 / month | N/A | Subsidizes "power users" | | ||
| - | These limits deplete and recharge like a stamina or mana bar in a video game, instead of emptying out completely and then only refreshing at the end of the time limit, to avoid completely locking you out. The percentage each recharges by is calibrated to let a steady, not-too-small trickle of work continue to get done just based on the recharge increment if you're truly desperate. | + | Founder' |
| + | |||
| + | These limits deplete and recharge like a stamina or mana bar in a video game, instead of emptying out completely and only refreshing at the end of the time limit, to avoid completely locking you out. The percentage each recharges by is calibrated to let a steady, not-too-small trickle of work continue to get done just based on the recharge increment if you're truly desperate. | ||
| <WRAP tip center round> | <WRAP tip center round> | ||
| Line 30: | Line 39: | ||
| The purpose of each of these bars is different: | The purpose of each of these bars is different: | ||
| - | - The hourly | + | - The 5-hour |
| - The weekly limit is designed to ensure that nobody is able to use so much compute — which costs money, via electricity prices — that their subscription becomes //too// unprofitable for Synthetic. | - The weekly limit is designed to ensure that nobody is able to use so much compute — which costs money, via electricity prices — that their subscription becomes //too// unprofitable for Synthetic. | ||
| + | |||
| + | ==== History: Rate Limit Changes ==== | ||
| + | |||
| + | <WRAP center round> | ||
| + | Synthetic overhauled rate limits in mid-April 2026 after a 3-week opt-in experiment. Key changes: | ||
| + | </ | ||
| + | |||
| + | - **Replaced daily tool call quota with weekly token-based quota** — the old system counted tool calls separately (500/day), which meant you could exhaust your tool calls while still having regular request budget unused. Token-based counting is more flexible and fairer: less load = less limiting, more load = more limiting. | ||
| + | - **5-hour request limit increased from 135 to 500 per pack** — previously max ~1,148 requests/ | ||
| + | - **Weekly token quota introduced at $24/week per pack** — guaranteed to always be better value than PAYG API pricing. Previously, some usage patterns were actually cheaper on PAYG than subscription. | ||
| + | - **Continuous regeneration** — hitting your weekly quota doesn' | ||
| + | |||
| + | ==== Cache Hit Reporting ==== | ||
| + | |||
| + | <WRAP info round> | ||
| + | Cache hits and misses are now reported in API responses using the standard OpenAI and Anthropic response formats. This lets you see exactly how many of your input tokens were served from cache (and thus discounted 80%). | ||
| + | </ | ||
| + | |||
| + | The 80% cache-read discount on the weekly token quota is **subscription-only for now**. Synthetic has stated they plan to roll cache discounting out to PAYG in the future, but currently PAYG users pay full price for all tokens, including cache hits. | ||
| + | |||
| + | See [[: | ||
| + | |||
| + | ==== Request cost scaling ==== | ||
| + | Synthetic scales the " | ||
| ==== Profitability ==== | ==== Profitability ==== | ||
| Line 60: | Line 93: | ||
| * [[https:// | * [[https:// | ||
| - | * [[https:// | + | * [[https:// |
| * [[https:// | * [[https:// | ||
| + | |||
| + | ===== Additional Limits ===== | ||
| + | |||
| + | Synthetic also provides // | ||
| + | |||
| + | |||
| + | ===== Rate-Limit Evolution ===== | ||
| + | |||
| + | <WRAP center round info 60%> | ||
| + | Based on public statements from Synthetic staff. The rate-limiting system has gone through several iterations, each prompted by a way the previous one was abused. | ||
| + | </ | ||
| + | |||
| + | ^ Phase ^ System ^ Problem ^ | ||
| + | | **v1** | X requests per 5 hours + free tool calls | Users formatted any request as a tool call to get free requests. 3 users consumed >1/3 of total capacity. | | ||
| + | | **v2** | Tool calls count as a percentage of a request (e.g. 10%) | Percentage-based discount could still be abused for ~10x the quota. | | ||
| + | | **v3** (current) | Weekly token quota ($24/week per pack) + 500 requests per 5 hours | Token-based weekly limit eliminates tool call abuse. Requests weighted by output token cost. | | ||
| + | |||
| + | The **rate-limit-v3 experiment** launched on April 7, 2026 after three weeks of opt-in testing. Key changes: | ||
| + | |||
| + | - **5-hour requests**: 500 per pack (up from 135), weighted by output token cost | ||
| + | - **Weekly tokens**: $24.00 worth of compute per pack (replaces daily tool call limits) | ||
| + | - **Tool calls**: No longer separately counted — all usage flows through the weekly token quota | ||
| + | - **Founder' | ||
| + | - **Concurrency**: | ||
| + | |||
| + | <WRAP center round tip> | ||
| + | The weekly token quota means you don't need to think about " | ||
| + | </ | ||
| + | |||
| + | === Why Request-Based, | ||
| + | |||
| + | Synthetic chose request-based limits over pure token-based limits for simplicity: | ||
| + | |||
| + | - Token-based pricing encourages gaming (deleting conversation history to save quota, splitting context) | ||
| + | - Request count follows a predictable pattern relative to cost | ||
| + | - With the weekly token quota, the worst drawbacks of both approaches are mitigated | ||
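
The request-based side of the system can be illustrated with a small model of the 5-hour request bar. This is a minimal sketch, not Synthetic's implementation: it takes the figures quoted in the page (500 requests per pack, 5% recharge every 3 minutes, at most 5 packs) and assumes the recharge arrives in discrete ticks. The names `RegenBar` and `request_bar` are hypothetical, and the per-request weight is left to the caller because the page does not give the weighting formula.

```python
from dataclasses import dataclass, field

# Figures quoted in the page; everything else here is an illustrative assumption.
REQUESTS_PER_PACK = 500   # 5-hour request limit per pack
STEP_PERCENT = 5          # "5% every 3 minutes"
TICK_SECONDS = 3 * 60
MAX_PACKS = 5             # "up to 5 packs in total"


@dataclass
class RegenBar:
    """A quota that drains on use and refills in fixed steps, like a stamina bar."""
    capacity: float
    step_percent: float
    tick_seconds: float
    level: float = field(init=False)
    last_tick: float = 0.0

    def __post_init__(self) -> None:
        # The bar starts full.
        self.level = self.capacity

    def refill(self, now: float) -> None:
        """Apply every whole recharge tick that has elapsed since the last one."""
        ticks = int((now - self.last_tick) // self.tick_seconds)
        if ticks <= 0:
            return
        gained = ticks * self.capacity * self.step_percent / 100
        self.level = min(self.capacity, self.level + gained)
        self.last_tick += ticks * self.tick_seconds

    def try_spend(self, weight: float, now: float) -> bool:
        """Spend `weight` request units if available; return whether it succeeded."""
        self.refill(now)
        if weight > self.level:
            return False
        self.level -= weight
        return True


def request_bar(packs: int = 1) -> RegenBar:
    """Build the request bar for a subscription with `packs` packs (hypothetical helper)."""
    if not 1 <= packs <= MAX_PACKS:
        raise ValueError(f"packs must be between 1 and {MAX_PACKS}")
    return RegenBar(
        capacity=REQUESTS_PER_PACK * packs,
        step_percent=STEP_PERCENT,
        tick_seconds=TICK_SECONDS,
    )


bar = request_bar(packs=1)
bar.try_spend(500, now=0)     # drains the bar completely
bar.try_spend(1, now=179)     # rejected: no recharge tick has elapsed yet
bar.try_spend(25, now=180)    # accepted: one tick restored 5% of 500 = 25 units
```

Under these assumptions an emptied single-pack bar regains 25 request units every 3 minutes, which is the "steady trickle" the page describes. The weekly token quota would need a second bar of the same shape, but its recharge rate is not recoverable from this revision, so it is left out of the sketch.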