Organic

limits [2026/04/19 05:05] – update limits with payg caveat (gwyntel)
limits [2026/04/19 22:16] (current) – Added rate-limit evolution and v3 experiment details (gwyntel)
Line 101:
 Synthetic also provide //unlimited// access to [[models:nomic-embed-text-15|Nomic Embed Text 1.5]], an embedding model, and two LoRAs they trained for [[harnesses/octofriend]], ''[[https://huggingface.co/syntheticlab/diff-apply|hf:syntheticlab/diff-apply]]'' and ''[[https://huggingface.co/syntheticlab/fix-json|hf:syntheticlab/fix-json]]''. Like any model from Synthetic, they're usable in any harness. ''diff-apply'' applies find-replace style diffs to code and ''fix-json'' fixes broken JSON tool calls. They require particular input/output formats, so be sure to check the model cards.
  
 +
 +===== Rate-Limit Evolution =====
 +
 +<WRAP center round info 60%>
 +Based on public statements from Synthetic staff. The rate-limiting system has gone through several iterations, each one a response to abuse of the previous scheme.
 +</WRAP>
 +
 +^ Phase ^ System ^ Problem ^
 +| **v1** | X requests per 5 hours + free tool calls | Users formatted any request as a tool call to get free requests. Three users consumed more than a third of total capacity. |
 +| **v2** | Each tool call counts as a fraction of a request (e.g. 10%) | The discount could still be abused: formatting every request as a tool call yielded ~10x the quota. |
 +| **v3** (current) | Weekly token quota ($24/week per pack) + 500 requests per 5 hours | The token-based weekly limit eliminates tool call abuse. Requests are weighted by output token cost. |
 +
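The abuse arithmetic behind v1 and v2 can be sketched in a few lines. This is an illustration only: the function name ''max_requests'', the quota of 100, and the idea of a per-user "tool call share" are hypothetical, not Synthetic's actual accounting.

```python
import math


def max_requests(quota: float, tool_call_weight: float, tool_call_share: float) -> float:
    """Requests a user can send before exhausting `quota` (hypothetical model).

    A regular request costs 1 unit of quota; a tool call costs
    `tool_call_weight` units. `tool_call_share` is the fraction of the
    user's traffic that is formatted as tool calls (0.0 to 1.0).
    """
    avg_cost = (1 - tool_call_share) * 1.0 + tool_call_share * tool_call_weight
    # A zero average cost means the quota is never consumed.
    return math.inf if avg_cost == 0 else quota / avg_cost


# v1: tool calls are free, so an all-tool-call user is effectively unlimited.
print(max_requests(100, tool_call_weight=0.0, tool_call_share=1.0))
# v2: tool calls cost 10% of a request, so the same trick yields ~10x the quota.
print(max_requests(100, tool_call_weight=0.1, tool_call_share=1.0))
```

Under this model, any weight below 1 leaves a multiplier of ''1 / weight'' available to users who disguise every request as a tool call, which is why a token-based quota closes the gap.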
 +The **rate-limit-v3 experiment** launched on April 7, 2026 after three weeks of opt-in testing. Key changes:
 +
 +  - **5-hour requests**: 500 per pack (up from 135), weighted by output token cost
 +  - **Weekly tokens**: $24.00 worth of compute per pack (replaces daily tool call limits)
 +  - **Tool calls**: No longer counted separately; all usage flows through the weekly token quota
 +  - **Founder's packs**: 50% more ($36/week tokens, 750 requests/5 hours)
 +  - **Concurrency**: 1 request at a time per pack (2 for "small models": Nemotron 3 Super & GLM-4.7-Flash)
 +
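As a rough sketch of how the per-pack figures above add up for a subscription, the snippet below totals the 5-hour request limit and the weekly token budget. It assumes limits scale linearly with the number of packs, which "per pack" suggests but the statements above do not spell out; ''PackLimits'' and ''total_limits'' are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackLimits:
    requests_per_5h: int  # weighted requests per 5-hour window
    weekly_usd: float     # weekly token quota, in dollars of compute


# Figures taken from the v3 list above.
STANDARD = PackLimits(requests_per_5h=500, weekly_usd=24.0)
FOUNDERS = PackLimits(requests_per_5h=750, weekly_usd=36.0)


def total_limits(standard_packs: int = 0, founders_packs: int = 0) -> PackLimits:
    """Sum limits across packs, ASSUMING they stack linearly."""
    return PackLimits(
        requests_per_5h=standard_packs * STANDARD.requests_per_5h
        + founders_packs * FOUNDERS.requests_per_5h,
        weekly_usd=standard_packs * STANDARD.weekly_usd
        + founders_packs * FOUNDERS.weekly_usd,
    )


print(total_limits(standard_packs=2, founders_packs=1))
```

The Founder's figures are exactly 1.5x the standard ones, matching the "50% more" description.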
 +<WRAP center round tip>
 +The weekly token quota means you don't need to think about "saving" tool calls vs regular requests anymore. Everything is just tokens. Unused 5-hour requests may eventually get "refunded" into the weekly token quota (proposed but not yet confirmed).
 +</WRAP>
 +
 +=== Why Request-Based, Not Token-Based? ===
 +
 +Synthetic chose request-based limits over pure token-based limits for simplicity:
 +
 +  - Token-based pricing encourages gaming (deleting conversation history to save quota, splitting context)
 +  - Request count follows a predictable pattern relative to cost
 +  - Pairing the request limit with the weekly token quota mitigates the worst of both approaches