Why am I burning through my credits or requests so quickly?

Step 1: Check Your Tools

The software you use with Synthetic has a massive impact on your token “burn rate.”

1. Are you using Claude Code?

Recommendation: Use any other agent. We strongly recommend against using Claude Code as a coding harness with Synthetic. Its underlying infrastructure is notoriously inefficient, causing excessive token bloat. Using Claude Code will drain your credits much faster than almost any other option.

2. Are you using OpenCode (with oh-my-opencode/oh-my-openagent)?

While OpenCode is significantly better than Claude Code, it is still not optimized for efficiency. The oh-my-opencode/oh-my-openagent (OMO) extension makes the problem significantly worse: OMO launches unnecessary, poorly designed subagent workflows that bloat every single prompt with redundant context, leading to a “death by a thousand tokens” scenario.

3. Are you using Zed?

Zed is a powerful editor, but its real-time “live-edit” feature comes at a cost. Zed utilizes a two-step process for edits:

  1. Intent to Edit: The main chat (the one you’re talking to) sends a tool call declaring that it wants to edit a file and defining the goal of the edit.
  2. Execution: Zed receives that tool call and runs a separate request with the same chat history to generate the code in a streaming manner (for the live diff to work).

Because of this “intent to edit” system, you are essentially using 2x the input tokens and 2x the requests for every single edit. If this workflow is necessary for you, you may need more packs to sustain it.
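To see what that doubling means in practice, here is a rough back-of-the-envelope calculation. The token counts are hypothetical placeholders; your actual chat history will vary:

```python
# Rough illustration of the cost of Zed's two-step edit flow.
# All numbers are hypothetical; real token counts depend on your session.

history_tokens = 20_000   # tokens in the current chat history
edits = 10                # edits made during a session

# Single-step harness: one request per edit, history sent once each time
single_step_input = history_tokens * edits

# Two-step flow: an "intent" request plus an "execution" request,
# each resending the same chat history
two_step_input = history_tokens * edits * 2

print(single_step_input)                   # 200000
print(two_step_input)                      # 400000
print(two_step_input / single_step_input)  # 2.0
```

The multiplier compounds with chat length: the longer the conversation, the more tokens each duplicated request resends.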

4. Are you using OpenClaw?

OpenClaw (and similar harnesses) run prompts on cron jobs and regularly scheduled heartbeats, which can consume tokens in the background without any action on your part. These scheduled tasks are often long-horizon, involving many MCPs and tool calls, and the OpenClaw system prompt is not small, so the impact on your usage can be substantial and hard to predict.

Step 2: Optimize Your Workflow

If your tools aren’t known for being wasteful but your usage remains high, follow these steps (roughly in order of recommendation) to reduce token bloat:

1. Reduce Frequency

Reduce the frequency of automated workflow runs. For example, if you use OpenClaw, review your current tasks in OpenClaw’s “heartbeat” function and increase the interval between checks.
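The savings from longer intervals are easy to estimate. A quick sketch, assuming a hypothetical per-run overhead (your real heartbeat cost will depend on your system prompt, MCPs, and tool calls):

```python
# How heartbeat interval affects daily token spend.
# tokens_per_run is a hypothetical placeholder for prompt + tool-call overhead.

tokens_per_run = 8_000

def daily_tokens(interval_minutes: int) -> int:
    """Total tokens consumed per day at a given heartbeat interval."""
    runs_per_day = 24 * 60 // interval_minutes
    return runs_per_day * tokens_per_run

print(daily_tokens(5))    # every 5 minutes -> 2304000
print(daily_tokens(60))   # hourly          -> 192000
```

Going from a 5-minute to an hourly heartbeat cuts background usage by 12x without changing what each run does.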

2. Prompt Efficiency

Refine your input tokens. Use concise system prompts and AGENTS.md files.

3. Model Efficiency

Switch to a cheaper model for simpler tasks; Kimi and GLM don’t need to be running for every single prompt.
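One way to make this habitual is a simple routing rule in whatever glue code drives your requests. The sketch below is an assumption, not a Synthetic feature; the keyword heuristic and model names are illustrative:

```python
# Hypothetical model-routing sketch: send simple tasks to a cheaper model.
# The routing heuristic here is deliberately naive and illustrative only.

CHEAP_MODEL = "minimax"
EXPENSIVE_MODEL = "kimi"

def pick_model(task: str) -> str:
    # Short, mechanical tasks go to the cheaper model;
    # everything else gets the expensive one.
    simple_keywords = ("rename", "format", "typo", "summarize")
    if any(kw in task.lower() for kw in simple_keywords):
        return CHEAP_MODEL
    return EXPENSIVE_MODEL

print(pick_model("fix a typo in README"))    # minimax
print(pick_model("redesign the auth flow"))  # kimi
```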

4. Serial Model Orchestration

Let your top level chat be with a more expensive model like Kimi or GLM, but have it orchestrate subagents in series (not in parallel) using cheaper models like MiniMax or even Nemotron to execute scoped, specified-in-detail tasks such as editing code or learning about the codebase.

This gives you the superior planning, problem-solving, project understanding, and code-review capabilities of the better model, while avoiding the cost of having it grep around your codebase or iterate on a piece of code until the compiler, linter, or test suite is satisfied. The bigger model remains available for problems the smaller ones can’t handle.
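The orchestration pattern above can be sketched as a plain loop. Here `call_model` is a hypothetical stand-in for your provider’s chat API; the point is the structure, with the expensive model planning and reviewing once while the cheap model executes each step in series:

```python
# Sketch of serial (not parallel) subagent orchestration.
# call_model is a hypothetical placeholder for a real chat-completion API.

def call_model(model: str, prompt: str) -> str:
    # Placeholder: in practice this would call your inference endpoint.
    return f"[{model}] {prompt[:40]}"

def orchestrate(task: str) -> str:
    # 1. The expensive model plans and scopes the work once.
    plan = call_model("kimi", f"Break this task into small steps: {task}")

    # 2. A cheap model executes the steps one at a time, in series,
    #    so the result of step N can inform step N+1.
    results = []
    for step in plan.splitlines():
        results.append(call_model("minimax", f"Execute: {step}"))

    # 3. The expensive model reviews the combined output once.
    return call_model("kimi", "Review these results: " + "\n".join(results))

print(orchestrate("add input validation to the signup form"))
```

Running the steps serially rather than in parallel also avoids fanning out the full context to several subagents at once, which is where parallel orchestration burns tokens fastest.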

5. Limit “Thinking”

For less complex tasks, reduce the model’s budget for thinking. This forces the model to be more direct and prevents it from using tokens on unnecessary internal reasoning.
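How you cap thinking depends on your harness and API. As a purely illustrative example, a request payload might carry a reasoning cap like the one below; the field name "reasoning_budget" is a placeholder, so check your provider’s documentation for the actual parameter:

```python
# Illustrative request payload with a capped reasoning budget.
# "reasoning_budget" is a hypothetical field name, not a documented parameter.

import json

request = {
    "model": "glm",
    "messages": [
        {"role": "user", "content": "Rename fooBar to foo_bar everywhere."}
    ],
    # Cap internal reasoning for a mechanical task like this one
    "reasoning_budget": 512,
}

print(json.dumps(request, indent=2))
```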

6. Increase Packs

If your workflow is already lean but you still hit limits, it may be time to upgrade your number of packs to match your professional output.