Context Budget
The context budget is the finite working memory available to a model during a task. It is consumed by the system prompt, user messages, tool schemas, tool call results, and prior conversation turns. When the budget is exhausted, the model can no longer access earlier work. If the budget is poorly managed, quality degrades long before it runs out.
Why it’s both a cost and a quality concern
Token management is reasoning quality management. A context window filled with stale conversation history, redundant tool outputs, and verbose system instructions becomes noisy, and the model must attend to an increasingly diluted signal. This is context rot: the decline in reasoning quality caused by accumulated irrelevant history.
Context rot is silent. The model continues producing responses, but their relevance and accuracy drift as the signal-to-noise ratio falls.
Consumed by
- System prompt and persistent instructions
- File contents loaded into context
- Tool schemas (can be large if many tools are active)
- Tool call results returned to the model
- Prior conversation turns and intermediate outputs
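The categories above can be tracked with a small accounting helper. The sketch below is illustrative only: `ContextBudget`, `estimate_tokens`, and the four-characters-per-token ratio are assumptions made for this example, not part of any SDK, and a real workflow would use the model's own tokenizer for counts.

```python
from dataclasses import dataclass, field

CHARS_PER_TOKEN = 4  # assumption: crude heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate; swap in the model's tokenizer for real counts."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


@dataclass
class ContextBudget:
    """Hypothetical tracker of token use per context category."""

    limit: int  # assumed window size in tokens
    usage: dict[str, int] = field(default_factory=dict)

    def add(self, category: str, text: str) -> None:
        # Accumulate the estimated cost of this text under its category.
        self.usage[category] = self.usage.get(category, 0) + estimate_tokens(text)

    @property
    def used(self) -> int:
        return sum(self.usage.values())

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def report(self) -> list[tuple[str, int]]:
        """Categories sorted by token use, largest first."""
        return sorted(self.usage.items(), key=lambda kv: kv[1], reverse=True)


budget = ContextBudget(limit=1000)
budget.add("system_prompt", "x" * 400)
budget.add("tool_results", "y" * 2000)
top_category = budget.report()[0][0]
```

Sorting the report by size makes the largest consumer visible first, which is usually where trimming pays off.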
Management practices
Before the task:
- Write lean system prompts — avoid encyclopedic instructions
- Activate only tools needed for this workflow
- Limit files to the smallest folder containing the relevant work
During the task:
- Use plan mode for complex or risky work before running actions
- Insert checkpoints to verify direction before continuing
- Watch for signs of context rot (unexpected direction changes, repetition)
After a task segment:
- Use /clear or equivalent to reset context when starting different work
- Convert repeatable instructions into reusable skills rather than repeating them each session
- Store session IDs and results externally — do not rely on context as the only record
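One way to keep a record outside the context is an append-only log file. This is a minimal sketch under stated assumptions: the function names, the JSON Lines layout, and the field names (`session_id`, `summary`, `saved_at`) are hypothetical choices for illustration, not a format defined by any tool mentioned here.

```python
import json
import time
from pathlib import Path


def record_result(log_path: Path, session_id: str, summary: str) -> None:
    """Append one task-segment record as a single JSON line."""
    entry = {"session_id": session_id, "summary": summary, "saved_at": time.time()}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def load_results(log_path: Path) -> list[dict]:
    """Read all records back, for example to seed a fresh context after a reset."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `record_result(Path("sessions.jsonl"), "sess-1", "refactored parser")` before a reset leaves a durable trace that `load_results` can read back later, so nothing depends on the cleared context.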
Prompt caching
When stable context (system prompts, large documents) is reused across multiple calls, prompt caching can reduce cost by letting the provider reuse its processing of those tokens instead of repeating it on every call. This is an optimization for workflows with a large, unchanging context prefix.
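The effect can be illustrated with a toy simulation. This is not a real provider API: `PrefixCacheSim` is a hypothetical class that treats a cache hit as free, whereas real services differ in pricing, minimum prefix length, and cache expiry. It only shows why the prefix has to stay unchanged for the cache to help.

```python
import hashlib


class PrefixCacheSim:
    """Toy model of prefix caching: counts tokens that must be processed."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.processed_tokens = 0

    def call(self, stable_prefix: list[str], suffix: list[str]) -> int:
        """Return tokens processed for this call (tokens are pre-split strings)."""
        joined = "\x00".join(stable_prefix).encode("utf-8")
        key = hashlib.sha256(joined).hexdigest()
        cost = len(suffix)  # the variable part is always processed
        if key not in self._seen:
            self._seen.add(key)
            cost += len(stable_prefix)  # cache miss: prefix processed in full
        self.processed_tokens += cost
        return cost


sim = PrefixCacheSim()
prefix = ["sys"] * 1000
first = sim.call(prefix, ["q1"] * 10)               # miss: prefix plus suffix
second = sim.call(prefix, ["q2"] * 10)              # hit: suffix only
changed = sim.call(prefix + ["edit"], ["q3"] * 10)  # edited prefix misses again
```

In this simplified model a single added token in the prefix forces a full miss, which is why stable material belongs at the front of the context and per-call material at the end.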
Related
- AgentLoop — each loop iteration consumes from the context budget
- ClaudeCowork — the Cowork product where context budget management is a primary operating discipline
- ClaudeAgentSDK — SDK sessions preserve context across loop iterations