Automatic Prompt Token Budgeting
Sidecar automatically reduces prompt size during task prompt compilation to lower token usage while preserving prompt quality.
Beta: Prompt token budgeting is currently in beta and may evolve as we tune trimming behavior and metadata.
What it does
When Sidecar compiles an execution brief (sidecar prompt compile or the task execution flow), it:
- Estimates prompt size.
- Keeps required execution-critical sections.
- Deduplicates repeated items.
- Trims optional high-volume context sections only when needed.
- Adds concise overflow references (for example: + 12 more ... (see task packet for full list)).
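The deduplication and overflow-reference steps above can be sketched in a few lines of Python. This is only an illustration, not Sidecar's implementation: the function names `dedupe` and `trim_with_overflow` and the `keep` parameter are hypothetical, and the overflow line simply copies the wording of the example above.

```python
def dedupe(items):
    """Drop repeated items while keeping first-seen order."""
    return list(dict.fromkeys(items))


def trim_with_overflow(items, keep):
    """Keep the first `keep` unique items and summarize the rest in one line.

    Hypothetical helper: `keep` is an illustrative parameter, not a
    documented Sidecar setting.
    """
    unique = dedupe(items)
    if len(unique) <= keep:
        return unique  # nothing to trim
    hidden = len(unique) - keep
    overflow = f"+ {hidden} more ... (see task packet for full list)"
    return unique[:keep] + [overflow]


files = ["a.py", "b.py", "a.py", "c.py", "d.py"]
trimmed = trim_with_overflow(files, keep=2)
print(trimmed)
```

With these inputs, the duplicate a.py is dropped first, two entries are kept, and the remaining two are collapsed into a single overflow line.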
Required sections always preserved
- Task
- Objective
- Why this matters
- Constraints
- Validation
- Definition of done
- Runner guidance
- Final response format
Optional sections that may be trimmed
- In scope
- Out of scope
- Read these first (files_to_read)
- Avoid changing (files_to_avoid)
- Linked decisions
- Linked notes
Budget behavior (current defaults)
- Target prompt budget: 1200 estimated tokens
- Safety ceiling: 1500 estimated tokens
If the baseline prompt is under budget, no trimming happens.
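As a rough illustration of the budget check, the sketch below compares an estimated token count against the two defaults. It rests on assumptions: Sidecar's actual estimator is not described here, so `estimate_tokens` uses a common rule of thumb of about 4 characters per token, and treating a prompt exactly at the target as "under budget" is also a guess. The names `estimate_tokens` and `budget_check` are hypothetical.

```python
import math

TARGET_BUDGET = 1200   # target prompt budget (current default)
SAFETY_CEILING = 1500  # safety ceiling (current default)


def estimate_tokens(text):
    """Assumption: roughly 4 characters per token; not Sidecar's estimator."""
    return math.ceil(len(text) / 4)


def budget_check(text, target=TARGET_BUDGET, ceiling=SAFETY_CEILING):
    """Report the estimate and how it compares to the target and ceiling."""
    tokens = estimate_tokens(text)
    return {
        "estimated_tokens": tokens,
        # Assumption: a prompt at or below the target is left untouched.
        "needs_trimming": tokens > target,
        "above_ceiling": tokens > ceiling,
    }


print(budget_check("word " * 400))   # 2000 characters, about 500 tokens
print(budget_check("word " * 1400))  # 7000 characters, about 1750 tokens
```

The first prompt falls under the target, so no trimming would be triggered in this sketch. The second exceeds both the target and the ceiling.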
Where to see optimization results
Prompt optimization metadata is attached to run records and prompt compile output:
- prompt_tokens_estimated_before
- prompt_tokens_estimated_after
- prompt_budget_target
- prompt_trimmed_sections
Example output (JSON)
{
  "prompt_optimization": {
    "estimated_tokens_before": 5881,
    "estimated_tokens_after": 774,
    "budget_target": 1200,
    "budget_max": 1500,
    "trimmed_sections": [
      "in_scope",
      "out_of_scope",
      "files_to_read",
      "related_decisions",
      "related_notes"
    ]
  }
}
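To show how this metadata might be consumed, here is a small Python sketch that parses the example output and reports the savings. It assumes the JSON shape shown in the example. The `summarize` helper is hypothetical and not part of Sidecar.

```python
import json

# The example output above, embedded as a string for a self-contained demo.
raw = """
{
  "prompt_optimization": {
    "estimated_tokens_before": 5881,
    "estimated_tokens_after": 774,
    "budget_target": 1200,
    "budget_max": 1500,
    "trimmed_sections": ["in_scope", "out_of_scope", "files_to_read",
                         "related_decisions", "related_notes"]
  }
}
"""


def summarize(opt):
    """Hypothetical helper: derive savings from the optimization metadata."""
    before = opt["estimated_tokens_before"]
    after = opt["estimated_tokens_after"]
    saved = before - after
    return {
        "tokens_saved": saved,
        "percent_saved": round(100 * saved / before, 1),
        "within_target": after <= opt["budget_target"],
        "sections_trimmed": len(opt["trimmed_sections"]),
    }


summary = summarize(json.loads(raw)["prompt_optimization"])
print(summary)
```

For the example values, this reports 5107 tokens saved (about 86.8%) across five trimmed sections, with the final prompt inside the 1200-token target.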