Automatic Prompt Token Budgeting

Sidecar automatically reduces prompt size during task prompt compilation to lower token usage while keeping every execution-critical section intact.

Beta: Prompt token budgeting may change as we tune trimming behavior and metadata.

What it does

When Sidecar compiles an execution brief (through sidecar prompt compile or the task execution flow), it:

Estimates prompt size.
Keeps required execution-critical sections.
Deduplicates repeated items.
Trims optional high-volume context sections only when needed.
Adds concise overflow references (for example: + 12 more ... (see task packet for full list)).
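
The deduplication and overflow-reference steps above can be sketched roughly as follows. This is a minimal illustration, not Sidecar's implementation: the function names and the keep parameter are hypothetical, and only the overflow wording is taken from the example above.

```python
def dedupe(items):
    """Drop repeated items while keeping first-seen order."""
    # dict preserves insertion order, so fromkeys() is an order-stable dedupe.
    return list(dict.fromkeys(items))


def truncate_with_overflow(items, keep):
    """Keep the first `keep` unique items and summarize the rest in one line.

    `keep` is a hypothetical per-section limit, not a documented Sidecar setting.
    """
    unique = dedupe(items)
    if len(unique) <= keep:
        return unique
    hidden = len(unique) - keep
    return unique[:keep] + [f"+ {hidden} more ... (see task packet for full list)"]


files = ["a.py", "b.py", "a.py", "c.py", "d.py"]
result = truncate_with_overflow(files, keep=2)
print(result)
```

Deduplicating before truncating means the overflow count reflects distinct items only, so a repeated file is never counted as hidden content.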

Required sections always preserved

Task
Objective
Why this matters
Constraints
Validation
Definition of done
Runner guidance
Final response format

Optional sections that may be trimmed

In scope
Out of scope
Read these first (files_to_read)
Avoid changing (files_to_avoid)
Linked decisions
Linked notes

Budget behavior (current defaults)

Target prompt budget: 1200 estimated tokens
Safety ceiling: 1500 estimated tokens

If the baseline prompt is already under the target budget, no trimming happens.
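
As a rough sketch of how a budget check like this could work, the example below removes optional sections until the estimate fits the target and leaves required sections untouched. Several parts are assumptions made for illustration: the four-characters-per-token estimator, the largest-first removal order, and dropping whole sections rather than shortening them. The function names are hypothetical. Only the 1200 and 1500 defaults and the result key names come from this page.

```python
TARGET_TOKENS = 1200   # target prompt budget (documented default)
CEILING_TOKENS = 1500  # safety ceiling (documented default)


def estimate_tokens(text):
    """Assumed heuristic: about four characters per token, rounded up."""
    return (len(text) + 3) // 4


def trim_to_budget(required, optional, target=TARGET_TOKENS):
    """Drop optional sections until the estimated size fits the target.

    `required` and `optional` map section names to section text.
    Required sections are never removed.
    """
    kept = dict(optional)
    trimmed = []

    def total():
        texts = list(required.values()) + list(kept.values())
        return sum(estimate_tokens(t) for t in texts)

    before = total()
    # Assumed order: remove the largest remaining optional section first.
    while total() > target and kept:
        largest = max(kept, key=lambda name: estimate_tokens(kept[name]))
        del kept[largest]
        trimmed.append(largest)

    return {
        "estimated_tokens_before": before,
        "estimated_tokens_after": total(),
        "budget_target": target,
        "budget_max": CEILING_TOKENS,
        "trimmed_sections": trimmed,
    }


required = {"task": "x" * 400}                               # about 100 tokens
optional = {"in_scope": "y" * 4000, "linked_notes": "z" * 2000}  # about 1000 and 500
report = trim_to_budget(required, optional)
print(report)
```

In this run the baseline estimate is 1600 tokens, so the largest optional section is removed and the result lands at 600. A prompt that starts under the target passes through with an empty trimmed_sections list.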

Where to see optimization results

Prompt optimization metadata is attached to run records and prompt compile output:

prompt_tokens_estimated_before
prompt_tokens_estimated_after
prompt_budget_target
prompt_trimmed_sections

Example output (JSON)

{
  "prompt_optimization": {
    "estimated_tokens_before": 5881,
    "estimated_tokens_after": 774,
    "budget_target": 1200,
    "budget_max": 1500,
    "trimmed_sections": [
      "in_scope",
      "out_of_scope",
      "files_to_read",
      "related_decisions",
      "related_notes"
    ]
  }
}
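
To show how this metadata might be consumed, the sketch below parses the example output above and derives a few summary figures. The summarize helper and its output keys are hypothetical. Only the input keys come from the example.

```python
import json

# The example output from this page, embedded as a string for illustration.
record = json.loads("""
{
  "prompt_optimization": {
    "estimated_tokens_before": 5881,
    "estimated_tokens_after": 774,
    "budget_target": 1200,
    "budget_max": 1500,
    "trimmed_sections": [
      "in_scope",
      "out_of_scope",
      "files_to_read",
      "related_decisions",
      "related_notes"
    ]
  }
}
""")


def summarize(optimization):
    """Derive savings figures from a prompt_optimization object."""
    before = optimization["estimated_tokens_before"]
    after = optimization["estimated_tokens_after"]
    saved = before - after
    return {
        "tokens_saved": saved,
        "percent_saved": round(100 * saved / before, 1),
        "under_target": after <= optimization["budget_target"],
        "sections_trimmed": len(optimization["trimmed_sections"]),
    }


summary = summarize(record["prompt_optimization"])
print(summary)
```

For the example values, this reports 5107 estimated tokens saved across five trimmed sections, with the final prompt under the 1200-token target.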