Runners & Replay

Sidecar's run surface executes tasks against AI coding agents and records a run with the compiled prompt, runner output, changed files, and validation results. The basic form is stable; advanced pipelines and replay ship on the beta channel.


Starting a run (Stable)

Run a task packet by id:

$ sidecar run T-001
$ sidecar run T-001 --runner claude --agent-role builder-app
$ sidecar run T-001 --dry-run

Flags:

  • --runner codex|claude — override the default runner.
  • --agent-role <role> — pass an agent role into the compiled prompt.
  • --dry-run — compile the prompt and create the run record without executing.
  • --json — machine-readable envelope output.

Inspecting runs (Stable)

$ sidecar run list --task T-001
$ sidecar run show R-007
$ sidecar run queue
$ sidecar run start-ready --dry-run

Subcommands:
  • run list — list runs, optionally filtered to a task.
  • run show — full run detail: prompt path, status, lifecycle, validation results, changed files.
  • run queue — the ready-to-run queue from the current project.
  • run start-ready — start every task with status ready.

Dual-runner pipelines (Beta)

Pass a comma-separated list to --runner and Sidecar runs each runner sequentially on the same task, feeding the previous run's summary into the next runner's compiled prompt as linked context.

$ sidecar run T-001 --runner codex,claude
$ sidecar run T-001 --runner codex,claude --agent-role builder-app --dry-run

Each step produces its own run record linked back to the first run via parent_run_id. Use pipelines to pair a planner/builder runner with a reviewer runner, or to compare runners head-to-head on the same task. The CLI prints a one-line summary per step (runner, agent role, run id, status, duration, changed file count) and the full pipeline envelope is available with --json.
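
The chaining described above can be sketched in a few lines of Python. This is an illustration of the idea only, not Sidecar's implementation: RunRecord, run_pipeline, the execute callback, and the R-00N id scheme are all hypothetical names chosen for the example. Only parent_run_id comes from the run record described here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RunRecord:
    run_id: str
    runner: str
    summary: str
    parent_run_id: Optional[str] = None  # None for the first step


def run_pipeline(
    task_id: str,
    runners: List[str],
    execute: Callable[[str, str, Optional[str]], str],
) -> List[RunRecord]:
    """Run each runner in order, passing the previous summary as linked context."""
    records: List[RunRecord] = []
    linked_context: Optional[str] = None
    for step, runner in enumerate(runners, start=1):
        # `execute` stands in for "compile the prompt and run the agent";
        # it returns that step's summary.
        summary = execute(task_id, runner, linked_context)
        records.append(
            RunRecord(
                run_id=f"R-{step:03d}",  # hypothetical id scheme
                runner=runner,
                summary=summary,
                # Every later step links back to the first run, not the previous one.
                parent_run_id=records[0].run_id if records else None,
            )
        )
        linked_context = summary
    return records
```

Calling run_pipeline("T-001", ["codex", "claude"], execute) would give claude's step the codex summary as its linked context and set its parent_run_id to the first run's id.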

Replaying a run (Beta)

When a run finishes, whether ok, blocked, or failed, sidecar run replay starts a fresh run against the same task and links it back to the original. It's the fastest way to retry with a different runner, re-run after fixing a blocker, or fork a green run into an experiment without losing the audit trail.

$ sidecar run replay R-001 --reason "retry with claude after codex blocker"
$ sidecar run replay R-001 --runner claude --agent-role builder-app --edit-prompt
$ sidecar run replay R-001 --reason "dry-run with new validation" --dry-run

Flags:

  • --runner codex|claude — override the parent's runner (defaults to parent's runner).
  • --agent-role <role> — override the parent's agent role.
  • --reason "<text>" — stored on the new run as replay_reason; surfaced in CLI and UI.
  • --edit-prompt — opens the compiled prompt in $VISUAL/$EDITOR before the runner starts.
  • --dry-run — compile the prompt and create the run record without executing the runner.

The new run carries parent_run_id: "R-001" plus your replay_reason. sidecar run show <id> renders the lineage both ways: the parent shows Replayed as: with each child run id, and each child shows Replay of: with the parent id and reason. The UI's Run Detail panel mirrors this with clickable links for one-hop navigation through a lineage.
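
To make the two-way lineage concrete, here is a minimal Python sketch that derives both views from a flat list of run records. The parent_run_id and replay_reason fields and the "Replay of:" / "Replayed as:" labels come from the description above. The "id" key, the render_lineage helper, and the exact line layout are assumptions for illustration and may not match what sidecar run show prints.

```python
from collections import defaultdict
from typing import Dict, List


def render_lineage(runs: List[Dict], run_id: str) -> List[str]:
    """Return the lineage lines for one run: its parent (if any) and its replays."""
    by_id = {run["id"]: run for run in runs}

    # Index children by parent so the parent can list its replays.
    children = defaultdict(list)
    for run in runs:
        parent = run.get("parent_run_id")
        if parent:
            children[parent].append(run["id"])

    lines = []
    run = by_id[run_id]
    if run.get("parent_run_id"):
        reason = run.get("replay_reason")
        suffix = f" ({reason})" if reason else ""
        lines.append(f"Replay of: {run['parent_run_id']}{suffix}")
    if children[run_id]:
        lines.append("Replayed as: " + ", ".join(children[run_id]))
    return lines
```

Because the child only stores a pointer to its parent, the parent's "Replayed as:" list has to be computed by scanning for runs that reference it, which is what the children index does.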


Validation kinds (Beta)

Task packets describe post-run validation commands under execution.commands.validation. Each entry is tagged with a kind so Sidecar can route the result intelligently, apply a sensible default timeout, and surface the outcome in the UI and CLI.

Kind        Default timeout   Intended for
typecheck   3 min             tsc --noEmit, mypy, pyright
lint        3 min             eslint, ruff, golangci-lint
test        10 min            unit/integration suites
build       10 min            bundlers, compilers, image builds
custom      5 min             anything else (the legacy default)

Authoring

On the CLI, prefix a command with kind:. Entries without a prefix default to custom.

$ sidecar task create \
    --title "Add import flow" \
    --summary "..." --goal "..." \
    --validate-cmds "typecheck:tsc --noEmit,lint:eslint .,test:npm test"
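
As a rough illustration of the kind: prefix rule, the Python sketch below splits a --validate-cmds value into typed entries. It is not Sidecar's parser: parse_validate_cmds and KNOWN_KINDS are hypothetical names, and the naive comma split assumes no command contains a comma, which the real CLI may handle differently.

```python
from typing import Dict, List

# Kinds taken from the table above.
KNOWN_KINDS = {"typecheck", "lint", "test", "build", "custom"}


def parse_validate_cmds(value: str) -> List[Dict[str, str]]:
    """Split 'kind:command,kind:command' into typed validation entries."""
    entries = []
    for raw in value.split(","):  # assumption: commands contain no commas
        raw = raw.strip()
        if not raw:
            continue
        prefix, sep, rest = raw.partition(":")
        if sep and prefix in KNOWN_KINDS and rest.strip():
            entries.append({"kind": prefix, "command": rest.strip()})
        else:
            # No recognised prefix: keep the whole string and default to custom.
            entries.append({"kind": "custom", "command": raw})
    return entries
```

Checking the prefix against the known kinds, rather than splitting on the first colon unconditionally, keeps a command such as echo a:b intact as a custom entry.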

In a task packet JSON file, use the object form:

"execution": {
  "commands": {
    "validation": [
      { "kind": "typecheck", "command": "tsc --noEmit" },
      { "kind": "test", "command": "npm test", "timeout_ms": 900000 },
      "bash scripts/smoke.sh"
    ]
  }
}

String entries are still accepted for back-compat and promoted to { kind: "custom", command } on load.
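
The promotion and default-timeout rules can be sketched as a small normalisation step. This is a hedged illustration in Python, not the loader Sidecar ships: normalize_entry is a hypothetical helper, and it assumes that an explicit timeout_ms overrides the per-kind default, as the JSON example suggests.

```python
from typing import Dict, Union

MINUTE_MS = 60_000

# Per-kind defaults from the table above, in milliseconds.
DEFAULT_TIMEOUT_MS = {
    "typecheck": 3 * MINUTE_MS,
    "lint": 3 * MINUTE_MS,
    "test": 10 * MINUTE_MS,
    "build": 10 * MINUTE_MS,
    "custom": 5 * MINUTE_MS,
}


def normalize_entry(entry: Union[str, Dict]) -> Dict:
    """Promote string entries to objects and resolve the effective timeout."""
    if isinstance(entry, str):
        # Back-compat form: a bare command string becomes a custom entry.
        entry = {"kind": "custom", "command": entry}
    kind = entry.get("kind", "custom")
    if kind not in DEFAULT_TIMEOUT_MS:
        raise ValueError(f"unknown validation kind: {kind!r}")
    return {
        "kind": kind,
        "command": entry["command"],
        # Assumption: an explicit timeout_ms wins over the kind's default.
        "timeout_ms": entry.get("timeout_ms", DEFAULT_TIMEOUT_MS[kind]),
    }
```

Applied to the three entries in the JSON example, this yields timeouts of 180000 ms for the typecheck step, 900000 ms for the test step, and 300000 ms for the promoted smoke script.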

Auto-approve on all-green (Beta)

When every typed validation step passes for a run, Sidecar can auto-approve the run so you don't have to click through the review queue for a strictly-green outcome. It's opt-in:

.sidecar/preferences.json
{ "review": { "autoApproveOnAllGreen": true } }

Behavior when enabled:

  • The run's review_state flips to approved, reviewed_by is set to sidecar:auto, and the review note records how many steps passed.
  • Runs with zero configured validation steps are not auto-approved — a runner-only success still requires a human click.
  • Any failing step blocks the run, as it does today: the task moves to blocked and the blocker message includes the failing step's kind.
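
The three rules above amount to a small decision function, sketched here in Python. The values approved, sidecar:auto, and blocked come from the behavior described above. The review_decision name, the shape of the step results, the "pending" state, and the wording of the note and blocker message are assumptions for illustration.

```python
from typing import Dict, List


def review_decision(step_results: List[Dict], auto_approve_on_all_green: bool) -> Dict:
    """Decide the outcome of a run from its validation step results."""
    failed = [step for step in step_results if not step["passed"]]
    if failed:
        # A failing step blocks the run regardless of the preference.
        kinds = ", ".join(step["kind"] for step in failed)
        return {"task_status": "blocked", "blocker": f"validation failed: {kinds}"}
    if auto_approve_on_all_green and step_results:
        return {
            "review_state": "approved",
            "reviewed_by": "sidecar:auto",
            "review_note": f"{len(step_results)} validation steps passed",
        }
    # Preference off, or zero configured steps: leave the run for a human.
    return {"review_state": "pending"}  # "pending" is a placeholder state name
```

The explicit check that step_results is non-empty is what keeps a runner-only success out of the auto-approve path, since an empty list would otherwise count as all-green.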
