Supervisor
Why. Autonomous Kanban pipeline driving 5 agents (Researcher → Architect → TestDesigner → Developer → Auditor) through GitHub Project board status transitions. Creates worktrees, manages quality gates, creates PRs. Full development lifecycle automation.
How it works. Triggered via /supervisor <issue-number>. Reads config from .pi/settings.json. Fetches the GitHub issue and filters by trusted codeowners. Research dedup gate: if ## Research Findings already exists in the issue body or comments, the researcher is skipped. Creates a git worktree at ../<branch-prefix><issue-number>/. Dispatches agents per board status. Agents emit structured JSON with action, findings, commentBody, and targetStatus (used for feedback loops). Posts results as GitHub comments and moves board cards. Runs pre-transition quality gates between Implementation and Audit. The auditor approves or rejects with structured findings across 8 audit dimensions. The score is computed deterministically and must meet auditScoreThreshold (default 0.75). Rejected issues cycle back to Implementation. Creates a PR on approval.
Quick start
- Ensure .pi/settings.json has a supervisor block (see Configuration)
- Open an issue on your repo
- Assign it to the Backlog column on your project board
- Type /supervisor <issue-number> in pi
The pipeline runs agent-by-agent, posting GitHub comments at each stage. You watch it live in the chat. When done, a PR is waiting.
Pipeline overview
The pipeline moves issues through 7 board statuses (5 agent stages plus Backlog and Done), each mapped to a project board column:
Backlog → Research → Architecture → TestDesign → Implementation → Audit → Done

   ┌────────┐    ┌──────────┐    ┌─────────┐    ┌────────────┐    ┌────────┐
   │Research│    │Architect │    │TestDes. │    │Implement.  │    │Auditor │
   └───┬────┘    └───┬──────┘    └────┬────┘    └─────┬──────┘    └───┬────┘
       │             │                │               │               │
       ▼             ▼                ▼               │               ▼
GitHub Comment    Comment          Comment            │        GitHub Comment
  ## Research ## Architecture   ## Test Plan          │   ## Audit (approve/reject)
       │             │                │               │               │
       └─────────────┴────────────────┴───────────────┘               │
                                                                ┌─────▼──────────┐
                                                                │ QUALITY GATES  │
                                                                │ CI → TSC → LSP │
                                                                │ Dead → Dup →   │
                                                                │ Traceability   │
                                                                └───────┬────────┘
                                                                        │
                                                                ┌───────▼────────┐
                                                                │ PR creation    │
                                                                │ (audit report  │
                                                                │  as body)      │
                                                                └────────────────┘
Each stage dispatches a dedicated agent. The agent outputs structured JSON, which the pipeline parses to decide the next status. On approval, a PR is created with the audit report as body.
Agents
| Stage | Agent | What it does | Output |
|---|---|---|---|
| Research | researcher | Gathers context: codebase docs, web search, existing patterns | GitHub comment with ## Research Findings |
| Architecture | architect | Designs architecture, identifies components, plans changes | GitHub comment with ## Architecture |
| TestDesign | test-designer | Writes test plan | GitHub comment with ## Test Plan |
| Implementation | developer | Writes code, commits and pushes to worktree branch | Git commit + push |
| Audit | auditor | Reviews code against 8 dimensions, approves or rejects | GitHub comment + structured findings |
Research dedup gate
If the issue already has a ## Research Findings heading in any comment or body, the researcher is skipped entirely. The pipeline infers Architecture as next status directly. This prevents redundant research on re-queued issues.
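The check can be sketched as a small pure function. This is a minimal illustration, assuming the issue body and comment bodies are already available as strings; `IssueText`, `hasResearchFindings`, and `initialStatus` are hypothetical names, not the extension's API.

```typescript
// Hypothetical shape: the issue body plus its comment bodies as plain strings.
interface IssueText {
  body: string;
  comments: string[];
}

// True when any text contains a "## Research Findings" heading on its own line.
function hasResearchFindings(issue: IssueText): boolean {
  const heading = /^##\s+Research Findings\s*$/m;
  return [issue.body, ...issue.comments].some((text) => heading.test(text));
}

// Skip the researcher when findings already exist; otherwise start at Research.
function initialStatus(issue: IssueText): "Research" | "Architecture" {
  return hasResearchFindings(issue) ? "Architecture" : "Research";
}
```

Matching on a heading line rather than a bare substring avoids a false positive when a comment merely mentions the phrase in running text.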
Agent output protocol
Every agent outputs a final JSON message:
// Non-auditor agents
{ "action": "COMPLETE", "agentName": "developer",
"summary": "...", "commentBody": "## Implementation\n\n..." }
// Auditor
{ "action": "APPROVED" | "REJECTED", "agentName": "auditor",
"commentBody": "## Audit Approved\n\n...",
"auditScore": { "passing": 10, "total": 10 },
"findings": [
{ "severity": "critical", "dimension": "code-quality", "message": "..." }
],
"prTitle": "feat(#N): title", "prBody": "## PR Description\n\n..." }
The pipeline parses the JSON from code fences or brace pairs. It falls back to text markers (ARCHITECTURE_COMPLETE, FEEDBACK_RESEARCH, etc.) and section headings (## Audit Approved) for backward compatibility.
targetStatus: Any agent can emit { "targetStatus": "Research" } to override the normal forward flow. This enables feedback loops (see Feedback loops).
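The parsing order described above (code fences, then brace pairs, then text markers) might look roughly like the sketch below. It is an illustration under assumptions: `AgentOutput`, `tryParse`, and `parseAgentOutput` are hypothetical names, only two of the legacy markers are handled, and the mapping from a marker to an `action` value is assumed rather than taken from the source.

```typescript
// Hypothetical subset of the final agent message described above.
interface AgentOutput {
  action: string;
  agentName?: string;
  commentBody?: string;
  targetStatus?: string;
}

// Built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

// Returns the parsed object only if it is JSON carrying a string "action".
function tryParse(candidate: string): AgentOutput | null {
  try {
    const value = JSON.parse(candidate);
    return value !== null && typeof value.action === "string" ? (value as AgentOutput) : null;
  } catch (_err) {
    return null;
  }
}

function parseAgentOutput(raw: string): AgentOutput | null {
  // 1. Fenced blocks, last one first: the JSON is the agent's final message.
  const fenceRe = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "g");
  const blocks: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = fenceRe.exec(raw)) !== null) blocks.push(match[1].trim());
  for (let i = blocks.length - 1; i >= 0; i--) {
    const parsed = tryParse(blocks[i]);
    if (parsed) return parsed;
  }
  // 2. Outermost brace pair in the raw text.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start !== -1 && end > start) {
    const parsed = tryParse(raw.slice(start, end + 1));
    if (parsed) return parsed;
  }
  // 3. Legacy text markers (assumed mapping; only two markers shown).
  if (raw.includes("FEEDBACK_RESEARCH")) return { action: "COMPLETE", targetStatus: "Research" };
  if (raw.includes("ARCHITECTURE_COMPLETE")) return { action: "COMPLETE" };
  return null;
}
```

Returning `null` instead of throwing lets the caller decide how to treat unparseable output, for example by stopping the pipeline.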
Agent definitions
Each agent's capabilities are defined in .pi/extensions/supervisor/agents/<agent>.md with YAML frontmatter:
---
tools: [read, bash, structural_search, ripgrep_search]
extensions: [agent-harness, caveman, piignore, ripgrep-search, scrapling, structural-analyzer, web-search]
skills: [extension-spec]
model: opencode-go/deepseek-v4-flash
thinking: high
entryMarker: Architecture
outputFormat: structured-json
---
Quality gates (pre-transition hooks)
Before moving from Implementation → Audit, 6 gates run in sequence. Any gate failure sends the issue back to Implementation with a combined failure note. The developer sees everything wrong at once.
flowchart LR
A[Implementation → Audit] --> B[CI polling]
B --> C[TSC --noEmit]
C --> D[LSP diagnostics]
D --> E[Dead code check]
E --> F[Duplicate code check]
F --> G[Requirements traceability]
G -- all pass --> H[Proceed to Audit]
G -- any fail --> I[Back to Implementation]
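The "run everything, report once" behavior of the flow above can be sketched as follows. This is a simplified illustration: `GateResult`, `Gate`, and `runGates` are hypothetical names, and the gates are synchronous here although real gates would likely be asynchronous.

```typescript
interface GateResult {
  gate: string;     // e.g. "tsc" or "lsp"
  passed: boolean;
  details: string;  // human-readable failure summary
}

// Gates are synchronous here for brevity; this is an assumption of the sketch.
type Gate = () => GateResult;

function runGates(gates: Gate[]): { nextStatus: "Audit" | "Implementation"; note: string } {
  // Run every gate, even after a failure, so the note lists all problems at once.
  const failures = gates.map((run) => run()).filter((result) => !result.passed);
  if (failures.length === 0) return { nextStatus: "Audit", note: "" };
  const note = failures.map((f) => `- ${f.gate}: ${f.details}`).join("\n");
  return { nextStatus: "Implementation", note };
}
```

Collecting failures before deciding, rather than returning on the first one, is what lets the developer agent receive a single combined note.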
1. CI gating
Polls GitHub check runs for the worktree branch via gh api repos/<repo>/commits/<sha>/check-runs. Polls every 15s up to ciGatingTimeoutSec (default 300s).
| Result | Action |
|---|---|
| All checks pass | Proceed |
| Any check fails | Block → Implementation |
| Checks still pending at timeout | Warning, proceed to audit |
| No checks configured | Skip silently |
| gh api error | Warning, proceed to audit |
If the branch SHA isn't on remote yet (push still in flight), the gate attempts push recovery: it pushes the branch from the worktree, then retries SHA resolution.
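The result table can be read as a decision function evaluated on each poll. The sketch below is an assumption-laden illustration: `CheckRun` is a reduced view of the check-run objects returned by the GitHub API, `decideCi` and `CiDecision` are hypothetical names, and treating `neutral` and `skipped` conclusions as passing is a choice made for this example. The gh api error case would be handled by the caller and is not shown.

```typescript
// Reduced view of a GitHub check run.
interface CheckRun {
  status: string;             // "queued" | "in_progress" | "completed"
  conclusion: string | null;  // null until the run completes
}

type CiDecision = "proceed" | "block" | "warn-and-proceed" | "skip" | "keep-polling";

// Conclusions treated as passing; this set is an assumption of the sketch.
const PASSING = ["success", "neutral", "skipped"];

function decideCi(runs: CheckRun[], elapsedSec: number, timeoutSec: number): CiDecision {
  if (runs.length === 0) return "skip"; // no checks configured
  const completed = runs.filter((r) => r.status === "completed");
  if (completed.some((r) => !PASSING.includes(r.conclusion ?? ""))) return "block";
  if (completed.length === runs.length) return "proceed";
  // Some checks are still pending: keep polling until the timeout, then warn.
  return elapsedSec >= timeoutSec ? "warn-and-proceed" : "keep-polling";
}
```

With the documented defaults, a caller would invoke this every 15 seconds with `timeoutSec` set to `ciGatingTimeoutSec`.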
2. TSC checkpoint
Runs npx tsc --noEmit on the worktree. Caches diagnostics across calls. Tracks error trend (regression/improvement/stable).
3. LSP pre-audit
Runs real LSP diagnostics on modified files only (files changed since defaultBranch). Groups by extension: .ts → typescript-language-server, .py → pylsp, .rs → rust-analyzer, .go → gopls. Retries up to 3 times.
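Grouping by extension could be done along these lines. The mapping comes from the text above; `LSP_SERVERS` and `groupByServer` are hypothetical names, and silently dropping files without a known server is an assumption of this sketch.

```typescript
// Extension-to-server mapping taken from the description above.
const LSP_SERVERS: Record<string, string> = {
  ".ts": "typescript-language-server",
  ".py": "pylsp",
  ".rs": "rust-analyzer",
  ".go": "gopls",
};

// Groups changed file paths by the language server that should check them.
// Files with no known server are left out of the result.
function groupByServer(changedFiles: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const file of changedFiles) {
    const dot = file.lastIndexOf(".");
    const server = dot === -1 ? undefined : LSP_SERVERS[file.slice(dot)];
    if (!server) continue;
    const bucket = groups.get(server) ?? [];
    bucket.push(file);
    groups.set(server, bucket);
  }
  return groups;
}
```

Each group can then be sent to one server session, so a server starts once per language rather than once per file.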
4. Dead code gate
Runs knip on the full worktree, filters to changed files. Detects unused exports, orphaned imports, dead branches, zombie dependencies. If knip isn't installed, degrades gracefully with a no_knip status.
5. Duplicate code gate
Runs jscpd on the full worktree, filters to changed files. Classifies clones as exact (Type 1), renamed (Type 2), or near-miss (Type 3). If jscpd isn't installed, degrades gracefully with a no_jscpd status.
6. Requirements traceability gate
Cross-references the issue checklist against the implementation diff. Runs 5 deterministic checks: checklist keyword coverage, test-file parity, imperative verb direction, file count sanity, comment referencing.
Gate failure notes
- Gate failures do NOT count as Auditor rejections. The maxRejections counter tracks only explicit auditor REJECT decisions. A gate-bounced issue restarts Implementation with full context, not a consumed rejection slot.
- All gates run regardless of prior failures. The combined note includes every failing gate, so the developer fixes everything at once.
- When gates block, the user sees a chat message like "🔴 Pre-Transition Gates Blocked - Returning to Developer" with the failure details.
Feedback loops
The pipeline supports two feedback loops:
Architect → Research
If research findings are insufficient, the architect can send the pipeline back:
- Text: output FEEDBACK_RESEARCH
- JSON: output { "action": "COMPLETE", "targetStatus": "Research" }
The researcher is re-dispatched with the architect's feedback as context. This prevents bad research from contaminating TestDesign, Implementation, and Audit.
Auditor → Implementation
When the auditor REJECTs, the issue goes back to Implementation for fixes. The developer receives the auditor's structured findings as gateFailureContext in their next prompt. This continues until the auditor approves or maxRejections is exceeded.
Loop 1: Developer → gates → Auditor REJECTS
   ↓
Loop 2: Developer (fixes) → gates → Auditor APPROVES → PR created
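The rejection loop amounts to a counter checked after each audit. The following is a minimal sketch, assuming the pipeline stops once the rejection count goes above maxRejections; `afterAudit` and `AuditOutcome` are hypothetical names, and the exact boundary (above versus equal to the limit) is not stated in the source.

```typescript
type AuditDecision = "APPROVED" | "REJECTED";

interface AuditOutcome {
  next: "Done" | "Implementation" | "stop";
  rejections: number; // updated count of explicit auditor rejections
}

// Only explicit auditor rejections increment the counter; gate bounces do not.
function afterAudit(decision: AuditDecision, rejections: number, maxRejections: number): AuditOutcome {
  if (decision === "APPROVED") return { next: "Done", rejections };
  const count = rejections + 1;
  // Assumption: "exceeded" means strictly greater than maxRejections.
  return { next: count > maxRejections ? "stop" : "Implementation", rejections: count };
}
```

With the default `maxRejections` of 3, this sketch allows three rejected rounds and stops on the fourth.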
Worktree lifecycle
Each pipeline run gets its own isolated git worktree:
┌──────────────────┐
│ Create Worktree  │  git worktree add -b <branch> <path> <base>
│                  │  Fallbacks: branch exists → add without -b
│                  │             dir exists → use existing
└────────┬─────────┘
         ▼
┌──────────────────┐
│ Setup            │  .pi/git/ copied → git submodule update --init
│                  │  → npm ci (2 attempts, non-blocking)
└────────┬─────────┘
         ▼
┌──────────────────┐
│ Agents run here  │  All 5 agents execute in worktree CWD
│ (up to 20 loops) │
└────────┬─────────┘
         ▼
┌──────────────────┐
│ Post-pipeline    │  Merge conflict check → worktree cleanup
│                  │  (preserved if debug + PR failure)
└──────────────────┘
The worktree is the sandbox. Agents can write, edit, and commit freely without affecting the main repo. On crash or signal, the worktree is cleaned up automatically.
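The creation fallbacks can be expressed as a small planning function that decides which git arguments to run. This is a sketch only: `worktreePath`, `worktreeAddArgs`, and `WorktreeState` are hypothetical names, and checking the directory before the branch is an assumed ordering.

```typescript
interface WorktreeState {
  branchExists: boolean; // the branch is already known to git
  dirExists: boolean;    // the worktree directory is already on disk
}

// Path convention from the overview: ../<branch-prefix><issue-number>/
function worktreePath(branchPrefix: string, issueNumber: number): string {
  return `../${branchPrefix}${issueNumber}/`;
}

// Returns the git argv to run, or null when the existing directory is reused.
function worktreeAddArgs(branch: string, path: string, base: string, state: WorktreeState): string[] | null {
  if (state.dirExists) return null; // dir exists → use existing
  if (state.branchExists) return ["worktree", "add", path, branch]; // add without -b
  return ["worktree", "add", "-b", branch, path, base];
}
```

Keeping the decision separate from the process spawn makes the fallbacks testable without touching a real repository.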
Configuration
Settings live under the supervisor key in .pi/settings.json:
{
"supervisor": {
"repo": "owner/repo",
"projectNumber": 1,
"statusField": "Status",
"statusMapping": {
"Research": "researcher",
"Architecture": "architect",
"TestDesign": "test-designer",
"Implementation": "developer",
"Audit": "auditor"
},
"defaultBranch": "main",
"branchPrefix": "worktree-git-issue-",
"codeowners": ["your-github-username"],
"maxRejections": 3,
"auditScoreThreshold": 0.75,
"ciGatingTimeoutSec": 300,
"agentTokenBudget": 300000,
"maxToolCalls": 0,
"agentTimeoutsMin": {},
"bellOnComplete": false,
"enableExperimentalFeatures": false
}
}
Required fields
| Field | What it does |
|---|---|
| repo | GitHub repo (owner/name) for issue fetching and PR creation |
| projectNumber | GitHub Project board number |
| statusMapping | Maps board column names to agent .md files |
| codeowners | List of trusted GitHub usernames. Only issues by these authors are processed |
Optional fields
| Field | Default | What it does |
|---|---|---|
| defaultBranch | main | Base branch for worktree and PR target |
| branchPrefix | worktree-git-issue- | Prefix for worktree branch names |
| maxRejections | 3 | Max auditor rejections before pipeline stops |
| auditScoreThreshold | 0.75 | Minimum passing/total ratio for audit score gate |
| ciGatingTimeoutSec | 300 | Seconds to poll CI before giving up. 0 = skip CI gate entirely |
| agentTokenBudget | 0 | Soft token cap per agent dispatch. 0 = unlimited |
| maxToolCalls | 0 | Hard cap on tool invocations per agent. 0 = unlimited |
| agentTimeoutsMin | {} | Per-agent timeouts in minutes: { "developer": 30 } |
| bellOnComplete | false | Ring terminal bell when pipeline finishes |
| enableExperimentalFeatures | false | When false, only core pipeline stages run |
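Loading these settings boils down to checking the required fields and merging defaults. The sketch below covers only a subset of the fields; `resolveConfig`, `DEFAULTS`, and `REQUIRED` are hypothetical names, and the accepted range for auditScoreThreshold (0 to 1) is an assumption, since the source only says an invalid value throws.

```typescript
interface SupervisorConfig {
  repo: string;
  projectNumber: number;
  statusMapping: Record<string, string>;
  codeowners: string[];
  defaultBranch: string;
  branchPrefix: string;
  maxRejections: number;
  auditScoreThreshold: number;
  ciGatingTimeoutSec: number;
}

// Defaults from the optional-fields table (subset).
const DEFAULTS = {
  defaultBranch: "main",
  branchPrefix: "worktree-git-issue-",
  maxRejections: 3,
  auditScoreThreshold: 0.75,
  ciGatingTimeoutSec: 300,
};

const REQUIRED = ["repo", "projectNumber", "statusMapping", "codeowners"] as const;

function resolveConfig(raw: Partial<SupervisorConfig>): SupervisorConfig {
  for (const key of REQUIRED) {
    if (raw[key] === undefined) throw new Error(`supervisor.${key} is required`);
  }
  const merged = { ...DEFAULTS, ...raw } as SupervisorConfig;
  // Assumed validity rule: the threshold is a ratio, so it must lie in [0, 1].
  if (!(merged.auditScoreThreshold >= 0 && merged.auditScoreThreshold <= 1)) {
    throw new Error("supervisor.auditScoreThreshold must be between 0 and 1");
  }
  return merged;
}
```

Writing the range check as a negated conjunction also rejects `NaN`, which a plain `< 0 || > 1` test would let through.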
Audit score
The auditor evaluates code across 8 dimensions:
- architecture-compliance
- ticket-fulfillment
- test-quality
- correctness-safety
- code-quality
- completeness
- duplicate-code
- research-incorporation
Each finding has a severity and dimension. A dimension fails if it has any finding with severity critical or warning (suggestions don't count). The score is passing / total and must meet auditScoreThreshold (default 0.75).
If the researcher was skipped by the dedup gate, research-incorporation is excluded (7 dimensions).
When the score gate fails, a ## Audit Score Gate Rejected comment is posted on the issue and the pipeline returns to Implementation.
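The scoring rule can be sketched directly from the description above. The dimension names are taken from the list; `auditScore` and `passesScoreGate` are hypothetical names, and `"suggestion"` as the literal severity value for non-counting findings is an assumption.

```typescript
type Severity = "critical" | "warning" | "suggestion";

interface Finding {
  severity: Severity;
  dimension: string;
  message: string;
}

const DIMENSIONS = [
  "architecture-compliance", "ticket-fulfillment", "test-quality", "correctness-safety",
  "code-quality", "completeness", "duplicate-code", "research-incorporation",
];

function auditScore(findings: Finding[], researcherSkipped: boolean) {
  // research-incorporation is excluded when the dedup gate skipped the researcher.
  const dims = DIMENSIONS.filter((d) => !(researcherSkipped && d === "research-incorporation"));
  // A dimension fails on any critical or warning finding; suggestions do not count.
  const failing = new Set(findings.filter((f) => f.severity !== "suggestion").map((f) => f.dimension));
  const passing = dims.filter((d) => !failing.has(d)).length;
  return { passing, total: dims.length, ratio: passing / dims.length };
}

function passesScoreGate(score: { ratio: number }, threshold = 0.75): boolean {
  return score.ratio >= threshold;
}
```

As a worked case: with two failing dimensions out of 8, the ratio is 6/8 = 0.75, which meets the default threshold. If the researcher was skipped, the same two failures give 5/7 ≈ 0.714, which does not.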
Detailed pipeline flow
Phase 0: Entry and gating
handleSupervisorCommand(args, ctx, pi)
  1. Trust check → ctx.isProjectTrusted()
  2. Parse args → /supervisor [--debug] <issue-number>
  3. Load config → .pi/settings.json → SupervisorConfig
  4. Fetch issue → gh issue view <N> --json ...
  5. Filter by owner → only codeowners' issues pass
  6. Read board → find project, column, status field
  7. Check deps → blocked by open PRs/issues?
  8. Create worktree → git worktree add ...
  9. Setup → .pi/git copy → git submodule init → npm ci
  10. Register crash → SIGTERM/SIGINT cleanup handlers
Phase 1: Main loop
The pipeline loops through stages (max 20 iterations):
for each iteration:
  1. Resolve current status from board
  2. Backlog? → auto-move to Research, continue
  3. Done? → break, pipeline complete
  4. Load agent → agents/<agent>.md
  5. Build task → inject per-agent context
  6. Execute agent → spawn pi --mode json subprocess
  7. Post-agent → comment (non-developer) or commit+push (developer)
  8. Parse output → structured JSON or text markers
  9. Calculate next status → forward, backward, or stop
  10. Pre-transition hooks → quality gates (if Implementation→Audit)
  11. Status transition → move board card
Phase 2: Post-pipeline
After the loop ends (Done, error, or max rejections):
1. Merge conflict check → if PR has conflicts, attempt auto-merge
2. Worktree cleanup → git worktree remove + branch -D
3. Checkpoint delete
Phase 3: Summary
A summary message is sent with agent stats (duration, tokens, tools, model), PR link, stop reason, and gate failure history.
Edge cases
Configuration
| Problem | What happens |
|---|---|
| No .pi/settings.json | Pipeline throws |
| Missing supervisor.repo | Pipeline throws |
| Missing supervisor.codeowners | Pipeline throws |
| Invalid auditScoreThreshold | Pipeline throws |
GitHub API
| Problem | What happens |
|---|---|
| Issue not found | "Issue #N not found" → stops |
| Token missing project scope | "GitHub token missing 'project' scope" → stops |
| Issue not on board | "Issue #N not on project board #P" → stops |
| gh api errors mid-flight | Warnings in error collector, pipeline continues |
Agent execution
| Problem | What happens |
|---|---|
| Agent .md not found | Pipeline stops |
| Agent subprocess times out | SIGTERM → killed, pipeline stops |
| Budget exceeded | Kill subprocess. Researcher: graceful degradation (partial findings posted). Others: pipeline stops |
| Agent produces no output | "Output is empty" → stops |
| Agent refuses | "Agent refused: ..." → stops |
| Agent fails with no explicit marker | Pipeline stops (Bug #643 fix: prevents crash-loop) |
Developer-specific
| Problem | What happens |
|---|---|
| commitAndPush fails | Pipeline stops |
| No changes to commit | Pipeline continues (no-op commit skipped) |
| Developer→Audit: worktree has no commits | Issue closed as "already resolved" |
Quality gates
| Gate | Problem | What happens |
|---|---|---|
| CI | Checks fail | Block → Implementation |
| CI | Checks pending > timeout | Warning → proceed to audit |
| CI | Unconfigured | Skip |
| Dead code | Found | Block → Implementation + uncommit |
| TSC | Error | Block → Implementation |
| LSP | Error | Block → Implementation |
| Dup code | Results | Stored for auditor context (non-blocking) |
| Package safety | Blocked packages | Warning → auditor may flag |
| Traceability | Gaps | Warning → auditor reviews |
Crash recovery
| Problem | What happens |
|---|---|
| Pipeline process SIGTERM | Cleanup worktree, delete branch, exit(0) |
| Pipeline process SIGINT | Same as SIGTERM |
| Double signal (SIGTERM + SIGINT) | exit(1) immediately |
| Stale checkpoint found on restart | git worktree prune + delete + rm -rf |
| Checkpoint older than 1h | Treated as stale |
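The signal and checkpoint rows above can be illustrated with two small helpers. This is a sketch under assumptions: `createSignalHandler` and `isStale` are hypothetical names, `cleanup` and `exit` are injected so the logic can be exercised without a real process, and exiting with code 1 when cleanup itself fails is an assumption.

```typescript
// First signal: run cleanup, then exit(0). A second signal while cleanup
// is still running exits with code 1 immediately.
function createSignalHandler(cleanup: () => Promise<void>, exit: (code: number) => void): () => void {
  let shuttingDown = false;
  return () => {
    if (shuttingDown) {
      exit(1); // double signal: stop waiting for cleanup
      return;
    }
    shuttingDown = true;
    cleanup().then(
      () => exit(0),
      () => exit(1), // assumption: a failed cleanup is reported as an error exit
    );
  };
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// A checkpoint older than one hour is treated as stale.
function isStale(createdAtMs: number, nowMs: number, maxAgeMs: number = ONE_HOUR_MS): boolean {
  return nowMs - createdAtMs > maxAgeMs;
}
```

In a Node process the same handler instance would be registered for both SIGTERM and SIGINT, with `process.exit` passed as `exit`, so either signal arriving second triggers the immediate exit.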
Workflow config
The pipeline is driven by a data structure in config/workflow.ts:
[
{ status: "Backlog", builtIn: "backlog" },
{ status: "Research", agentName: "researcher",
markerMap: { RESEARCH_COMPLETE: "Architecture" } },
{ status: "Architecture", agentName: "architect",
markerMap: { ARCHITECTURE_COMPLETE: "TestDesign",
FEEDBACK_RESEARCH: "Research" },
canLoopBackTo: ["Research"] },
{ status: "TestDesign", agentName: "test-designer",
markerMap: { TEST_PLAN_COMPLETE: "Implementation" } },
{ status: "Implementation", agentName: "developer",
markerMap: { IMPLEMENTATION_COMPLETE: "Audit" },
hooks: ["ci", "tsc", "lsp", "dup", "trace"] },
{ status: "Audit", agentName: "auditor",
markerMap: { "AUDIT_DECISION: APPROVED": "Done",
"AUDIT_DECISION: REJECTED": "Implementation",
AUDIT_APPROVED: "Done",
AUDIT_REJECTED: "Implementation" },
canLoopBackTo: ["Implementation"],
maxRejections: 5 },
{ status: "Done", builtIn: "done" },
]
Each step specifies:
- status: Board column that triggers this step
- agentName: Which agent .md file to load
- markerMap: Text markers in agent output that determine next status
- canLoopBackTo: Valid backward transitions (feedback loops)
- hooks: Pre-transition quality gates
- maxRejections: Max times this step can loop back before human intervention
- builtIn: For non-agent steps (backlog auto-move, terminal state)
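Resolving the next status from such a step list could work roughly as follows. The sketch reuses the field names listed above; `nextStatus` is a hypothetical function, and honoring a targetStatus override only when it appears in canLoopBackTo is an assumption about how the two fields interact.

```typescript
interface WorkflowStep {
  status: string;
  agentName?: string;
  markerMap?: Record<string, string>;
  canLoopBackTo?: string[];
}

// Returns the next board status, or null when the output carries no usable signal.
function nextStatus(steps: WorkflowStep[], current: string, output: string, targetStatus?: string): string | null {
  const step = steps.find((s) => s.status === current);
  if (!step) return null;
  // Assumption: an explicit override is honored only for declared backward transitions.
  if (targetStatus && (step.canLoopBackTo ?? []).includes(targetStatus)) return targetStatus;
  // Otherwise the first marker found in the agent output decides.
  for (const [marker, next] of Object.entries(step.markerMap ?? {})) {
    if (output.includes(marker)) return next;
  }
  return null;
}
```

For example, an architect output containing FEEDBACK_RESEARCH resolves to Research, and one containing ARCHITECTURE_COMPLETE resolves to TestDesign.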
File structure
.pi/extensions/supervisor/
├── index.ts                    # Entry: command registration, pipeline lifecycle
├── pipeline/                   # Pipeline orchestration
│   ├── handler.ts              # Main loop: agent dispatch, status transitions
│   ├── stages.ts               # Stage-level operations (comment posting, commits)
│   ├── audit.ts                # Pre-transition quality gates
│   ├── execute-agent.ts        # Agent subprocess management
│   ├── worktree.ts             # Worktree creation and setup
│   ├── merge.ts                # Post-pipeline merge conflict resolution
│   ├── pr-creation.ts          # PR creation flow
│   ├── output.ts               # Pipeline output helpers
│   ├── notifications.ts        # Summary and warnings messages
│   ├── state-checkpoint.ts     # Crash recovery checkpoints
│   ├── crash-cleanup.ts        # Signal handlers for graceful shutdown
│   ├── helpers.ts              # Shared utilities
│   └── error-collector.ts      # Warning/error aggregation
├── agents/                     # Agent definitions (MD files with YAML frontmatter)
├── config/
│   ├── config.ts               # Settings loading and validation
│   ├── types.ts                # Type definitions
│   └── workflow.ts             # Stage transitions, marker resolution, audit scoring
├── github/                     # GitHub API wrappers
├── agent/                      # Agent subprocess runner and output parser
├── checks/                     # Quality gate implementations
│   ├── ci-gating.ts            # GitHub check runs polling
│   ├── duplicate-code.ts       # jscpd integration
│   ├── dead-code.ts            # knip integration
│   ├── package-safety.ts       # npm package age verification
│   ├── requirements-traceability.ts
│   └── audit-gate-decision.ts
├── session/                    # Session management, result types
├── subagent/                   # Sub-agent dispatch utilities
├── lib/                        # Shared utilities
└── test/                       # Integration tests
Testing
Integration tests cover:
- Full pipeline with mock GitHub API
- Agent dispatch with structured JSON output parsing
- Quality gates (each gate tested independently)
- Worktree creation and cleanup lifecycle
- Status transition edge cases (backward transitions, max rejections, dedup gates)
- Submodule handling with matched-branch pattern