
autonomous-workflow


Execute development workflows through Explore-Plan-Code-Verify phases with task-driven tracking, Tier 1/2/3 action classification, decision journaling, and bounded debug loops. Use when executing any development workflow autonomously or orchestrating multi-step implementation tasks. This skill MUST be consulted because skipping phases causes rework, and unbounded verification loops cause agents to loop forever on unsolvable problems.


# Autonomous Workflow

Foundation skill governing how Claude executes development workflows autonomously.

## Iron Law

**NO SKIPPING PHASES. Explore before Plan, Plan before Code, Code before Verify. Every phase produces an artifact.**

Jumping to code without exploration is the #1 cause of rework. Jumping to "done" without verification is the #1 cause of bugs reaching review.

## Explore > Plan > Code > Verify

Every multi-step workflow follows this loop:

1. **EXPLORE**: Gather context. Use Agent(Explore) subagents for unfamiliar code. Parallel Bash for independent queries (git status, issue details, task list). Read referenced files. When LSP is available: use `goToDefinition` to trace code paths from issue keywords to implementation, and `findReferences` to assess the impact of planned changes — this enhances text-based grep searches with semantic understanding.
2. **PLAN**: Decompose work. TaskCreate for each deliverable. Set dependencies with addBlockedBy. Display the plan for visibility.
3. **CODE**: Execute tasks. TaskUpdate(in_progress) before starting. Implement. Commit incrementally (Tier 1). TaskUpdate(completed) after verification. When LSP is available: use `hover` to understand types and signatures of existing code before modifying it.
4. **VERIFY**: Prove it works. Four mandatory verification layers:
   a. **Static**: Run quality commands (lint, test, typecheck) in parallel. When LSP diagnostics are available (`lsp.diagnosticsAsQuality`), collect them as an additional quality signal — errors are P1, warnings are P2. LSP diagnostics complement, never replace, CLI-based checks.
   b. **Runtime**: Build the project, start it, verify at runtime. If anything fails, enter the debug-fix-retest loop (bounded by `closedLoop.maxDebugIterations`).
   c. **Review**: Self-review with fix-forward — fix P1/P2 findings immediately, don't just report them.
   d. **Verdict**: Independent judgment — dispatch verdict-judge agent (when `verdict.enabled`) with acceptance criteria + evidence bundle. The judge has no access to code-writing rationale, diff, or decision journal. It evaluates outcomes, not process. Each criterion receives PASS/FAIL/NEEDS-HUMAN-REVIEW. FAIL verdicts trigger fix loops; NEEDS-HUMAN-REVIEW escalates to user.
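
The phase ordering and the "every phase produces an artifact" rule can be sketched as a small guard. This is only an illustration, not part of the skill: `Phase`, `Workflow`, and `complete` are hypothetical names, and an artifact is modeled as a plain string.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Phase(IntEnum):
    EXPLORE = 1
    PLAN = 2
    CODE = 3
    VERIFY = 4


@dataclass
class Workflow:
    """Hypothetical tracker: maps each completed phase to its artifact."""
    artifacts: dict = field(default_factory=dict)

    def complete(self, phase: Phase, artifact: str) -> None:
        # Every phase must leave an artifact behind.
        if not artifact.strip():
            raise ValueError(f"{phase.name} must produce an artifact")
        # A phase may complete only after all earlier phases have completed.
        skipped = [p.name for p in Phase if p < phase and p not in self.artifacts]
        if skipped:
            raise RuntimeError(f"cannot complete {phase.name}; skipped: {skipped}")
        self.artifacts[phase] = artifact
```

Calling `complete(Phase.CODE, ...)` before `Phase.PLAN` has an artifact raises, which mirrors the Iron Law in this toy model.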

## Task-Driven Progress

Use Task tools as first-class workflow primitives:

| Tool | When |
|------|------|
| TaskCreate | Start of PLAN phase — one task per deliverable |
| TaskUpdate(in_progress) | Before starting work on a task |
| TaskUpdate(completed) | After task passes verification |
| TaskList | At checkpoints to confirm progress |
| TaskGet | Before working on a task to get full context |

Tasks have clear subjects (imperative form) and descriptions with acceptance criteria.

## Per-Task Verification Gate

A task may NOT be marked completed (`TaskUpdate(taskId, status: "completed")`) until ALL of the following conditions are met:

1. **All tests pass**: both existing tests and any new tests written for this task. If any test fails, the task enters the debug-fix-retest loop and remains `in_progress` until the tests pass or the failure is escalated to the user.

2. **Verification evidence captured** — the verification command from the task description has been run and its output recorded as evidence for this task's acceptance criterion. Evidence must be collected at task-completion time, not deferred to VERIFY phase.

3. **No out-of-context files** — all files modified during this task have been classified. Any out-of-context files must be resolved (moved to a separate commit, removed, or explicitly approved by the user) before the task completes.

4. **TDD cycle completed** — when `settings.json` → `testing.tddMode` is `enforce` (the default), the full RED-GREEN-REFACTOR cycle must be observed:
   - RED: A failing test was written before implementation
   - GREEN: The simplest code was written to make the test pass
   - REFACTOR: Code was cleaned up with tests still passing

If any condition is not met, `TaskUpdate(completed)` is blocked. The workflow must not advance to the next task. This gate is the primary quality enforcement point — the VERIFY phase provides independent confirmation, not first-pass verification.
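
As a rough sketch, the four gate conditions can be expressed as one function that lists what still blocks completion. `TaskState` and `completion_blockers` are hypothetical names used only for illustration. The source names only `enforce` as a `tddMode` value, so the sketch assumes that any other value skips the TDD check.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical snapshot of a task at completion time."""
    tests_passed: bool
    evidence: str  # recorded output of the task's verification command
    unresolved_out_of_context_files: list
    tdd_mode: str = "enforce"  # mirrors testing.tddMode; "enforce" is the default
    tdd_steps_observed: tuple = ()


def completion_blockers(task: TaskState) -> list:
    """Return unmet gate conditions; an empty list means the task may complete."""
    blockers = []
    if not task.tests_passed:
        blockers.append("tests failing")
    if not task.evidence.strip():
        blockers.append("verification evidence missing")
    if task.unresolved_out_of_context_files:
        blockers.append("out-of-context files unresolved")
    # Assumption: only the "enforce" mode requires the full cycle, in order.
    if task.tdd_mode == "enforce" and tuple(task.tdd_steps_observed) != (
        "RED", "GREEN", "REFACTOR"
    ):
        blockers.append("TDD cycle incomplete")
    return blockers
```

A caller would mark the task completed only when the returned list is empty, and otherwise keep it `in_progress`.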

## Three-Tier Action Classification

| Tier | Actions | Behavior |
|------|---------|----------|
| **Tier 1** (Autonomous) | Commits, branch creation, file edits, staging | Execute without asking. Local and reversible. |
| **Tier 2** (Journal) | Push, PR creation, issue assignment | Execute and log to decision journal. Team-visible but recoverable. |
| **Tier 3** (Confirm) | Merge, release, force operations | Always require human confirmation. Non-negotiable. |

Tier configuration is in `settings.json` under `tiers`. Actions can be promoted (journal→confirm) but never demoted (confirm→journal).
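
The promote-only rule can be illustrated with a short resolver. The action keys, `DEFAULT_TIERS`, and `resolve_tier` are hypothetical names, and the actual shape of the `tiers` block in `settings.json` is not shown in this document. The sketch silently ignores a demotion request; rejecting it with an error would be an equally reasonable design.

```python
from enum import IntEnum


class Tier(IntEnum):
    AUTONOMOUS = 1  # execute without asking
    JOURNAL = 2     # execute and log to the decision journal
    CONFIRM = 3     # always require human confirmation


# Defaults follow the table above; the action keys are illustrative names.
DEFAULT_TIERS = {
    "commit": Tier.AUTONOMOUS,
    "create_branch": Tier.AUTONOMOUS,
    "edit_file": Tier.AUTONOMOUS,
    "stage": Tier.AUTONOMOUS,
    "push": Tier.JOURNAL,
    "create_pr": Tier.JOURNAL,
    "assign_issue": Tier.JOURNAL,
    "merge": Tier.CONFIRM,
    "release": Tier.CONFIRM,
    "force_operation": Tier.CONFIRM,
}


def resolve_tier(action: str, overrides: dict) -> Tier:
    """Return the effective tier; an override applies only if it promotes."""
    default = DEFAULT_TIERS[action]
    requested = overrides.get(action, default)
    # max() keeps the stricter tier, so a demotion request has no effect.
    return max(default, requested)
```

For example, an override that moves `push` to `Tier.CONFIRM` takes effect, while an override that tries to move `merge` down to `Tier.JOURNAL` leaves it at `Tier.CONFIRM`.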

## AskUserQuestion Tool Enforcement

When a command or skill says "use the AskUserQuestion tool", you MUST invoke the AskUserQuestion tool — do not substitute plain text output. The tool provides structured selectable options that plain text cannot replicate. Supply contextual options appropriate to the situation.

## Decision Journal Protocol

- **Init**: Create `{journal-dir}/issue-{N}.md` at branch creation
- **Log**: PostToolUse hooks auto-log file changes and commits
- **Structured entries**: Skills add timestamped entries with category, decision, rationale, risk
- **Summarize**: Condense journal for PR body (public entries only, internal redacted)

Journal dir defaults to `.decisions/`, configurable in settings.
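
A minimal sketch of the journal path and the PR summary step, assuming each entry carries a sensitivity label. `JournalEntry`, `journal_path`, and `summarize_for_pr` are hypothetical names, and the summary line format is invented for illustration. The sketch interprets "internal redacted" as dropping internal entries entirely.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JournalEntry:
    """Hypothetical structured entry with the fields listed above."""
    timestamp: datetime
    category: str
    decision: str
    rationale: str
    risk: str
    sensitivity: str = "public"  # "public" or "internal"


def journal_path(issue_number: int, journal_dir: str = ".decisions") -> str:
    """Build {journal-dir}/issue-{N}.md, tolerating a trailing slash."""
    return f"{journal_dir.rstrip('/')}/issue-{issue_number}.md"


def summarize_for_pr(entries: list) -> str:
    """Keep public entries only; internal entries never reach the PR body."""
    lines = [
        f"- [{e.category}] {e.decision} (risk: {e.risk})"
        for e in entries
        if e.sensitivity == "public"
    ]
    return "\n".join(lines)
```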

**Anti-estimation guard for journal entries:** journal entries MUST NOT include calendar-time estimates (weeks, days, hours, sprints, ETAs, "by Friday"). Use t-shirt sizing (S/M/L) only when the user has explicitly asked for size context. Describe work in terms of artifacts and tool calls, not wall-clock duration. See `skills/llm-operator-principles/SKILL.md`. This guard exists because journal entries are the most common surface where calendar-time framings leak through and anchor downstream deferral.
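
One way to check an entry against this guard is a simple pattern scan before the entry is written. The pattern list and `find_time_estimates` are illustrative assumptions; a real guard would likely need a broader vocabulary and would still miss paraphrased estimates.

```python
import re

# Illustrative patterns only: durations, ETAs, and "by <weekday>" deadlines.
_CALENDAR_TIME = re.compile(
    r"\b(?:(?:\d+\s*)?(?:hours?|days?|weeks?|sprints?)"
    r"|ETAs?"
    r"|by\s+(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday))\b",
    re.IGNORECASE,
)


def find_time_estimates(text: str) -> list:
    """Return every calendar-time phrase found in a draft journal entry."""
    return [match.group(0) for match in _CALENDAR_TIME.finditer(text)]
```

An entry such as "Touches 3 files and adds 2 tests" passes, since it describes artifacts rather than wall-clock duration.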

## Parallel Execution

Dispatch independent operations in a single message:

- Multiple Bash calls for independent git queries
- Multiple Agent calls for independent review facets
- Never parallelize operations that depend on each other's output
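
The dependency rule can be sketched as a batching step: operations in the same batch have no dependency on each other and could be dispatched together, while dependent operations land in a later batch. `parallel_batches` and the operation names are hypothetical. The sketch uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def parallel_batches(depends_on: dict) -> list:
    """Group operations into batches; members of one batch are independent.

    depends_on maps each operation to the operations whose output it needs.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Given three independent queries and a planning step that needs all of them, the queries form the first batch and the planning step forms the second.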

## Bounded Verification

Quality check loops have max iterations from `settings.json`. These ceilings are safety nets against true infinite loops, NOT planned stop points — see `skills/llm-operator-principles/SKILL.md`:

1. Run quality commands
2. If failures, fix and re-run
3. Approaching `qualityCheckMaxIterations` without convergence is a signal to re-check understanding (are two findings in tension? are you fixing the wrong thing?), not a budget to stop at. Continue iterating until convergence.
4. Only halt for **genuine non-convergence**: the same failure persists across the last 3 iterations with no progress AND the ceiling is actually reached. In that case, file a six-field Proactive-Autonomy escalation citing "genuinely ambiguous architecture decision" — NOT finding-triage.
5. Never loop indefinitely past the ceiling without surfacing the non-convergence diagnostic.
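
The steps above can be sketched as a loop that halts only when the ceiling is reached and the same failures have persisted for the last three iterations. This is one reading of the rules, not the skill's implementation: `run_bounded`, `check`, and `fix` are hypothetical names, and failures are modeled as a set of strings. Failures that oscillate forever would never count as stalled here, so a real implementation would want a stricter measure of progress.

```python
from typing import Callable, FrozenSet, List, Tuple

STALL_WINDOW = 3  # "same failure persists across the last 3 iterations"


def run_bounded(
    check: Callable[[], FrozenSet[str]],
    fix: Callable[[FrozenSet[str]], None],
    max_iterations: int,
) -> Tuple[str, int, List[str]]:
    """Return (status, iterations_used, diagnostics)."""
    history: List[FrozenSet[str]] = []
    diagnostics: List[str] = []
    iteration = 0
    while True:
        iteration += 1
        failures = check()
        if not failures:
            return "converged", iteration, diagnostics
        history.append(failures)
        recent = history[-STALL_WINDOW:]
        stalled = len(recent) == STALL_WINDOW and len(set(recent)) == 1
        if iteration >= max_iterations:
            if stalled:
                diagnostics.append(
                    f"non-convergence at iteration {iteration}: {sorted(failures)}"
                )
                return "escalate", iteration, diagnostics
            # Ceiling reached but failures are still changing: continue,
            # and surface that fact instead of looping silently.
            diagnostics.append(f"iteration {iteration}: past ceiling, still changing")
        fix(failures)
```

The `"escalate"` status is where a caller would file the escalation described in step 4, with the diagnostics attached.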

## Stop Conditions

| Trigger | Action |
|---------|--------|
| Genuine non-convergence (same findings persist 3+ iterations AND ceiling reached) | File a six-field Proactive-Autonomy escalation per `skills/llm-operator-principles/SKILL.md` § Genuine non-convergence. Do NOT silently exit the loop. |
| Plan has >10 tasks for a single issue | Decompose the issue first. One PR should not span more than 10 tasks. |
| EXPLORE phase yields contradictory signals | Stop. Ask the user for clarification before planning. |
| >5 files modified without staging or committing | Stop. What you have should be committable. If not, the tasks are too large. |

## Sensitivity Classification

- **public**: Safe for PR bodies, comments, logs
- **internal**: Security rationale, credential handling, vulnerability details
- Never inc
