autonomous-execution

Patterns for autonomous project execution with minimal human intervention. Use this skill when executing well-defined tasks autonomously, including validation strategies, stop conditions, and quality gates. Don't use when doing a single task interactively with the user, or when the project lacks a tasks document.


# Autonomous Execution

This skill provides patterns for executing projects autonomously while maintaining quality and knowing when to stop and ask for help.

## Prerequisites for Autonomous Execution

Before starting autonomous work, verify:

### Project Readiness
- [ ] **Tech Plan exists** - Architecture and sequencing documented
- [ ] **Tasks Document exists** - Granular tasks with acceptance criteria
- [ ] **Success criteria defined** - Clear "done" definition
- [ ] **Tests exist** - Automated validation available
- [ ] **Quality gates configured** - Pre-commit hooks, CI checks

### Documentation Readiness
- [ ] **Project docs available** - CLAUDE.md, relevant docs
- [ ] **Troubleshooting guide exists** - Common issues documented
- [ ] **Stop conditions clear** - When to pause and ask

## Execution Patterns

### Pattern 1: Task-by-Task Execution

**When to use**: Standard execution for most projects

**Process**:
1. **Read task** - Understand requirements and acceptance criteria
2. **Plan approach** - Identify implementation strategy
3. **Implement** - Write code following project standards
4. **Self-validate** - Run tests, check acceptance criteria
5. **Update status** - Mark task complete, update progress
6. **Proceed to next** - Move to next task

**Validation Checkpoints**:
- After each task: Run relevant test suite
- After each phase: Run full test suite
- Before completion: Complete quality gate checklist
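
The task loop and its validation checkpoints can be sketched roughly as follows. This is a minimal illustration, not the skill's implementation: `Task`, `execute_tasks`, and the two validation callbacks are hypothetical names, and halting at the first failed validation is an assumed policy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    phase: str
    implement: Callable[[], None]   # hypothetical: performs the actual work
    status: str = "Pending"


def execute_tasks(
    tasks: List[Task],
    run_relevant_tests: Callable[[Task], bool],
    run_full_suite: Callable[[], bool],
) -> List[str]:
    """Run tasks in order and return the names of completed tasks.

    Stops at the first failed validation (an assumed policy).
    """
    completed: List[str] = []
    for i, task in enumerate(tasks):
        task.implement()
        # After each task: run the relevant test suite.
        if not run_relevant_tests(task):
            task.status = "Blocked"
            break
        task.status = "Complete"
        completed.append(task.name)
        # After each phase: run the full test suite.
        is_phase_end = i + 1 == len(tasks) or tasks[i + 1].phase != task.phase
        if is_phase_end and not run_full_suite():
            break
    return completed
```

The callbacks are injected so the same loop works with whatever test commands the project already uses.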

### Pattern 2: Phase-by-Phase Execution

**When to use**: Projects with clear phases (migrations, refactors)

**Process**:
1. Complete all tasks in current phase
2. Run phase-specific validation
3. Verify quality gates pass
4. Document learnings
5. Proceed to next phase only after validation

### Pattern 3: Incremental Validation

**When to use**: Large projects or risky changes

**Process**:
1. Make small, focused change
2. Validate immediately
3. Commit only if validation passes
4. Repeat with next small change
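
A minimal sketch of "commit only if validation passes", assuming the project validates with a single command (for example `pytest -q`, which is an assumption here) and that staging everything with `git add -A` is acceptable. The function names and the injectable `runner` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(list(cmd)).returncode == 0


def validate_then_commit(
    message: str,
    validate_cmd: Sequence[str],
    runner: Callable[[Sequence[str]], bool] = run,
) -> bool:
    """Commit the working tree only when the validation command succeeds."""
    if not runner(validate_cmd):
        return False  # leave the change uncommitted so it can be fixed or discarded
    return runner(["git", "add", "-A"]) and runner(["git", "commit", "-m", message])
```

Usage would look like `validate_then_commit("Add retry helper", ["pytest", "-q"])`, repeated once per small change.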

### Checkpoint Integration (Mandatory)

Write session checkpoints — context compaction is inevitable in long runs, not an edge case:

- **Write checkpoints** every 3 completed tasks — this is not optional
- **Write to** `docs/checkpoints/latest.md`
- **Include**: current task, decisions made, next steps, hot files
- **Also write before stopping** for any reason (blockers, session end, user interrupt)
- **Why**: Context compaction destroys working memory. Checkpoints are the only mechanism that preserves decisions and rationale across compaction events. Treat them as mandatory infrastructure, not a nice-to-have.

See the `session-checkpoint` skill for the full checkpoint format.
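
As a rough sketch of the cadence and the file write described above: the Markdown layout below is a placeholder assumption (the real format lives in the `session-checkpoint` skill), and `should_checkpoint` and `write_checkpoint` are hypothetical helper names.

```python
from pathlib import Path
from typing import List

CHECKPOINT_EVERY = 3  # from the rule above: every 3 completed tasks


def should_checkpoint(completed_count: int, stopping: bool = False) -> bool:
    """True every third completed task, and always when stopping."""
    return stopping or (completed_count > 0 and completed_count % CHECKPOINT_EVERY == 0)


def write_checkpoint(
    root: str,
    current_task: str,
    decisions: List[str],
    next_steps: List[str],
    hot_files: List[str],
) -> Path:
    """Overwrite docs/checkpoints/latest.md with the current session state."""
    path = Path(root) / "docs" / "checkpoints" / "latest.md"
    path.parent.mkdir(parents=True, exist_ok=True)

    def section(title: str, items: List[str]) -> List[str]:
        return [f"## {title}", *[f"- {item}" for item in items], ""]

    # Placeholder layout; substitute the session-checkpoint format.
    lines = ["# Checkpoint", "", f"Current task: {current_task}", ""]
    lines += section("Decisions made", decisions)
    lines += section("Next steps", next_steps)
    lines += section("Hot files", hot_files)
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```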

### Worktree Isolation

When running multiple independent tasks, consider whether to use git worktrees for isolation:

| Scenario | Recommendation | Reason |
|----------|---------------|--------|
| Sequential tasks in one subsystem | Stay on branch | Low isolation benefit, worktree overhead not worth it |
| Independent tasks across 2+ subsystems | Parallel worktrees | Tasks can't interfere with each other, enables parallel agents |
| Background maintenance tasks (lint fixes, doc updates) | Always use worktree | Keeps primary branch clean for feature work |
| Risky or experimental changes | Use worktree | Easy to discard without affecting main work |

**When NOT to use worktrees:**
- Tasks that depend on each other's output
- Tasks that modify shared state (same config files, same database schema)
- When the project is small enough that all tasks touch the same files
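
The table and the exclusions above can be read as a small decision function. This is a sketch under stated assumptions: `WorkItem` and its fields are hypothetical, and letting the "when NOT to use" rules override the risky and background rows is an interpretation, not something the guidance spells out. The `git worktree add -b <branch> <path>` form is standard git.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkItem:
    subsystems: int           # number of distinct subsystems touched
    independent: bool         # tasks do not consume each other's output
    shares_state: bool        # same config files or database schema
    background: bool = False  # lint fixes, doc updates
    risky: bool = False       # experimental changes


def use_worktree(item: WorkItem) -> bool:
    # Exclusions first (assumed to take precedence over the table rows).
    if not item.independent or item.shares_state:
        return False
    if item.background or item.risky:
        return True
    # Independent tasks across 2+ subsystems: parallel worktrees.
    return item.subsystems >= 2


def worktree_add_command(path: str, branch: str) -> List[str]:
    """Build the git command that creates a worktree on a new branch."""
    return ["git", "worktree", "add", "-b", branch, path]
```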

## Self-Validation Strategies

### Task-Level Validation

After completing each task:

1. **Run relevant tests**:
   - Unit tests for logic changes
   - Integration tests for API/service changes
   - E2E tests for user-facing changes

2. **Check acceptance criteria**:
   - Review each criterion
   - Verify all can be marked complete
   - Document any deviations

3. **Run quality checks**:
   - Type checking
   - Linting
   - Build verification

4. **Update task status**:
   - Mark acceptance criteria complete
   - Add completion notes
   - Update status to "Complete"

### Instrumented-Task Verification Gate (do NOT close on a static check)

If a task's acceptance criterion includes **"event fires", "metric captured", "instrumentation added", or "tracked in `<analytics tool>`"**, the task **cannot be marked complete on a static/import check** ("the code calls `capture()`", "imports are clean"). Code-is-wired ≠ behavior-verified.

Before marking such a task ✅, require ONE of:
1. **Runtime evidence** — fire the event yourself (synthetic click in a browser / incognito + DevTools network filter) and confirm it lands in the truth surface (PostHog/analytics/the metric query), OR
2. **An explicit unverified flag** surfaced to the user: *"Code is wired but I could not verify the event fires at runtime — needs a runtime check before this is truly done."*

Never silently equate the two. This failure mode has recurred across three projects (Memory Phase 1 gates closed with `null` metrics; acquisition T10/T11/T12 closed with zero/deprecated events — `recipe_cta_clicked` logged **zero events for 12 days** after being marked done on "imports are clean"). The symptom of a miss is "No data recorded" for a *shipped* event — treat that as instrumentation-suspect, not "no traffic," and fire a synthetic event to disambiguate.
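
One way to make this gate mechanical is a check that refuses to close an instrumented task without runtime evidence. The sketch below is illustrative only: the phrase matching is a naive substring test, the function names are hypothetical, and the flag wording is adapted from the text above.

```python
from typing import Optional, Sequence

# Trigger phrases taken from the gate description above.
INSTRUMENTATION_PHRASES = (
    "event fires",
    "metric captured",
    "instrumentation added",
    "tracked in",
)

UNVERIFIED_FLAG = (
    "Code is wired but I could not verify the event fires at runtime; "
    "needs a runtime check before this is truly done."
)


def is_instrumented(criteria: Sequence[str]) -> bool:
    """True if any acceptance criterion mentions instrumentation."""
    return any(
        phrase in criterion.lower()
        for criterion in criteria
        for phrase in INSTRUMENTATION_PHRASES
    )


def instrumentation_gate(criteria: Sequence[str], runtime_evidence: bool) -> Optional[str]:
    """Return None if the gate passes, else the flag to surface to the user."""
    if is_instrumented(criteria) and not runtime_evidence:
        return UNVERIFIED_FLAG
    return None
```

A passing static check is deliberately not an input: it cannot change the outcome for an instrumented task.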

### Acknowledge Method Substitution (don't silently downgrade verification)

If a task names a **specific verification method** (e.g., "browser automation / E2E test of the full flow") and you cannot or do not perform it, you must **say so explicitly**. Never substitute a weaker method (a code trace, a line-number citation, a manual read) and report it as if the named method had been performed. State: *"The task asked for an E2E run; I did a code trace instead because `<reason>`. This is weaker; the browser test still needs to run."* For E2E specs specifically, "wired into the test runner's `testMatch`/suite + seen running in CI" is part of done: a spec file that exists but never runs is zero coverage.

### Phase-Level Validation

After completing each phase:

1. Run phase-specific validation (if script exists)
2. Run the full test suite multiple times to catch flaky tests
3. Verify all quality gates pass
4. Document any learnings or issues
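
Repeating the full suite to surface flakiness can be sketched as below. The default of three runs is an assumption (the guidance only says "multiple times"), and `classify_suite` and its labels are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run_once(cmd: Sequence[str]) -> bool:
    """Run the suite once and report whether it exited with status 0."""
    return subprocess.run(list(cmd)).returncode == 0


def classify_suite(
    cmd: Sequence[str],
    runs: int = 3,
    runner: Callable[[Sequence[str]], bool] = run_once,
) -> str:
    """Run the same command several times and classify the outcome."""
    results = [runner(cmd) for _ in range(runs)]
    if all(results):
        return "stable-pass"
    if not any(results):
        return "stable-fail"
    return "flaky"  # mixed results: document it rather than retrying until green
```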

## Stop Conditions

**CRITICAL**: Stop and ask for help when encountering:

### Immediate Stop (Blockers)
| Condition | Action |
|-----------|--------|
| Tests fail 3+ times | Stop, document issue |
| Architectural decision needed | Stop, ask for guidance |
| External dependency blocked | Stop, report blocker |
| Quality gate failure (can't fix) | Stop, seek help |
| Ambiguous requirements | Stop, ask for clarification |
| Same tool fails 2+ times | Switch strategy, don't brute-force |
| Unsure about platform capability | Say "I'm not sure, let me verify" — don't state as fact |
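
The two counting rules in this table (tests fail 3+ times, same tool fails 2+ times) can be tracked with a small helper. This is a sketch: `FailureTracker` is a hypothetical name, and counting failures cumulatively rather than consecutively is an assumption.

```python
from collections import Counter

TEST_FAILURE_LIMIT = 3  # "Tests fail 3+ times": stop and document the issue
TOOL_FAILURE_LIMIT = 2  # "Same tool fails 2+ times": switch strategy


class FailureTracker:
    def __init__(self) -> None:
        self.test_failures = 0
        self.tool_failures: Counter = Counter()

    def record_test_failure(self) -> str:
        self.test_failures += 1
        return "stop" if self.test_failures >= TEST_FAILURE_LIMIT else "retry"

    def record_tool_failure(self, tool: str) -> str:
        self.tool_failures[tool] += 1
        if self.tool_failures[tool] >= TOOL_FAILURE_LIMIT:
            return "switch-strategy"
        return "retry"
```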

### Pause and Evaluate
| Condition | Action |
|-----------|--------|
| Unexpected complexity | Assess scope, consider asking |
| Breaking changes detected | Evaluate impact, may need input |
| Performance regression | Investigate, may need guidance |
| Security concerns | Always ask before proceeding |

### "Sequenced later" is NOT "blocked"

Distinguish a **soft sequencing preference you set** ("do the blog last", "warm outreach after polish") from a **hard external dependency** (missing API key, unmerged upstream PR, a gate that genuinely can't be evaluated yet). A self-imposed "later" is something you can revisit and propose to pull forward; it is not a blocker. When you find yourself idle because of a soft "later", **propose the next viable action** rather than reporting a block and waiting. (Found: agent reported "blog is blocked" and idled until the user said *"We're sitting here idle. Just answer."* — it had mistaken its own sequencing note for an external dependency.)
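
The distinction can be made explicit by recording, for each deferred item, whether the reason is external. A minimal sketch with hypothetical names (`Deferral`, `triage`):

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Deferral:
    item: str
    reason: str
    external: bool  # True only for hard external dependencies


def triage(deferrals: List[Deferral]) -> Tuple[List[str], List[str]]:
    """Split deferred items into real blockers and actions to propose."""
    blockers = [f"{d.item}: {d.reason}" for d in deferrals if d.external]
    proposals = [f"Propose pulling forward: {d.item}" for d in deferrals if not d.external]
    return blockers, proposals
```

Only the first list should be reported as blocked. Anything in the second list is a candidate next action.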

### How to Stop Gracefully

When stopping:

1. **Document current state**:
   - What was attempted
   - What fa


Source: https://github.com/daviswhitehead/product-playbook-for-agentic-coding-plugin/tree/main/plugins/product-playbook-for-agentic-coding/skills/autonomous-execution

