Claude
Skills
Sign in
Back

execute-tdd-tasks

Included with Lifetime
$97 forever

Execute TDD task pairs autonomously with RED-GREEN-REFACTOR verification. Orchestrates wave-based execution with strategic parallelism, routing TDD tasks to tdd-executor agents and non-TDD tasks to standard task-executor. Use when user says "execute tdd tasks", "run tdd tasks", "start tdd execution", or wants to execute TDD-paired tasks from create-tdd-tasks.

Productivity

What this skill does


# Execute TDD Tasks Skill

This skill orchestrates autonomous execution of TDD task pairs generated by `/create-tdd-tasks`. It is the TDD counterpart to the standard `execute-tasks` skill, reusing its session management, wave infrastructure, and execution context sharing while adding TDD-specific agent routing, RED-GREEN-REFACTOR verification, and per-task compliance reporting.

The key difference from standard `execute-tasks`: this skill routes TDD tasks to the `tdd-executor` agent (from `tdd-tools`) which runs a 6-phase TDD workflow, while routing non-TDD tasks to the standard `task-executor` agent. It verifies TDD compliance (RED verified, GREEN verified, refactored) per task pair and reports aggregate results.

**CRITICAL: Complete ALL 9 steps.** The workflow is not complete until Step 9: Update CLAUDE.md is evaluated. After completing each step, immediately proceed to the next step without waiting for user prompts (except Step 4 which requires user confirmation).

## Plugin Context

This skill is part of the `tdd-tools` plugin and uses agents from the same plugin:
- **tdd-executor** agent (Opus) -- 6-phase TDD workflow per task
- **test-writer** agent (Sonnet) -- parallel test generation (used by tdd-executor internally)

For non-TDD tasks, this skill routes to the `task-executor` agent from `sdd-tools` (soft cross-plugin dependency). Since TDD tasks are always generated from SDD tasks via `/create-tasks`, the `sdd-tools` plugin is expected to be installed when this skill runs.

## Core Principles

### 1. TDD Compliance First

Every TDD task pair must complete the RED-GREEN-REFACTOR cycle:
- **RED**: Tests are written and verified to fail before any implementation exists
- **GREEN**: Implementation is written that makes all tests pass with zero regressions
- **REFACTOR**: Code is cleaned up while keeping all tests green

### 2. Strategic Parallelism

Maximize execution throughput without violating TDD sequencing:
- **PARALLEL**: Multiple test-writing tasks (RED phase) run simultaneously across features
- **SEQUENTIAL**: Within a single TDD pair, RED must complete before GREEN can start (enforced by dependencies)

### 3. Reuse execute-tasks Infrastructure

Session management, wave execution, context sharing, and progress tracking all reuse the same patterns from `execute-tasks`. See `references/tdd-execution-workflow.md` for TDD-specific extensions.

### 4. Honest TDD Reporting

Report per-task compliance with the full RED-GREEN-REFACTOR cycle:
- `red_verified`: Whether tests failed as expected before implementation
- `green_verified`: Whether all tests pass after implementation
- `refactored`: Whether code was cleaned up while maintaining green tests
- `coverage_delta`: Change in test coverage percentage (if measurable)

## Orchestration Workflow

This skill orchestrates TDD task execution through a 9-step loop that mirrors the standard `execute-tasks` orchestration with TDD-specific extensions. See `references/tdd-execution-workflow.md` for the full TDD wave execution details and `references/tdd-verification-patterns.md` for TDD phase verification rules.

### Step 1: Load References

Read the TDD-specific reference files:

```
Read: ${CLAUDE_PLUGIN_ROOT}/skills/execute-tdd-tasks/references/tdd-execution-workflow.md
Read: ${CLAUDE_PLUGIN_ROOT}/skills/execute-tdd-tasks/references/tdd-verification-patterns.md
```

Parse arguments from the invocation:
- `--task-group <group>` -- Filter tasks to a specific group
- `--max-parallel <n>` -- Override max concurrent agents per wave
- `--retries <n>` -- Override retry attempts per task (default: 3)

### Step 2: Load and Classify Tasks

Use `TaskList` to retrieve all tasks. If `--task-group` was provided, filter to tasks where `metadata.task_group` matches.

**Classify each task by type:**

| Detection | Type | Agent | Source |
|-----------|------|-------|--------|
| `metadata.tdd_mode == true` AND `metadata.tdd_phase == "red"` | TDD test task | `tdd-executor` | tdd-tools (same plugin) |
| `metadata.tdd_mode == true` AND `metadata.tdd_phase == "green"` | TDD implementation task | `tdd-executor` | tdd-tools (same plugin) |
| No `tdd_mode` metadata or `tdd_mode == false` | Non-TDD task | `task-executor` | sdd-tools (cross-plugin, soft dependency) |

**Count and report:**
- Total tasks (pending + in_progress + completed)
- TDD pairs identified (test + implementation tasks)
- Non-TDD tasks
- Already completed tasks

**Handle edge cases:**
- **No tasks found**: Report "No tasks found for group '{group}'. Use `/create-tdd-tasks` to generate TDD task pairs from your SDD tasks." and stop.
- **All completed**: Report a summary of completed tasks including TDD compliance and stop.
- **No unblocked tasks**: Report which tasks exist and what's blocking them.

### Step 3: Build Execution Plan

Resolve `max_parallel` using precedence:
1. `--max-parallel` CLI argument (highest priority)
2. `max_parallel` in `.claude/agent-alchemy.local.md`
3. Default: 5

Resolve `retries` using precedence:
1. `--retries` CLI argument (highest priority)
2. Default: 3

Read `.claude/agent-alchemy.local.md` if it exists, for TDD-specific settings:
- `tdd.strictness` -- `strict`, `normal` (default), or `relaxed`
- `tdd.coverage-threshold` -- Minimum coverage target (default: 80)

Build the dependency graph from all pending tasks (TDD and non-TDD):

1. Collect all pending tasks and their `blockedBy` relationships
2. Run topological sort to assign dependency levels
3. Assign tasks to waves by dependency level (Wave 1 = no dependencies, Wave 2 = depends only on Wave 1, etc.)
4. Sort within waves by priority: critical > high > medium > low > unprioritized
5. Break ties by "unblocks most others"
6. Cap each wave at `max_parallel` tasks

**Annotate waves with TDD phase labels:**

The dependency structure from `create-tdd-tasks` naturally produces alternating test/implementation waves:

```
Wave 1: [Test-A, Test-B, Test-C]         -- RED phase (parallel test generation)
Wave 2: [Impl-A, Impl-B, Impl-C]         -- GREEN phase (parallel implementation)
Wave 3: [Test-D, Test-E, Non-TDD-F]      -- RED phase + non-TDD tasks (mixed)
Wave 4: [Impl-D, Impl-E]                  -- GREEN phase
```

**Detect circular dependencies:** If tasks remain unassigned after topological sorting, they form a cycle. Report the cycle and attempt to break at the weakest link.

**Validate TDD pair cross-references:** For each TDD task, verify its `paired_task_id` references a valid task. Log warnings for orphaned pairs.

### Step 4: Present Execution Plan and Confirm

Display the TDD execution plan:

```
EXECUTION PLAN (TDD Mode)

Tasks to execute: {count} ({tdd_pairs} TDD pairs, {non_tdd} non-TDD tasks)
Retry limit: {retries} per task
Max parallel: {max_parallel} per wave
TDD Strictness: {strict|normal|relaxed}

WAVE 1 ({n} tasks -- RED phase):
  1. [{id}] Write tests for {subject} (RED, paired: #{impl_id})
  2. [{id}] Write tests for {subject} (RED, paired: #{impl_id})

WAVE 2 ({n} tasks -- GREEN phase):
  3. [{id}] {subject} (GREEN, paired: #{test_id})
  4. [{id}] {subject} (GREEN, paired: #{test_id})

WAVE 3 ({n} tasks -- mixed):
  5. [{id}] {subject} (non-TDD)
  6. [{id}] Write tests for {subject} (RED, paired: #{impl_id})

{Additional waves...}

BLOCKED (unresolvable dependencies):
  [{id}] {subject} -- blocked by: {blocker ids}

COMPLETED:
  {count} tasks already completed
```

Use `AskUserQuestion` to confirm:

```yaml
questions:
  - header: "Confirm TDD Execution"
    question: "Ready to execute {count} tasks in {wave_count} waves (max {max_parallel} parallel) with TDD enforcement ({strictness} mode)?"
    options:
      - label: "Yes, start TDD execution"
        description: "Proceed with the TDD execution plan above"
      - label: "Cancel"
        description: "Abort without executing any tasks"
    multiSelect: false
```

If the user selects **"Cancel"**, report "Execution cancelled. No tasks were modified." and stop.

### Step 5: Initialize Execution Direct

Related in Productivity