
autobrowse

Self-improving browser automation via the auto-research loop. Iteratively runs a browsing task, reads the trace, and improves the navigation skill (strategy.md) until it reliably passes. Supports parallel runs across multiple tasks using sub-agents. Use when you want to build or improve browser automation skills for specific website tasks.



# AutoBrowse — Self-Improving Browser Skill

Build reliable browser automation skills through iterative experimentation. An inner agent browses the site (`evaluate.mjs`). You — the outer agent — read what happened and improve the instructions (`strategy.md`). Repeat until it passes consistently.

## Entry Points

Invocation is flexible — both explicit flags and free-form natural language work:

```
/autobrowse --task google-flights
/autobrowse --task google-flights --iterations 10 --env remote
/autobrowse --task google-flights --browser-trace
/autobrowse --tasks google-flights,amazon-add-to-cart
/autobrowse --all

# Also fine — parse freely:
/autobrowse https://flights.google.com/
/autobrowse book a flight on delta.com
/autobrowse fix the existing google-flights skill
```

`--browser-trace` (default off, remote-only): pairs each iteration with the sibling `browser-trace` skill — wraps the inner agent in a CDP capture for per-page network/console/page-lifecycle evidence. Implies `--env remote`; errors if combined with `--env local`. Requires the sibling `browser-trace` skill present at `${CLAUDE_SKILL_DIR}/../browser-trace/`, and the `BROWSERBASE_API_KEY` env var.

When the user drops a URL or free-form instruction instead of `--task <name>`:
- If an existing task in `${WORKSPACE}/tasks/` clearly matches the site/intent, use it.
- Otherwise, pick a short kebab-case name, create `${WORKSPACE}/tasks/<name>/task.md` from `${CLAUDE_SKILL_DIR}/references/example-task.md`, fill in the URL/goal based on what the user said, and proceed. Tell the user the chosen name in one line.
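
The naming step above can be approximated mechanically. The sketch below is a naive slug helper and not part of the skill: `to_kebab` is a hypothetical name, and in practice the outer agent would likely choose something shorter and more meaningful by judgment (for example `google-flights` rather than the raw hostname slug).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of autobrowse): turn a URL or free-form
# phrase into a kebab-case candidate for ${WORKSPACE}/tasks/<name>/.
to_kebab() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's#^[a-z]+://##; s/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

echo "$(to_kebab "https://flights.google.com/")"   # flights-google-com
echo "$(to_kebab "Book a flight on delta.com")"    # book-a-flight-on-delta-com
```

The slug is only a starting point; check it against the existing directories in `${WORKSPACE}/tasks/` before creating a new one.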

---

## How to run

### Step 1 — Parse arguments and orient

Check what was passed:
- `--task <name>` → single task mode
- `--tasks a,b,c` or `--all` → multi-task mode (spawn sub-agents)
- `--iterations N` → how many evaluate → improve cycles (default: 5)
- `--env local|remote` → browser environment (default: local; use remote for bot-protected sites)
- `--browser-trace` → opt in to the browser-trace integration (default off). Implies `--env remote`. If `--env local --browser-trace` are both passed explicitly, error with: `browser-trace requires Browserbase; drop --env local or drop --browser-trace.`

If the user passed free-form text instead, map it to one of the above before continuing.
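
The flag rules above can be written down as a small parser. This is a sketch only: the outer agent does the parsing itself rather than running a script, and the function and variable names (`parse_args`, `MODE`, `TASKS`, `ENV_NAME`, and so on) are hypothetical. It shows the defaults, the "`--browser-trace` implies remote" rule, and the explicit-conflict error.

```bash
#!/usr/bin/env bash
# Sketch only: the real parsing is done by the outer agent, not a script.
# All function and variable names here are hypothetical.
parse_args() {
  MODE="" TASKS="" ITERATIONS=5 ENV_NAME="local" ENV_EXPLICIT=0 BROWSER_TRACE=0
  while [ $# -gt 0 ]; do
    case "$1" in
      --task|--tasks|--iterations|--env)
        # These flags take a value; bail out instead of looping forever.
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; } ;;
    esac
    case "$1" in
      --task)          MODE="single"; TASKS="$2";     shift 2 ;;
      --tasks)         MODE="multi";  TASKS="$2";     shift 2 ;;
      --all)           MODE="multi";  TASKS="--all";  shift ;;
      --iterations)    ITERATIONS="$2";               shift 2 ;;
      --env)           ENV_NAME="$2"; ENV_EXPLICIT=1; shift 2 ;;
      --browser-trace) BROWSER_TRACE=1;               shift ;;
      *) echo "unrecognized argument: $1" >&2; return 2 ;;
    esac
  done
  if [ "$BROWSER_TRACE" -eq 1 ]; then
    # Only an explicitly passed --env local conflicts; the default does not.
    if [ "$ENV_EXPLICIT" -eq 1 ] && [ "$ENV_NAME" = "local" ]; then
      echo "browser-trace requires Browserbase; drop --env local or drop --browser-trace." >&2
      return 1
    fi
    ENV_NAME="remote"   # --browser-trace implies --env remote
  fi
}

parse_args --task google-flights --browser-trace
echo "$MODE $TASKS $ITERATIONS $ENV_NAME"   # single google-flights 5 remote
```

Tracking whether `--env` was passed explicitly is what separates the silent upgrade to remote from the hard error.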

### Step 2 — Set up the workspace

All training artifacts (task definitions, strategy iterations, traces, reports) live in a workspace directory in the **current working directory** — NOT inside `~/.claude/skills/`. This keeps the inner agent's file writes out of Claude's home dir and away from permission friction.

Default workspace: `${CWD}/autobrowse/`

```bash
mkdir -p ./autobrowse/tasks ./autobrowse/traces ./autobrowse/reports
```

If the task directory (`./autobrowse/tasks/<task>/task.md`) doesn't exist yet, scaffold it:

```bash
mkdir -p ./autobrowse/tasks/<task>
cp ${CLAUDE_SKILL_DIR}/references/example-task.md ./autobrowse/tasks/<task>/task.md
# Then edit task.md to describe the URL, inputs, steps, and expected JSON output
```
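
Because a plain `cp` overwrites an existing `task.md`, a guarded version of the scaffold may be safer when the loop is re-entered. The following is a minimal sketch under that assumption; `scaffold_task` is a hypothetical helper, and the demo uses a throwaway directory with a stand-in template rather than the real `${CLAUDE_SKILL_DIR}/references/example-task.md`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create the workspace layout and copy the template
# only when task.md is missing, so an edited task.md is never overwritten.
scaffold_task() {
  local workspace="$1" task="$2" template="$3"
  local task_md="$workspace/tasks/$task/task.md"
  mkdir -p "$workspace/tasks/$task" "$workspace/traces" "$workspace/reports"
  if [ -f "$task_md" ]; then
    echo "kept existing $task_md"
  else
    cp "$template" "$task_md"
    echo "created $task_md"
  fi
}

# Demo against a throwaway directory and a stand-in template.
demo="$(mktemp -d)"
printf '# Task\n' > "$demo/example-task.md"
scaffold_task "$demo/autobrowse" google-flights "$demo/example-task.md"  # creates
scaffold_task "$demo/autobrowse" google-flights "$demo/example-task.md"  # keeps
rm -rf "$demo"
```

In real use the arguments would be `./autobrowse`, the task name, and the template path under `${CLAUDE_SKILL_DIR}/references/`.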

The skill source at `${CLAUDE_SKILL_DIR}` stays read-only — only `./autobrowse/` in CWD gets written to during training. Graduation (final step) writes a single file to `~/.claude/skills/<task>/SKILL.md`.

List available tasks:
```bash
ls ./autobrowse/tasks/
```

### Step 3 — Multi-task: spawn parallel sub-agents

If running multiple tasks, use the Agent tool to spawn one sub-agent per task simultaneously. Each sub-agent receives a self-contained prompt to run the full autobrowse loop for its task:

> "You are running the autobrowse skill for task `<name>`. Workspace: `<absolute-path-to-workspace>` (e.g. `/path/to/project/autobrowse`). Run `<N>` iterations of: evaluate → read trace → improve strategy.md → repeat. Use `--env <env>`. Pass `--workspace <workspace>` to every evaluate.mjs invocation. If the parent invocation used `--browser-trace`, you MUST use the traced-path block of the SKILL.md loop for every iteration (pre-create session, attach bb-capture, pass `--connect-url` to evaluate.mjs, stop+bisect, release) — do not fall back to the default single-command path. Follow the autobrowse loop instructions exactly.
>
> When graduating, install the skill to `~/.claude/skills/<task-name>/SKILL.md` with proper agentskills frontmatter (name + description). Do not just copy strategy.md — write a self-contained skill.
>
> At the end, output a structured summary with: task name, pass/fail on final run, total cumulative cost, iterations completed, per-iteration table (iter number, turns, cost, status, hypothesis tested), and 2-3 bullet key learnings."

Spawn all sub-agents in parallel, wait for all to complete, then collect their summaries and write the session report.
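
Before spawning, the task list has to be expanded into one name per sub-agent. A minimal sketch of that expansion follows; `resolve_tasks` is a hypothetical helper, and treating "a directory containing `task.md`" as the definition of a task under `--all` is an assumption based on the workspace layout in Step 2.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand a --tasks list or --all into one task name per
# line, i.e. one line per sub-agent to spawn.
resolve_tasks() {
  local workspace="$1" spec="$2" dir
  if [ "$spec" = "--all" ]; then
    for dir in "$workspace"/tasks/*/; do
      # Only directories that contain a task.md count as tasks.
      if [ -f "${dir}task.md" ]; then basename "$dir"; fi
    done
  else
    # Split the comma-separated list and drop empty entries.
    printf '%s\n' "$spec" | tr ',' '\n' | sed '/^$/d'
  fi
}

resolve_tasks ./autobrowse "google-flights,amazon-add-to-cart"
```

Each output line maps to one sub-agent prompt, with `<name>` replaced by that line.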

**For single task**, skip this step and run the loop directly below.

---

## The Loop (run this for each task)

### Iteration start

Check that `./autobrowse/tasks/<task>/task.md` exists (scaffold it from the template if not — see Step 2). `strategy.md` is auto-created empty by the harness on first run.

### Requirements

- `ANTHROPIC_API_KEY` must be in the environment (or in a `.env` file in CWD — `evaluate.mjs` auto-loads it). If missing, the harness prints a clear error and exits; don't hunt for keys in other paths.
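
If an early check is wanted before starting a long run, the requirement above can be tested in the two documented places only. This is a sketch, not part of the harness: `have_api_key` is a hypothetical name, and the `grep` assumes plain `ANTHROPIC_API_KEY=value` lines, since how `evaluate.mjs` parses `.env` is not described here.

```bash
#!/usr/bin/env bash
# Hypothetical preflight: succeed if the key is in the environment or in a
# .env file in the current directory. No other paths are searched.
have_api_key() {
  [ -n "${ANTHROPIC_API_KEY:-}" ] && return 0
  # Assumption: .env uses plain KEY=value lines with a non-empty value.
  [ -f ./.env ] && grep -q '^ANTHROPIC_API_KEY=.' ./.env
}

if have_api_key; then
  echo "ANTHROPIC_API_KEY found"
else
  echo "ANTHROPIC_API_KEY missing: set it in the environment or in ./.env" >&2
fi
```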

### Run the inner agent

**Default path (no `--browser-trace`)** — single command, no orchestration:

```bash
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse
# or for bot-protected sites:
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse --env remote
```

This runs the browser session and writes a full trace to `./autobrowse/traces/<task>/latest/`.

**Traced path (`--browser-trace`, remote only)** — the outer harness pre-creates a Browserbase session, attaches `bb-capture` as a passive observer, and passes the session's `connectUrl` to `evaluate.mjs` so every inner `browse` call uses `--cdp $connectUrl --session autobrowse-main` (the canonical browser-trace pattern that gives observers full Network/Console events). Run this block once per iteration with `$N` set to the 1-indexed iteration number:

```bash
# Preflight — fail fast if browser-trace isn't installed alongside autobrowse.
BT_DIR="${CLAUDE_SKILL_DIR}/../browser-trace"
if [ ! -f "$BT_DIR/scripts/bb-capture.mjs" ]; then
  echo "ERROR: --browser-trace requires the browser-trace skill at $BT_DIR." >&2
  echo "Install it by cloning github.com/browserbase/skills and copying skills/browser-trace/" >&2
  echo "into the same parent directory as autobrowse (e.g. ~/.claude/skills/browser-trace/)." >&2
  exit 1
fi

# a. SESSION SETUP — pre-create the keep-alive session and derive its connectUrl
sid=$(browse cloud sessions create --keep-alive --verified --proxies \
  | node -e "let s='';process.stdin.on('data',c=>s+=c).on('end',()=>process.stdout.write(JSON.parse(s).id))")
connect_url=$(browse cloud sessions get "$sid" \
  | node -e "let s='';process.stdin.on('data',c=>s+=c).on('end',()=>process.stdout.write(JSON.parse(s).connectUrl))")

RUN_ID="run-$(printf '%03d' "$N")"
TRACE_ROOT="./autobrowse/traces/<task-name>/$RUN_ID"
mkdir -p "$TRACE_ROOT"
export O11Y_ROOT="$TRACE_ROOT/.o11y"   # park browser-trace output inside the autobrowse run dir
export O11Y_RUN_ID="$RUN_ID"           # tells the browse CLI which run dir to write descriptors.ndjson into

# b. ATTACH BROWSER-TRACE — passive observer; runs in background
node "$BT_DIR/scripts/bb-capture.mjs" "$sid" "$RUN_ID" &
sleep 2

# c. RUN AUTOBROWSE — connectUrl flag tells evaluate.mjs to inject --cdp/--session
#    into every inner browse call. The inner agent never sees --remote.
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs \
  --task <task-name> --workspace ./autobrowse --env remote \
  --connect-url "$connect_url" --run-number "$N"

# d. STOP + BISECT + UNIFY — order matters; bisect needs the session to still
#    exist, and unify-trace joins the bisect output with autobrowse's 