
Claude Code subagents: the 2026 production playbook

Francesc · 12 min read

Figure: Lead agent fans out to five parallel Claude Code subagent lanes.

For work that runs on a schedule instead of a developer's keystroke, see our Anthropic Managed Agents production playbook; managed agents extend the subagent pattern with Anthropic-owned scheduling, dreaming, and outcome grading.

Claude Code subagents are isolated Claude instances that the main session spawns to work in parallel, each with its own context window, its own tool permissions, and its own model. In June 2026 Anthropic upgraded the pattern with Dynamic Workflows, where the lead agent can plan and fan out tens to hundreds of parallel subagents in a single session, and Performance Outcomes, where a separate grader sends each subagent back to revise until its result meets a rubric. This guide is the 2026 production playbook for using Claude Code subagents: when to spawn one, how to scope its tools, how to gate its output with a SubagentStop hook, and how to combine subagents with Skills and Hooks for an agent setup that holds up on real production code. It is the third piece in our trilogy of Claude Code production playbooks, following the Claude Code Skills production playbook and the Claude Code hooks production playbook.

Quick Answer

  • A Claude Code subagent is a fresh Claude instance the main agent spawns, with its own context window, its own tools, and its own model. The main agent gets only the subagent's final summary back.
  • Subagents are right for parallel research, parallel codebase modifications that touch independent files, and isolated tasks where you do not want the lead agent's context polluted by intermediate work.
  • The 2026 release lets the lead plan dynamically and fan out tens to hundreds of subagents in one session, with a grader that scores each result against a rubric and forces a revision when the result misses the bar.
  • Pair subagents with a SubagentStop hook to enforce non-negotiables (tests pass, no secrets in diff, no out-of-scope file writes) before the lead folds a result back in.
  • The decision rule we ship: Skill teaches the how, Hook enforces the rule, Subagent isolates the work. Use all three together for production agent setups.

For the broader context around how subagents combine with the new Claude Code Desktop, Routines, and Dynamic Workflows surfaced at Code with Claude Tokyo Day 1, see our Code with Claude Tokyo 2026 recap.

What a Claude Code subagent actually is

A subagent is not a thread, a fork, or a sub-routine. It is a fresh Claude instance that the lead agent spawns inside the same session. Each subagent gets:

  • Its own context window, so the lead's already-crowded transcript stays clean.
  • Its own tool permissions, so a research subagent cannot accidentally write files and a writer subagent cannot accidentally hit production APIs.
  • Its own system prompt, so you can specialise behaviour (a security-review subagent acts differently than a feature-writer subagent).
  • Optionally its own model, so a cheap fast model can do triage and a stronger model can do the hard work.

The lead never sees the subagent's intermediate steps. It receives a single summary back, the way a tech lead receives a pull-request description from a delegate. That asymmetric information flow is the whole point: it keeps the lead's reasoning room free for orchestration, not for re-reading grep output.

In practice you create a subagent definition by dropping a markdown file into .claude/agents/ with frontmatter for name, description, tools, model, and a system-prompt body. The lead reads the description string to decide when to delegate, the same way it reads tool descriptions to decide when to call a tool.
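As a minimal sketch of that layout, the shell function below scaffolds a read-only research subagent. The agent name `module-researcher`, its description, and the prompt body are hypothetical; the frontmatter keys are the ones named above (the optional `model` key is left out), and the tool names `Read` and `Grep` are borrowed from the reviewer definition later in this post.

```bash
#!/usr/bin/env bash
# Sketch: scaffold a subagent definition under <project>/.claude/agents/.
# The agent name, description, and prompt text are hypothetical examples.
write_agent_def() {
  local project_dir="$1"
  local agents_dir="$project_dir/.claude/agents"
  mkdir -p "$agents_dir"
  # Quoted heredoc delimiter: the file is written literally, with no expansion.
  cat > "$agents_dir/module-researcher.md" <<'EOF'
---
name: module-researcher
description: Read-only research on a single module. Use when the lead needs a summary of how a module works.
tools: Read, Grep
---

You investigate exactly one module and change nothing.
Return one paragraph: purpose, entry points, open risks.
EOF
  # Print the path so callers can inspect or commit the new file.
  echo "$agents_dir/module-researcher.md"
}
```

Calling `write_agent_def .` from a project root would create the file in place. Keeping the description specific matters more than the prompt body, because the description is what the lead reads when deciding whether to delegate.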

When to use a subagent (and when not to)

Subagents are a heavy mechanism. Each one costs a full context window and adds round-trip overhead. Use them when the savings outweigh the cost. Skip them when a Skill or a tool call would do the same job.

| Situation | Use a subagent | Use a Skill, Hook, or tool instead |
| --- | --- | --- |
| Five independent investigations (auth, db, api, infra, frontend) | Yes, one per area, in parallel | No, one transcript would interleave them |
| One file edit, one test run | No, lead can do it | Yes, no isolation benefit |
| Long-running scan that returns a 200-line report | Yes, summary-only return is the win | No, the lead does not need to read the scan |
| Repeated procedure (e.g. "always lint before commit") | No, this is a Skill | Yes, a Skill makes the lead consistent |
| Hard rule (e.g. "never push to main") | No, this is a Hook | Yes, a Hook enforces it deterministically |
| Refactor that touches 30 files in 6 modules | Yes, one subagent per module | No, single-context refactors lose track |

Three to five concurrent subagents is the sweet spot for most jobs. Beyond that you spend more time merging summaries than you save by running them in parallel. The 2026 Dynamic Workflows release lets the lead push that ceiling to tens or hundreds for tasks that genuinely fan out, like running a benchmark suite across 80 model-prompt combinations, but for everyday work the three-to-five rule still holds.

The trinity: Skills, Hooks, and Subagents in 2026

Our Skills and hooks production playbooks covered the first two pieces. Subagents close the loop. The decision rule we use internally at Totalum:

| Primitive | What it does | When to reach for it |
| --- | --- | --- |
| Skill | Teaches the lead a procedure | "I know what good looks like for this task, I want the lead to follow it every time" |
| Hook | Enforces a rule outside the model | "There is a non-negotiable I cannot trust the lead to remember" |
| Subagent | Isolates a unit of work | "The lead should not see the inside of this work, only the outcome" |

A Skill alone is a polite suggestion. A Hook alone is a rule with no taught procedure. A Subagent alone is delegation without enforcement. The combination is what makes the setup production-grade.

A concrete example. We have a Skill that teaches the lead our database-migration procedure. We have a Hook that runs on SubagentStop and blocks the subagent's return if the test suite did not pass. We have a Subagent definition for migration-writer that the lead spawns whenever it is asked for a schema change. The lead delegates the work, the subagent follows the procedure, the hook enforces the gate, and the lead only sees a clean "migration ready" summary or a clean "tests failed at step 3" failure. Nothing in the middle.

Spawning subagents in parallel

The single phrase that matters is "in parallel using separate subagents." If you do not say parallel, Claude Code will sometimes run the work sequentially, which defeats the purpose. Explicit prompt patterns we use:

  • "Research the auth, database, and API modules in parallel using separate subagents. Return a one-paragraph summary per module."
  • "Refactor the 6 service files in the billing/ directory in parallel using separate subagents. Each subagent owns exactly one file."
  • "Run the benchmark suite across these 8 prompt variants in parallel using separate subagents. Score each on accuracy and latency."

Inside Dynamic Workflows the lead can decide the parallelism itself. You can write a prompt like "decide how to parallelise the refactor and execute it" and the lead will plan, fan out, collect, and verify. The Performance Outcomes grader sits on top: you supply a rubric ("all tests pass, no new TODOs introduced, no public API changes"), and each subagent's result is graded in a separate context window. A failure sends the subagent back to revise. Anthropic reports that this revise loop lifted task-success rates by up to 10 points on its hardest internal benchmarks.

Three rules of thumb for parallel spawning that hold even with Dynamic Workflows:

  1. Independent units only. Two subagents writing to the same file at the same time will race. Plan the partition before you fan out.
  2. Per-subagent scope, per-subagent tools. A research subagent gets read-only file tools and web search. A writer subagent gets Edit and Bash but not network. Tight scopes mean one bad subagent cannot poison the rest.
  3. Bounded summaries. Tell each subagent the exact shape of the summary you want back. The lead has to merge them, so consistent shapes save a second pass.
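The first rule can be checked mechanically before any subagent is spawned. The sketch below assumes a hypothetical plan format of one `<subagent-name> <file-path>` pair per line; it is an illustration of the partition check, not something Claude Code provides, and it does not handle paths that contain spaces.

```bash
#!/usr/bin/env bash
# Sketch: detect overlapping file ownership before fanning out.
# Input on stdin: one "<subagent-name> <file-path>" pair per line
# (a hypothetical plan format; paths with spaces are not supported).
# Prints each file claimed by more than one subagent and returns 1 if any exist.
find_overlaps() {
  local overlaps
  # sort -u drops repeated identical pairs, so a subagent listing the same
  # file twice is not reported as a conflict with itself.
  overlaps="$(sort -u | awk '{ print $2 }' | sort | uniq -d)"
  if [ -n "$overlaps" ]; then
    printf '%s\n' "$overlaps"
    return 1
  fi
  return 0
}
```

If the function prints anything, two lanes would write to the same file, and the partition needs another pass before the fan-out starts.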

Gating subagent output with a SubagentStop hook

The SubagentStop hook fires when a subagent finishes and before its summary reaches the lead. In 2026 the hook's JSON output can include hookSpecificOutput.additionalContext, so you can extend the subagent's turn with new context instead of treating the hook as a binary pass-or-fail check. Common patterns:

  • Test gate. Run the test suite. If it fails, return exit 2 with a message; the subagent is sent back to fix it. If it passes, return exit 0 with an additionalContext message that says "tests passed at commit abc123" so the lead has audit-trail context.
  • Secret scrubbing. Grep the subagent's diff for API keys, JWTs, and connection strings. Block the return if any are found.
  • Out-of-scope write block. If the subagent was meant to edit billing/ but its diff touches auth/, block the return and ask the subagent to restate its plan.
  • Style enforcement. Run ruff format and eslint --fix. Re-run the diff check. Block if files were modified outside the subagent's stated scope.
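The out-of-scope write block in the list above reduces to a prefix check on changed paths. The helper below is a sketch under stated assumptions: the function name is hypothetical, the allowed directory is passed as a plain prefix such as `billing/`, and the changed paths arrive on stdin, for example from `git diff --name-only HEAD`.

```bash
#!/usr/bin/env bash
# Sketch: flag changed paths that fall outside a subagent's allowed directory.
# Reads one path per line on stdin, prints every path that does not start with
# the allowed prefix, and returns 1 if at least one such path was seen.
out_of_scope_paths() {
  local allowed_prefix="$1"   # e.g. "billing/"
  local path
  local status=0
  while IFS= read -r path; do
    [ -z "$path" ] && continue
    case "$path" in
      "$allowed_prefix"*) ;;                 # inside scope: nothing to report
      *) printf '%s\n' "$path"; status=1 ;;  # outside scope: report it
    esac
  done
  return "$status"
}

# Possible use inside a SubagentStop hook (exit 2 sends the subagent back):
#   if ! git diff --name-only HEAD | out_of_scope_paths "billing/" >&2; then
#     echo "diff touches files outside billing/; restate your plan" >&2
#     exit 2
#   fi
```

Sending the offending paths to stderr means the subagent is told exactly which files broke the scope, rather than just that something did.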

A minimal SubagentStop hook in shell:

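#!/usr/bin/env bash
# SubagentStop gate: exit 2 sends the subagent back; exit 0 lets the summary through.
set -euo pipefail

# Test gate. pytest output goes to stderr so stdout carries only the final JSON.
if ! pytest -q >&2; then
  echo "tests failed; subagent must fix before returning" >&2
  exit 2
fi

# Secret gate. Diff against HEAD to cover staged and unstaged edits to tracked files.
# Output is discarded (rather than using grep -q) so grep reads the whole diff and
# pipefail cannot mask a match, and so the matched secret is never printed.
if git diff HEAD | grep -E 'sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}' >/dev/null; then
  echo "secret detected in subagent diff" >&2
  exit 2
fi

echo '{"hookSpecificOutput":{"additionalContext":"tests passed, no secrets in diff"}}'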

Pair that file path with the subagent name in your hooks configuration. The lead never sees the test output, just the gate result and the additional-context line.
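One way to register a gate script is a SubagentStop entry in the project's hooks configuration. The function below prints a sketch of that block for merging into .claude/settings.json by hand. The script location .claude/hooks/subagent-gate.sh is a hypothetical path, and the use of the CLAUDE_PROJECT_DIR variable is an assumption worth checking against the current hooks reference. As written, the entry applies the gate to every subagent; restricting it to a single subagent name is not shown here.

```bash
#!/usr/bin/env bash
# Sketch: print a hooks block that registers a gate script for SubagentStop.
# The script path is hypothetical. This entry gates every subagent;
# per-subagent matching is deliberately left out of the sketch.
print_subagent_stop_config() {
  # Quoted heredoc delimiter: $CLAUDE_PROJECT_DIR is emitted literally so it is
  # expanded when the hook runs, not when this function runs.
  cat <<'EOF'
{
  "hooks": {
    "SubagentStop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/subagent-gate.sh"
          }
        ]
      }
    ]
  }
}
EOF
}
```

Printing to stdout instead of writing the file is deliberate: an existing settings file usually holds other keys, and overwriting it would drop them.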

A real subagent definition we use at Totalum

Here is a trimmed code-reviewer subagent definition we run internally. It is the kind of thing that pays for itself the first week.

---
name: code-reviewer
description: Review a pull request for security issues, dead code, and rule violations. Use after the feature subagent finishes its edits.
tools: Read, Grep, Bash
model: claude-haiku-4-5
---

You review pull requests for one purpose: catch what a tired tech lead would miss. Focus on:
1. Security: secrets, SSRF surfaces, SQLi, unsafe deserialization.
2. Rule violations: forbidden imports, missing migrations, schema drift.
3. Dead code: functions added in this diff that are never called.

Return a single markdown block:
- VERDICT: pass | fail
- Findings: bulleted, file:line specific
- Suggested fixes: one line per finding
Do not fix anything yourself. The lead decides what to apply.

We spawn this subagent automatically whenever the lead is about to mark a feature subagent's work as done. The feature subagent does the work, the reviewer subagent catches the issues, the SubagentStop hook enforces the test gate, and the lead sees three clean lines: feature done, reviewer pass, gate green.
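Because the reviewer is told to return a fixed shape, anything downstream can read the verdict without interpreting prose. The helper below is a sketch that assumes the summary contains a line exactly of the form `- VERDICT: pass` or `- VERDICT: fail`, as the definition above requests; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: extract the verdict from a reviewer summary supplied on stdin.
# Prints "pass" or "fail"; returns 1 when no well-formed VERDICT line exists.
reviewer_verdict() {
  local verdict
  # Take the first line that matches the requested shape exactly.
  verdict="$(sed -nE 's/^- VERDICT: (pass|fail)[[:space:]]*$/\1/p' | head -n 1)"
  [ -n "$verdict" ] || return 1
  printf '%s\n' "$verdict"
}
```

Treating a missing or malformed verdict as a failure is the safer default: a reviewer that ignored the output format has probably ignored other instructions too.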

Subagents and the Claude Agent SDK

If you are building your own agent surface (a SaaS embed, a CLI, an internal tool), the same primitives are available through the Claude Agent SDK. You declare subagent specs the same way, you supply your own hook scripts, you control the model selection per subagent, and you get the same Dynamic Workflows fan-out for free if the lead is a 4.7-class model.

A common SDK pattern we see in production: the parent application boots a lead agent for an end-user request, the lead spawns three parallel subagents (research, draft, review), each subagent has scoped tools matching what the user has paid for, and a SubagentStop hook gates anything that touches billable resources. Everything billed is gated, everything gated is logged, and the lead's context stays focused on coordination.

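The Agent SDK exposes the same agent harness that Claude Code itself runs on. Other tools such as Cline and OpenAI's Codex are separate products with their own agent loops, so subagent support differs across surfaces. If you have been comparing tools, the Cline vs Claude Code breakdown covers those differences.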

Subagents and MCP: parallel app builds with Totalum

A pattern we use ourselves: spawn one subagent per Totalum app build, each with its own MCP connection to the Totalum MCP server. The lead supplies a project list, each subagent owns one project end to end, and a SubagentStop hook verifies the deploy succeeded before returning the URL.

The fan-out shape is the obvious win. Five client apps that used to ship in five sequential sessions ship in parallel inside one. The lead reads five short summaries instead of sitting through five full build sessions. The work that used to take half a day takes the time of the longest single build.

The hook is what makes this safe. The SubagentStop hook hits Totalum's deploy-status endpoint, refuses to return until the deploy is green, and writes the URL into hookSpecificOutput.additionalContext. The lead can then update a tracker, post to Slack, or hand the URL back to the user without having to re-derive it.
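The wait-for-green step can be sketched as a bounded polling loop. Everything specific here is an assumption: the status value `green`, the attempt limit, the delay, and the status command itself, which stands in for whatever call reports deploy status. The actual Totalum endpoint is not shown.

```bash
#!/usr/bin/env bash
# Sketch: poll a status command until it prints "green" or attempts run out.
# Usage: wait_until_green <max_attempts> <delay_seconds> <status command...>
# The status command is any command that prints the current deploy status;
# the "green" value and the limits are assumptions, not Totalum's API.
wait_until_green() {
  local max_attempts="$1"
  local delay_seconds="$2"
  shift 2
  local attempt=1
  local status
  while [ "$attempt" -le "$max_attempts" ]; do
    # A failing status command counts as "not green yet" for this attempt.
    status="$("$@" 2>/dev/null || true)"
    if [ "$status" = "green" ]; then
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay_seconds"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

The attempt limit is a design choice in this sketch: a hook that waits forever on a stuck deploy would hang the subagent, so the loop gives up and lets the hook exit 2 with a message instead.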

Common failure modes and how to avoid them

We have spent enough time running subagents to know the patterns that go wrong.

  • Too many subagents. Past five concurrent for everyday work, the lead spends more time merging summaries than the parallelism saves. Plan the partition first.
  • Shared state. Two subagents writing to the same file race. If you need cross-subagent state, write it to a designated path and let a third subagent merge it.
  • Loose tool scopes. A research subagent with Edit permission will, eventually, edit. Give every subagent only the tools it needs.
  • No SubagentStop gate. Without a gate, a bad subagent return becomes the lead's problem to discover. With a gate, the bad return is forced into a fix-it loop before the lead ever sees it.
  • No rubric for Dynamic Workflows. If you use Dynamic Workflows without a Performance Outcomes rubric, the lead has no way to grade the fan-out and you lose the verification benefit.
  • Context bloat in the subagent system prompt. Subagents are cheap on the lead's context but not on their own. Keep their system prompts tight and put procedure in a Skill instead.

FAQ

Are Claude Code subagents the same as the Task tool?

Nearly. The Task tool is the lead's way of spawning a subagent, and the subagent runs the work in its own context window. The difference is in how you scope it. A subagent definition in .claude/agents/ is a reusable specialist with a fixed system prompt and tools. A bare Task call is ad hoc: its instructions are whatever the lead writes at that moment. Use definitions for repeat work, bare Task calls for one-offs.

How many subagents can run in parallel in 2026?

For everyday code work, three to five is the sweet spot. With Dynamic Workflows the lead can plan and run tens to hundreds in one session for tasks that fan out cleanly (benchmark suites, programmatic edits across many independent files). For work that does not partition that cleanly, summary-merging overhead eats the parallel gains.

Do subagents share context with the lead?

No. Each subagent has its own context window. The lead sees only the summary the subagent returns. That is the whole point: it keeps the lead's transcript clean for orchestration.

When should I use a Hook instead of a Subagent?

Use a Hook when the rule is deterministic and must always fire. Use a Subagent when the work is non-trivial and benefits from isolation. They compose: a Subagent does the work, a SubagentStop Hook gates the result. See the Claude Code hooks production playbook for the hook side.

What is the Performance Outcomes feature?

A grading layer over Dynamic Workflows. You supply a rubric of what "good" looks like. Each subagent's result is graded in its own context window by a separate evaluator. A failure sends the subagent back to revise until the result meets the rubric. Anthropic reported up to a 10-point lift on the hardest internal benchmarks.

Can subagents call MCP servers?

Yes. Subagents inherit MCP access from the project the lead is running in, subject to the tool scope you set in the subagent definition. A subagent restricted to read-only MCP tools cannot accidentally mutate state, which is the safe default for research subagents.

Ready to build with Totalum

Totalum is an AI app builder for humans and for agents. It natively supports the same primitives Claude Code uses (Skills, Hooks, Subagents, MCP), so the agent setup you build for code carries over to the apps you build for users. If you want to try the parallel-build pattern in this post, register at totalum.app and connect your Claude Code session to the Totalum MCP server. The free tier is enough to ship a real app.

Francesc

Writes for the Totalum blog about AI app building, no-code development, and product engineering.
