AI Coding Agents

Claude Agent SDK in 2026: What It Is, When To Use It, and How To Ship It With Totalum

FrancescMay 31, 202615 min read

Claude Agent SDK and Totalum integration diagram showing auth, payments and deploy

The Agent SDK now sits underneath Anthropic Managed Agents in 2026, which add a scheduler, dreaming pass, and rubric-based outcome grading on top of the same primitives.

The Claude Agent SDK is Anthropic's library for building production AI agents on top of the same harness that powers Claude Code. It was renamed in September 2025 from the Claude Code SDK and ships in Python and TypeScript flavors, with a bundled CLI binary, subagents, sessions, MCP support, and a hosted execution model that bills against the same plans you already pay for. In 2026 it has become the default way developers wire Claude into long-running, tool-using workflows, and search demand for "claude agent sdk" has gone from 50 monthly searches in May 2025 to 14,800 in April 2026, a near 50,000 percent year over year jump. This guide is for developers and indie hackers who want to know exactly what the Claude Agent SDK is, when to pick it over the alternatives, what the June 15, 2026 metering change actually does, and how to ship Agent SDK projects into production faster by pairing the SDK with Totalum, the AI app builder, with a separate deep-dive available in our Claude Agent SDK credits 2026 breakdown covering the official Anthropic credit pool amounts and overage behavior for humans and agents. For the recipe layer that lives next to the SDK (folder-based Skills Claude Code discovers on each session), see our complete guide to Claude Code Skills in 2026.

If you are wiring subagents through the SDK, the Claude Code subagents production playbook covers the parallel orchestration patterns and the SubagentStop gating we use in production.

To enforce contracts on what those subagents return (sources cited, test diffs included, no leftover background tasks), see our Claude Code hooks production playbook and the SubagentStop pattern.

Quick Answer

The Claude Agent SDK is Anthropic's official library for building autonomous AI agents that can read files, run shell commands, search the web, edit code, and call MCP servers, using the same harness that powers Claude Code.
It was renamed from "Claude Code SDK" in September 2025 to signal that it is not only for coding agents.
Pick the Claude Agent SDK when you need a real agent loop with tool use, subagents, and sessions, on top of Sonnet 4.6 or Opus 4.7, with the option to run on top of your existing Claude Pro, Max, or Anthropic API credits.
From June 15, 2026, Claude Agent SDK and Claude Code GitHub Actions usage is metered separately from interactive Claude Code, which makes Agent SDK costs predictable but pushes more agentic work into per-token billing. See our Claude Code pricing 2026 breakdown for the full table.
Pair the Agent SDK with Totalum when you want the agent's output to become a real, deployable Next.js application with built-in auth, payments, file storage, database, and a custom domain, without writing the production scaffolding yourself.

What the Claude Agent SDK actually is in 2026

The Claude Agent SDK is a library, not a model. It exposes Anthropic's internal agent harness as code you can call. You give it a system prompt, a set of tools (built-in and MCP), and a goal, and it runs the autonomous loop: the model decides what tool to call, the SDK executes the call, the result is fed back into context, and the loop continues until the model returns a final answer or hits a stop condition.

A few facts worth getting right, because the SERP is full of stale articles:

The SDK was originally shipped in September 2025 as the Claude Code SDK, then renamed to Claude Agent SDK later that month so that teams would stop assuming it was coding only. The underlying harness is the same one Claude Code uses.
It is published as claude-agent-sdk on npm and PyPI. The Python package is claude-agent-sdk-python, and the TypeScript package is @anthropic-ai/claude-agent-sdk.
The Python package bundles the Claude Code CLI automatically, so you do not need a separate Claude Code install to use the SDK from Python.
Out of the box you get: file editing tools, bash execution, web search, web fetch, a tool-use loop with optional human-in-the-loop checkpoints, subagents (delegated child agents with their own context), persistent sessions, and first-class Model Context Protocol (MCP) client support.
It runs against any current Claude model. As of May 2026 that includes Sonnet 4.6, Opus 4.7, and the freshly-released Opus 4.8, which Anthropic published the same week as this article.

In one sentence, the Claude Agent SDK is the infrastructure layer of Claude Code, exposed as a library you can point at any problem. That framing came from the ksred.com explainer in March 2026 and it is the cleanest mental model.

> Model status note, June 13, 2026: Claude Fable 5 access was suspended globally on June 12. Until restored, default the SDK's model parameter to claude-opus-4-8. Our Claude Fable 5 production playbook describes the three integration patterns and the Opus 4.8 fallback path.

When to use the Claude Agent SDK

The Claude Agent SDK is not the only way to call Claude. Here is the practical decision matrix.

Claude Agent SDK vs claude.ai

Use claude.ai (the chat product) when a human is in the loop, the work is one or two turns long, and you do not need programmatic control. It is the right tool for ad-hoc analysis, writing help, and exploration. It is the wrong tool for any workflow you want to run on a schedule, behind an API, or as a feature inside your product.

Use the Claude Agent SDK when you need code to drive the conversation, hand the model tools, and get a structured result back. If the work involves "read these 200 files, run these checks, write this report, commit the result," it is an Agent SDK job.

Claude Agent SDK vs Anthropic MCP

This is the question most developers get wrong. The Model Context Protocol and the Claude Agent SDK are complementary, not competing.

MCP is a protocol for exposing tools, resources, and prompts to a Claude client. You write an MCP server once, and any MCP-aware client (Claude Code, the Claude Agent SDK, the desktop apps, third-party tools) can use it.
Claude Agent SDK is a client that consumes MCP servers, plus a runtime that wraps the agent loop, plus the built-in toolset. It is one of the things you point your MCP server at.

A good rule of thumb: if you are exposing a capability that should be reusable across clients, write it as an MCP server. If you are orchestrating a workflow that calls multiple tools, write it inside an Agent SDK program that loads those MCP servers.

Claude Agent SDK vs raw Anthropic API

The raw /v1/messages API gives you a single turn of model inference. You send a prompt, you get a completion. If you want tool use, you have to implement the loop yourself: parse the model's tool calls, execute them, send the results back, repeat. You also have to implement context window management, subagents, sessions, file I/O, and shell execution from scratch.

The Claude Agent SDK does all of that for you. The cost is a small amount of additional opinion (the SDK picks the agent loop shape, the tool result format, and the system prompt scaffolding). For most agent use cases the trade is correct: a 20-line SDK program replaces a 600-line custom harness, and the SDK harness has been hardened by every Claude Code user in the world.

You should still use the raw API when the work is one or two turns, when you need fully custom tool plumbing that the SDK does not allow, or when you are building infrastructure that itself competes with the SDK.

The June 15, 2026 metering change

On April 2026, Anthropic announced that Claude Agent SDK and Claude Code GitHub Actions usage would be metered separately from interactive Claude Code starting June 15, 2026. The list prices on Pro, Max 5x, and Max 20x did not change. The quota knobs did.

What is actually changing on June 15:

Interactive Claude Code (the terminal you type into) and headless Claude Code (Agent SDK, GitHub Actions, scheduled jobs) draw from separate weekly token pools on the subscription plans.
The pool sizes are sized so that a normal interactive user's budget is not affected.
Heavy headless users (long-running Agent SDK jobs, CI pipelines that run Claude on every PR) will hit limits sooner and may need to top up via API credits.
API customers on direct token billing are unaffected. You pay per token at the model's posted rate.

The honest read: this is not a price hike, it is a quota carve-out. Anthropic is separating "human in front of a keyboard" usage from "agent running on a schedule" usage so that the second class of usage does not silently consume the first. If you are building anything serious on the Agent SDK, plan on API-credit billing as the steady-state, with the Pro or Max subscription as a development convenience.

We covered the full plan-by-plan math in our Claude Code pricing 2026 breakdown. The 5-tier table, the prompt-caching 90 percent discount lever, and the bundled-credit AI-app-builder alternative are all in that post.

Driving Totalum projects from the Claude Agent SDK

Totalum is an AI app builder that turns a single MCP call (or REST call) into a real, deployable Next.js + TotalumSDK application with built-in auth, payments, database, file storage, AI integrations, deployment, and an optional custom domain. The same builder that powers the Totalum web product is exposed via MCP, which means a Claude Agent SDK program can drive it.

Here is the minimal Python example. It connects the Claude Agent SDK to the Totalum MCP server, then asks the agent to scaffold and deploy a fully working app.

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    options = ClaudeAgentOptions(
        model="claude-opus-4-8",
        mcp_servers={
            "totalum": {
                "command": "npx",
                "args": ["-y", "@totalum/mcp"],
                "env": {"TOTALUM_API_KEY": "tlm_..."},
            }
        },
        allowed_tools=[
            "mcp__totalum__createProject",
            "mcp__totalum__runAgent",
            "mcp__totalum__deploy",
            "mcp__totalum__getDeploymentStatus",
        ],
        system_prompt=(
            "You are a senior full-stack engineer. Use the Totalum MCP "
            "to scaffold and deploy a working Next.js + TotalumSDK app."
        ),
    )

    async for message in query(
        prompt=(
            "Create a new Totalum project called 'beauty-booking', "
            "scaffold a salon booking app with auth, a customers "
            "table, a services table, an appointments table, a "
            "Stripe checkout flow, and an admin dashboard. Deploy "
            "the app and return the live URL."
        ),
        options=options,
    ):
        print(message)

asyncio.run(main())

What the agent does, in order:

Calls mcp__totalum__createProject to get a fresh Next.js + TotalumSDK project.
Calls mcp__totalum__runAgent repeatedly to scaffold pages, components, tables, auth, and the Stripe checkout flow.
Calls mcp__totalum__deploy to push the app to Totalum's hosted environment.
Polls mcp__totalum__getDeploymentStatus and returns the live *.totalum.app URL when the build is green.

You did not write any Next.js code. You did not configure NextAuth. You did not stand up a Postgres instance. You did not write a Stripe webhook handler. The Agent SDK drove Totalum to do all of it, in roughly the time it takes to read this section.

If you want the same shape in TypeScript, install @anthropic-ai/claude-agent-sdk and use the equivalent options object. The SDK API surface is symmetric across both languages.

When Agent SDK plus Totalum beats raw API integration

Most developers who pick the Claude Agent SDK do so because they want to ship a product, not because they want to write more agent plumbing. The pairing of Agent SDK and Totalum is the fastest path from a working agent to a working revenue-generating app, because each piece does exactly what it is best at.

The Agent SDK handles:

The agent loop, tool calls, and context management.
The conversation between Claude and your custom tools.
The harness that has already been battle-tested by every Claude Code user.
Streaming, hooks, subagents, sessions, and human-in-the-loop checkpoints.

Totalum handles:

The production app the agent is producing: Next.js source code, database schema, auth, payments, file storage, deployment, custom domain.
The runtime that hosts the resulting app.
The MCP and REST surface that the Agent SDK calls.
The "owned and exportable" output (you can clone the repo to your GitHub at any time).

If you build the same project on the raw Anthropic API plus your own scaffolding, you replicate the agent loop, you write your own Next.js boilerplate, you self-host the database, you wire NextAuth and Stripe by hand, and you spend two weeks setting up CI and deployments. The cost is the eight weeks of engineering you do not get back.

Comparison table: Agent SDK plus direct hosting vs Agent SDK plus Totalum

Capability	Agent SDK + direct hosting	Agent SDK + Totalum
Agent loop, tool calls	Built-in	Built-in
MCP client	Built-in	Built-in
Subagents and sessions	Built-in	Built-in
Production app scaffold	You write it (Next.js, Vite, FastAPI, your call)	One MCP call generates a real Next.js + TotalumSDK repo
Auth	NextAuth, Clerk, Auth0 (you wire it)	Built-in, OAuth providers preconfigured
Payments	Stripe SDK, webhooks (you wire it)	Built-in Stripe integration, checkout flow scaffolded
Database	Postgres, Supabase, etc. (you stand it up)	Built-in TotalumSDK database with admin panel
File storage	S3 or equivalent (you configure it)	Built-in storage, signed URLs out of the box
Deployment	Vercel, Fly, Render, AWS (you set it up)	One `deploy` MCP call, hosted Next.js runtime
Custom domain	You configure DNS and certs	One `addCustomDomain` MCP call, certs handled
Repo ownership	Always yours	Always yours (push to GitHub anytime)
Time from prompt to live URL	Hours to weeks	Minutes
Where the agent ends	A code repo and a TODO list	A running, billable product

The bottom row is the one to remember. Without Totalum, the Agent SDK ends a successful run with a code repo and a checklist of things you still have to set up. With Totalum, the same agent ends with a URL your users can pay you on.

When you should not use the Claude Agent SDK

The Agent SDK is the right tool for a large class of jobs. It is not always the right tool.

One-shot completions: if you only need a single inference (classify this text, summarize this email, extract these fields), use the raw Anthropic API or a smaller model. The Agent SDK adds overhead you do not need.
Strict latency budgets: an agent loop has variable runtime. If you need a deterministic response in under 500 milliseconds, the Agent SDK is not your tool.
Non-Claude models: the SDK is Claude-shaped. If your stack is multi-model (Claude, OpenAI, Gemini, open-source) and you need a single SDK across all of them, LiteLLM in front of the Agent SDK can work, but a vendor-neutral agent framework may be a cleaner fit.
Vendor-locked workflows on top of OpenAI: if your team already uses the OpenAI Agents SDK end to end, the switching cost is real. Run a small pilot on the Claude Agent SDK first.

If your work fits one of these patterns, the right answer is usually "smaller surface, simpler tool." Save the Agent SDK for the workflows that actually want an agent.

For the MCP layer that pairs with the SDK, see our Claude Code MCP servers in 2026 write-up with the seven defaults and the production integration pattern.

Once your agent can drive Totalum, the next step for SaaS teams is to expose that builder to end customers. We cover the architecture in how to embed an AI app builder in your SaaS via API in 2026.

FAQ

What is the difference between Claude Agent SDK and Claude Code?

Claude Code is the developer-facing terminal product (and IDE plugin) you launch with the claude command. The Claude Agent SDK is the underlying library that powers Claude Code, exposed for you to build your own agents on top of. Claude Code is what you use; Claude Agent SDK is what Claude Code is built with. They share the agent loop, the toolset, and the model surface.

Can I use the Claude Agent SDK with my Claude Pro or Max subscription?

Yes. The SDK can authenticate against your Pro, Max 5x, or Max 20x subscription via the bundled Claude Code CLI. From June 15, 2026 onward, headless SDK usage draws from a separate weekly token pool on those plans, sized so casual SDK use stays within plan limits. Heavy headless work is better billed via API credits.

Does the Claude Agent SDK support Sonnet 4.6 and Opus 4.7?

Yes, and as of May 28, 2026 it also supports Opus 4.8, which Anthropic shipped with adaptive thinking only, no sampling parameters, and a 1M context window at standard API pricing. The 4.7 to 4.8 migration is a model ID swap plus prompt re-tuning, with no new breaking changes.

How do I deploy Claude Agent SDK apps to production?

The SDK runs anywhere Python or Node runs, so you can host the agent itself on any standard runtime (Vercel functions, AWS Lambda, Fly, your own VM). The harder part is the app the agent produces. The two clean paths are (1) target a hosted platform like Totalum that exposes deployment as an MCP call, so the agent itself can ship the app, or (2) target a code repository and a CI pipeline you own, and accept the extra setup cost.

Is the Claude Agent SDK free?

The SDK code is open source and free to install. Running it consumes Claude model tokens, which are billed either against your subscription or against API credits. As of May 2026, expect roughly $3 per million Sonnet 4.6 input tokens and $15 per million output tokens, with a 90 percent discount on cached input tokens. The post-June 15 plan changes do not move list prices, only quota allocation.

What is the easiest way to ship a SaaS built on the Claude Agent SDK?

Pair the Agent SDK with Totalum. The SDK drives the agent loop and tool use. Totalum produces and hosts the actual Next.js + TotalumSDK application your users see. You skip auth, payments, storage, database, and deployment work, and you keep full code ownership. Start free at totalum.app, or book a discovery call if you are a SaaS team wanting to embed the same builder behind your own product.

Ready to ship with the Claude Agent SDK and Totalum?

If you are a developer or indie hacker building an agent today, the path is short: install claude-agent-sdk, wire the Totalum MCP, and let the agent produce a deployable application instead of a TODO list. You can register a free Totalum account at totalum.app and start running Agent SDK programs against it in the same session.

If you are a SaaS team or a software agency that wants to embed an AI app builder behind your own product, the Calendly call is faster than reading the docs. We will walk you through the Totalum API and MCP surface, show you how the Agent SDK drives it end to end, and answer the deployment, multi-tenant, and billing questions specific to your stack. Book a 30-minute call at calendly.com/cuentas-speedparadigm.

For context on the rest of the AI coding agent landscape, our best AI coding agents 2026 pillar covers the field, and the Totalum AI agent platform overview explains where the SDK fits into the larger build, ship, and run stack.

For the iOS 27 / Xcode 27 distribution surface that Anthropic just gained at Apple WWDC 2026, see Apple Intelligence Extensions in iOS 27.

Francesc

Writes for the Totalum blog about AI app building, no-code development, and product engineering.