Agents·June 8, 2026·7 min read

The Anthropic SDK Middleware: Stop Writing Your Own Tracing Wrappers

The Anthropic SDK shipped a native middleware API, the agent SDK pushed 10 releases in 7 days, and Nuxt 4.4.7 is a security hotfix. Quarterly dependency reviews are now too slow for production AI.

Ikki

Last verified · June 8, 2026

The Anthropic SDK Middleware: Stop Writing Your Own Tracing Wrappers

Three things shipped. One pattern.

@anthropic-ai/sdk 0.101.0 landed June 5 with a middleware API. @anthropic-ai/claude-agent-sdk 0.3.168 landed June 6 — the tenth release in seven days. Nuxt 4.4.7 landed June 2 as a security hotfix.

None of these is a headline event on its own. Together they make a point about production AI maintenance in mid-2026: it's weekly, it compounds, and passive semver ranges no longer cut it.

Teams treating their Anthropic SDK as "install once, update quarterly" are now several features behind. This week, at least one of those features is something they've been building by hand.

The middleware API: your logging wrapper is now technical debt

@anthropic-ai/sdk 0.101.0 ships a middleware API that intercepts every HTTP request the client makes.

The surface is an array of fetch-level middleware passed when you construct the client: new Anthropic({ middleware: [fn] }). Each middleware has the signature async (request, next, ctx) => Response and wraps every HTTP attempt, retries included. You install it once; every call (messages.create, tool executions, streaming) flows through the same layer — no per-call wrapper, no decorator pattern, no custom fetch override.

One registration, and you have a clean interception point for logging, tracing, retry logic, circuit-breaking, or a near-zero-config Langfuse integration.

0.102.0, released June 6, fixed the execution order: middleware now runs before request signing. That ordering matters for anything that reads or modifies headers pre-signing. If you installed 0.101.0, pin to 0.102.0.

The practical implication isn't subtle. Most teams running production AI already have some version of this layer: a wrapper around the SDK client, a custom fetch, a decorator that logs inputTokens / outputTokens / latency before passing through.

Some of those wrappers are clean. Most have accumulated edge cases. What happens on retry? Does the logger double-count tokens on a stream error? Does it compose correctly with the SDK's own exponential backoff?

Here's the shape of the thing most teams wrote by hand, and what replaces it:

// Before: a wrapper every call has to go through
async function tracedCreate(client, params) {
  const start = performance.now();
  const res = await client.messages.create(params);
  log({ tokens: res.usage, ms: performance.now() - start });
  return res;
}

// After: one fetch-level middleware, set once at construction.
// Signature (request, next, ctx); it runs per HTTP attempt,
// INSIDE the SDK's retry loop.
const tracing = async (request, next, ctx) => {
  const start = performance.now();
  const response = await next(request);
  log({ status: response.status, ms: performance.now() - start });
  return response;
};
const client = new Anthropic({ middleware: [tracing] });

(API verified against 0.102.0, core/middleware.ts. The middleware is fetch-level: it sees the HTTP request/response, not the parsed usage object — to count tokens, read the response body without consuming it, or keep that accounting at the messages.create result level.)

One caveat: the middleware runs inside the retry loop — it fires once per HTTP attempt. A call retried 3 times therefore invokes the middleware 3 times, so a naive logger over-counts on retries (the opposite of a wrapper around messages.create, which only sees the final result). The upside: you observe every attempt — valuable for tracing retries — without re-implementing backoff. The trade-off: aggregate or dedupe per logical request if you're counting cost or tokens. Verify this behavior under your own load before deleting the old code.

For teams on Langfuse, Helicone, or a homegrown observability layer, the migration path is the same: replace the wrapper with a middleware registration. The custom wrapper built in Q1 looks a lot like technical debt as of June 5.

The agent SDK: 10 releases in 7 days, two that matter

@anthropic-ai/claude-agent-sdk went from 0.3.158 on May 30 to 0.3.168 on June 6 — ten releases in seven days.

Most carry the same label as prior weeks: parity with Claude Code. The SDK mirrors the CLI surface; when the CLI ships daily, so does the SDK. Two changes are worth specific attention.

MCP resource tools at runtime. A fix landed for resource tools not being injected when servers are added at runtime via mcp_set_servers. If your agents build their MCP server list dynamically — adding servers based on context, task, or user — this was a silent failure mode.

Resource tools registered after initialization weren't reaching the model context. Agents ran tasks without tools they should have had, and the SDK gave no signal. The fix landed in the 0.3.162–0.3.165 window; if you need the precise release, diff the changelog before pinning.

Refusals are now explicit. stop_reason: "refusal" now propagates on refused messages. Previously a refusal looked like a normal completion — the model returned, the code moved on, and unless you were inspecting message content carefully you'd process the refusal as a successful response.

With stop_reason: "refusal", the signal is explicit. Any pipeline doing routing or fallback on safety grounds can branch cleanly without text inspection — a small change that removes a whole class of brittle string-matching.

If you're more than three minor versions behind 0.3.168, you're missing at least one of these. The check is one line: npm ls @anthropic-ai/claude-agent-sdk in your production environment.

Nuxt 4.4.7: a security patch that can't wait for next sprint

Nuxt 4.4.7 shipped June 2 as an explicit security release. Three issues: a bypass of the noSSR payload behavior in Nitro, a potential path traversal in Vite's allowDirs configuration, and a boundary check in the build cache.

The noSSR bypass and the path traversal are the two to read closely. The noSSR bypass means server-side rendering could be triggered in contexts where the app explicitly opted out — relevant if your SSR path has different data access or session handling than your CSR path.

The Vite allowDirs issue is primarily a dev-mode concern, but shared dev/prod Vite configs are common enough to warrant patching regardless.

Upgrade path: nuxt is the only package that changes. Lock to 4.4.7, run your install, verify the build. In a monorepo with a workspace catalog, it's one line in pnpm-workspace.yaml. This isn't a next-sprint situation — Nuxt has explicitly superseded 4.4.6 for security reasons.

What to actually do this week

Three concrete moves, in priority order:

Patch Nuxt to 4.4.7 today. Security release, single-package change, low blast radius.
Pin the Anthropic SDK to 0.102.0 and check whether your tracing wrapper can be retired in favour of middleware. Keep both running in parallel for a day, compare the logs, then delete the wrapper.
Run npm ls @anthropic-ai/claude-agent-sdk and upgrade if you're more than three minors behind 0.3.168.

Then make the recurring version of this automatic. A weekly CI check is roughly fifteen lines: pull the installed version with npm ls --json, pull the latest from the registry with npm view <pkg> version, and fail (or warn) when any @anthropic-ai/* or @google/genai package is more than five minor versions behind. Run it on a Monday cron, not on every push — the goal is a weekly nudge, not a blocked pipeline.

What we're betting on next week

The middleware API is the clearest signal yet that Anthropic treats the SDK as managed infrastructure, not a thin client wrapper. The open question is whether the agent SDK gets a parallel middleware layer or stays event-hook based. The 0.3.x velocity suggests the answer won't take long.

We're also watching @google/genai, which hit 2.8.0 on June 3 — the eighth minor since 2.0.0 on May 7. One version every five days. Teams still on 2.6.0 are two minors behind, which at this cadence plausibly means two weeks of Gemini API alignment they don't have.

The broader pattern holds: the AI SDK surface is moving faster than most dependency-update cycles. Quarterly reviews are structurally too slow for these packages. The teams that stay current aren't more diligent — they've just moved the check from a calendar reminder into CI.

If you want a second pair of eyes on which parts of your Anthropic stack should stay custom and which are now better off managed, get in touch.

Work with Ikki

Audit your agent infrastructure?

We map which parts of your Anthropic stack should stay custom and which become managed in light of recent SDK shifts — dependency scan and tracing setup delivered in two days.

Start a project See our work

Agents

The Week a Government Cut Off Anthropic's Best Model

Fable 5 suspended by US export control on June 12, two legacy models retired June 15, and the SDK shipped model fallback for every failure mode. Same week: the risk and the answer. Here's how to harden your stack.