Why MCP 2026-07-28 Spec Drops Sessions and Goes Stateless
May 24, 2026 · 12 min read

The MCP 2026-07-28 spec release candidate drops sessions in favor of a stateless protocol core. Here is why it changed, what breaks, and how to plan your migration.

The MCP team locked the 2026-07-28 spec release candidate on May 21. It is the largest revision since launch and, yes, it breaks things. If you are running an MCP server today, you need to understand what changes before the spec is finalized on July 28.

I keep seeing people on X call this "MCP 2.0". It is not. The spec uses date versions, not semver. The official name is the 2026-07-28 specification, and it stays a release candidate until the final freeze on July 28, 2026. The roughly ten-week window between now and then is for SDK maintainers and server authors to validate against real workloads.

The big shift is that MCP is going stateless. The current 2025-11-25 spec treats every client-server connection as a session with handshake, identifier, and lifecycle. The new spec rips most of that out at the protocol layer. Below I walk through what changed, what broke, and whether you should rush your migration or wait.

What is the MCP 2026-07-28 spec release candidate?

The 2026-07-28 release candidate is the next major version of the Model Context Protocol, locked on May 21, 2026 and due to be finalized on July 28, 2026. It introduces a stateless protocol core, a formal Extensions framework, the Tasks extension for long-running work, MCP Apps for server-rendered UIs, and OAuth-aligned authorization.

The official MCP roadmap calls this revision "the largest revision of the protocol since launch". Maintainers of Tier 1 SDKs (the official Anthropic-maintained Python and TypeScript SDKs) are expected to ship support within the ten-week validation window. If you maintain a server or build agent infrastructure on top of MCP, this is the version you target next.

A subtle but useful detail: the MCP team shifted its 2026 roadmap from release-milestone organization to priority-area focus. Four priority areas drive the spec now: Transport Evolution and Scalability, Agent Communication, Governance Maturation, and Enterprise Readiness. Almost every concrete change in the 2026-07-28 spec maps directly to one of those four. That alignment matters because it tells you what to expect in the next spec drop too.

Why was the 2025-11-25 stateful protocol hard to scale?

The 2025-11-25 spec required every connection to start with an initialize/initialized handshake. The server returned a session identifier in an Mcp-Session-Id response header, and every subsequent request from that client had to carry the same header so the server could route it back to its session state.

That design is fine for a single-process MCP server running on a developer laptop. It is painful in production. The MCP roadmap calls the problem out directly: "Stateful sessions fight with load balancers, horizontal scaling requires workarounds." When you put a load balancer in front of stateful MCP servers you have three bad options.

You can run sticky sessions, which break whenever an instance restarts or scales down. You can share session state in Redis or a similar store, which adds a network hop and a single point of failure to every tool call. Or you can have the gateway read Mcp-Session-Id out of every request and route traffic by hand, which couples your gateway to your private cluster topology.

In my own work running an MCP server behind a Vercel Function I hit this within a week. Functions are stateless by design. The first attempt to add MCP routing died on cold starts because the next invocation had no idea which session ID the client was carrying. Working around it meant pushing every session into Upstash Redis and eating the round trip on every tools/list call. Not fun.

The pattern is the same for any horizontally scaled deployment. Cloud Run, AWS Lambda, Kubernetes with HPA, even traditional VMs behind an L7 load balancer all run into the same wall. The old spec quietly assumed long-lived processes with shared memory. The new spec assumes nothing.
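The failure mode is easy to reproduce in a few lines. The following is a toy simulation, not real MCP code: StatefulInstance and run_behind_round_robin are hypothetical names, and each instance keeps its sessions only in its own memory, the way the old spec quietly assumed.

```python
from itertools import cycle


class StatefulInstance:
    """One server process that keeps sessions only in its own memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: set[str] = set()

    def initialize(self) -> str:
        # Hand out a session id that only this instance knows about.
        session_id = f"{self.name}-session-{len(self.sessions) + 1}"
        self.sessions.add(session_id)
        return session_id

    def tools_list(self, session_id: str) -> str:
        if session_id not in self.sessions:
            return "error: unknown session"
        return "ok"


def run_behind_round_robin() -> list[str]:
    """Send initialize plus two tools/list calls through a round-robin balancer."""
    balancer = cycle([StatefulInstance("a"), StatefulInstance("b")])
    session_id = next(balancer).initialize()  # lands on instance a
    # The next two requests land on instance b, then on instance a again.
    return [next(balancer).tools_list(session_id) for _ in range(2)]


print(run_behind_round_robin())
```

The first tools/list call reaches the instance that never saw the handshake and fails, while the second one happens to land back on the owner and succeeds. Sticky sessions and shared stores exist to paper over exactly that inconsistency.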

How does the stateless protocol core actually work?

The new spec removes the initialize/initialized handshake and the Mcp-Session-Id header from the base protocol. Each MCP request now carries everything the server needs to route it independently, and clients explicitly thread any state across calls themselves.
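One way to picture "the client threads state itself" is an opaque cursor that the client sends back on the next call. The sketch below borrows the cursor/nextCursor naming from MCP's existing pagination convention, but everything else (ITEMS, list_page, the cursor encoding) is a hypothetical illustration rather than anything the new spec prescribes.

```python
import base64
import json
from typing import Optional

ITEMS = [f"tool-{i}" for i in range(5)]  # hypothetical data held by every instance


def encode_cursor(offset: int) -> str:
    """Pack the position into an opaque token the client hands back later."""
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["offset"]


def list_page(cursor: Optional[str] = None, page_size: int = 2) -> dict:
    """Stateless handler: everything it needs arrives in the request itself."""
    offset = decode_cursor(cursor) if cursor else 0
    result: dict = {"items": ITEMS[offset:offset + page_size]}
    next_offset = offset + page_size
    if next_offset < len(ITEMS):
        result["nextCursor"] = encode_cursor(next_offset)
    return result


first = list_page()
second = list_page(first["nextCursor"])  # could be served by any instance
print(first["items"], second["items"])
```

Because the position travels inside the token, no instance has to remember which client asked for which page.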

The headline change for ops folks: a stateless MCP server can sit behind a plain round-robin load balancer with no sticky sessions and no shared session store. According to the release notes, gateways can route traffic on the new Mcp-Method header instead of inspecting payloads, and clients can cache tools/list responses for as long as the server's ttlMs field permits. That last part matters because in production a tools/list call on a busy server can dominate the latency budget.
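A client-side cache for tools/list could look roughly like this. It is a minimal sketch that assumes the response is a dict carrying a tools list and a ttlMs field in milliseconds. ToolsListCache and the fetch callback are hypothetical names, and the exact response shape should be checked against the final spec.

```python
import time
from typing import Callable


class ToolsListCache:
    """Caches tools/list results per server URL until ttlMs elapses."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # server URL -> (expiry time in seconds, cached tools)
        self._entries: dict[str, tuple[float, list[dict]]] = {}

    def get(self, server_url: str, fetch: Callable[[], dict]) -> list[dict]:
        now = self._clock()
        cached = self._entries.get(server_url)
        if cached is not None and now < cached[0]:
            return cached[1]  # still fresh, skip the network call
        response = fetch()  # hypothetical: performs the real tools/list request
        ttl_seconds = response.get("ttlMs", 0) / 1000  # missing ttlMs means no caching
        tools = response["tools"]
        self._entries[server_url] = (now + ttl_seconds, tools)
        return tools
```

The cache is keyed by server URL rather than by session, and the clock is injectable so expiry can be tested without sleeping.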

Here is the shape of the change in terms of request flow.

Before (2025-11-25):
client -> initialize -> server -> 200 + Mcp-Session-Id: abc
client -> tools/list (Mcp-Session-Id: abc) -> server (must own session abc)
client -> tools/call (Mcp-Session-Id: abc) -> same server instance
 
After (2026-07-28):
client -> tools/list -> load balancer -> server A (returns list + ttlMs)
client -> tools/call -> load balancer -> server B (different instance, same result)

You no longer need the same physical instance to handle both calls. That is the entire ballgame. In practice it means three things for your infrastructure: you can drop sticky session config from your load balancer, you can remove any shared Redis-based session store, and you can strip Mcp-Session-Id out of your gateway routing rules.

The new Mcp-Method header is the small detail that pays off for gateway authors. Instead of parsing the JSON-RPC body to know whether a request is tools/list or tools/call, the gateway can read a single header and route by method. That lets you split tools/list (cacheable, idempotent) onto edge nodes and tools/call (mutating, sometimes expensive) onto warm origin nodes.
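As a rough illustration of header-based routing, the function below picks an upstream pool from the Mcp-Method header alone. The pool names and the choose_pool function are hypothetical, and a real gateway would express this in its own config language rather than in Python.

```python
EDGE_POOL = "edge"      # hypothetical pool for cacheable, idempotent reads
ORIGIN_POOL = "origin"  # hypothetical pool for mutating or expensive calls

CACHEABLE_METHODS = {"tools/list"}


def choose_pool(headers: dict[str, str]) -> str:
    """Pick an upstream pool from the Mcp-Method header without parsing the body."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    method = normalised.get("mcp-method", "")
    # Anything unknown or missing goes to the origin, the safe default.
    return EDGE_POOL if method in CACHEABLE_METHODS else ORIGIN_POOL


print(choose_pool({"Mcp-Method": "tools/list"}))
print(choose_pool({"mcp-method": "tools/call"}))
```

Defaulting unknown methods to the origin pool keeps a missing or misspelled header from being served out of an edge cache.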

What do Extensions, Tasks, and MCP Apps add?

The 2026-07-28 spec formalizes Extensions as the way new capabilities ship outside the core protocol. The MCP team calls out Extensions as a deliberate move to keep the core small while still letting the ecosystem experiment. New capabilities propose themselves as Extensions first, prove production value, and only graduate into the base spec on the next date-stamped revision.

Two extensions matter most for new servers right now. The Tasks extension is built for long-running operations that need retries, expiry, and lifecycle tracking. If you have ever tried to wrap a 3-minute search index build inside a single MCP tool call and watched the client time out, Tasks fix that. The roadmap explicitly mentions "Tasks lifecycle refinement (retries, expiry policies)" as a priority for 2026.

The Tasks model looks roughly like this. Your tool call returns a task handle instead of a result. The client polls or subscribes to the task by id. The server keeps the task's state in whatever store you like (Redis, Postgres, S3), and the client decides how long it is willing to wait. The client and server never need to be in the same process for this to work. A different server instance can resume a task from its persistent state because the task id is the carrier, not a session.
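The handle-then-poll flow can be sketched with an in-memory stand-in for the shared store. The status names, TaskRecord, and TaskStore are illustrative assumptions, not the field names from the Tasks extension, and a real deployment would back this with Redis, Postgres, or similar so any instance can answer a poll.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TaskRecord:
    status: str        # "working", "completed", or "expired" (illustrative names)
    expires_at: float  # absolute time after which an unfinished task is abandoned
    result: Any = None


class TaskStore:
    """Stand-in for a shared store such as Redis or Postgres."""

    def __init__(self) -> None:
        self._tasks: dict[str, TaskRecord] = {}

    def create(self, ttl_seconds: float, now: Optional[float] = None) -> str:
        """Called by the tool handler: returns a task handle instead of a result."""
        now = time.time() if now is None else now
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = TaskRecord("working", now + ttl_seconds)
        return task_id

    def complete(self, task_id: str, result: Any) -> None:
        """Called by whichever worker finishes the job."""
        record = self._tasks[task_id]
        record.status, record.result = "completed", result

    def poll(self, task_id: str, now: Optional[float] = None) -> TaskRecord:
        """Called on behalf of the client; only the task id is needed."""
        now = time.time() if now is None else now
        record = self._tasks[task_id]
        if record.status == "working" and now >= record.expires_at:
            record.status = "expired"
        return record
```

Nothing here depends on a session: the task id alone is enough to look up state, which is why a different instance can pick the task up.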

MCP Apps are the second new extension. They let a server return server-rendered UI components instead of plain text or JSON payloads. The idea is that an MCP server for, say, a payments API can return a real card UI that the host renders inline. This is the closest MCP has come to giving servers presentation control, and it pulls the protocol closer to where Anthropic's Claude Skills and Claude Apps already live.

Both Extensions and Tasks shipped first as experimental features so the team could collect real-world feedback before locking them into a final spec. Treat them as stable enough to build against during the RC window, but expect minor field renames before July 28.

What changes in authorization and error handling?

The new spec aligns authorization more closely with OAuth and OpenID Connect. The earlier spec left auth largely as an implementation detail, which meant every enterprise MCP deployment built its own bearer-token flow. The 2026-07-28 spec defines how OAuth flows interact with MCP transport so a single enterprise SSO setup covers all your MCP servers.

The roadmap lists "Governance Maturation" and "Enterprise Readiness" as priority areas, with audit trails, SSO auth, and gateway behavior as concrete deliverables. If you have been blocked on MCP rollout because security teams could not approve a custom auth flow, this is the change that unblocks you.

Error handling also tightens up. The error code for missing resources shifts from the MCP-specific -32002 to the JSON-RPC standard -32602 (Invalid params). It looks like a small change. In practice, any client that hard-coded -32002 as its not-found check no longer recognizes the response, treats it as a generic failure, and can retry forever on a legitimate not-found. I have already seen this fail in a private SDK that hard-coded the old code. Catch both for the next year of mixed deployments.

A nice secondary effect is that monitoring tools that already understand JSON-RPC standard codes (and there are a lot of them) suddenly understand MCP traffic for free. You lose a small amount of MCP-specific signal in exchange for free integration with every JSON-RPC tracing tool out there. That is a good trade.
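Catching both codes during the mixed-deployment period might look like this. It is a minimal sketch: should_retry and the constant names are hypothetical, and the error argument is assumed to be the error object from a JSON-RPC response.

```python
LEGACY_NOT_FOUND = -32002    # MCP-specific code returned by 2025-11-25 servers
STANDARD_NOT_FOUND = -32602  # JSON-RPC "Invalid params", used by 2026-07-28 servers
NOT_FOUND_CODES = {LEGACY_NOT_FOUND, STANDARD_NOT_FOUND}


def should_retry(error: dict, attempt: int, max_attempts: int = 3) -> bool:
    """Decide whether a failed JSON-RPC call is worth retrying."""
    if error.get("code") in NOT_FOUND_CODES:
        return False  # the resource is missing; retrying cannot help
    # Bound every other retry so an unrecognised code can never loop forever.
    return attempt < max_attempts


print(should_retry({"code": -32002}, attempt=1))
print(should_retry({"code": -32602}, attempt=1))
```

Since -32602 also covers other invalid-parameter errors, treating it as non-retryable is safe in general: resending the same bad parameters will not produce a different answer. The max_attempts bound is the second line of defense if a server returns a code the client has never seen.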

Which existing code breaks when you upgrade?

The breaking surface is small in lines of code but wide in impact. Here is what I am tracking across my own servers.

Anywhere your code asserts on Mcp-Session-Id will break. The header is no longer guaranteed, and any session-based routing logic needs to come out of your gateway and your client. If you are running NGINX or an Envoy sidecar with rules on Mcp-Session-Id, those rules now do nothing. Worse, they may silently route all traffic to one origin because the default fallthrough rule kicks in.

Anywhere your client uses session state implicitly via the handshake will break. The new spec asks you to thread identifiers explicitly between tool calls. If you call tools/list and store the result keyed by session, you need to switch to keying by server URL and expiring entries according to ttlMs.

Anywhere your retry logic catches -32002 will silently misbehave. Change it to -32602 or, better, catch both for the next year of mixed deployments while clients catch up.

Anywhere your gateway parses the request body to route requests changes too, but in a good way. You can rip that logic out and rely on the new Mcp-Method header for routing decisions. Code you delete is code that cannot break in production.

If you are using a Tier 1 SDK (the Anthropic-maintained Python and TypeScript SDKs), the SDK will hide most of these changes from you within the ten-week validation window. If you are on a community SDK, check whether your maintainer is in the Tier 1 list before you plan a migration date. The SDK tier system is itself new to MCP and tells you which SDKs the spec maintainers actively coordinate with on breaking changes.

Should you migrate before July 28 or after?

It depends on whether you control both the client and the server. If you do, migrate as soon as the SDK you depend on ships RC support, which for Tier 1 should land within four to six weeks. The new protocol is friendlier to the production patterns you already want anyway.

If you ship a public MCP server consumed by clients you do not control, hold until July 28 and ship support for both spec versions on the same endpoint for at least a quarter. The spec's deprecation policy and the SDK tier system make dual support cheaper than it sounds, since both protocols can share the same handler code with a thin compatibility layer that reads the request and returns either a session-id (for old clients) or no session (for new ones).
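A thin compatibility layer along those lines might be shaped like the sketch below. It assumes that only legacy clients send initialize and carry a session ID. The handle function, the response dict shape, and list_tools are all hypothetical, and the 404 for unknown sessions mirrors how the 2025-11-25 Streamable HTTP transport treats terminated sessions.

```python
import uuid
from typing import Optional


def list_tools() -> list[dict]:
    """Shared handler used by both protocol versions (hypothetical tool list)."""
    return [{"name": "search"}]


def handle(method: str, session_id: Optional[str], sessions: set[str]) -> dict:
    """Serve legacy session-based clients and new stateless clients from one endpoint."""
    if method == "initialize":
        # Only legacy clients send initialize, so only they get a session id.
        new_id = uuid.uuid4().hex
        sessions.add(new_id)  # in production this set still needs shared storage
        return {"status": 200, "session_id": new_id, "body": {}}
    if session_id is not None and session_id not in sessions:
        # Legacy client presenting an unknown or expired session.
        return {"status": 404, "session_id": None, "body": {}}
    if method == "tools/list":
        # Same handler for both versions; new clients simply have no session id.
        return {"status": 200, "session_id": session_id, "body": {"tools": list_tools()}}
    return {"status": 400, "session_id": session_id, "body": {}}
```

The legacy branch is the only part that touches session storage, so it can be deleted wholesale once old clients have aged out.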

The one case where you should rush is if you are running MCP in a Kubernetes deployment behind an L7 load balancer. The current setup probably has at least one band-aid (sticky sessions, a Redis session store, session-aware gateway routing) that exists only because of the stateful protocol. Migrating lets you delete it.

There is also a clear case where you should wait. If your MCP server depends on a community SDK that is not in the Tier 1 list, you do not have a guarantee that the SDK will ship 2026-07-28 support before July 28 itself. Plan a quarter of slack. Do not promise a migration date until the SDK author publishes their own.

What does this mean for your MCP servers?

The 2026-07-28 spec is a clean win for anyone running MCP in production. The stateless core removes the awkward fit between MCP and modern serverless or horizontally scaled deployments, and the Tasks and MCP Apps extensions plug gaps that every real deployment has been working around with custom code.

The bigger lesson, though, is that MCP is starting to act like an open protocol with real production users, not a research experiment from Anthropic. Date-stamped versioned specs, an SEP process, formal SDK tiers, and a published roadmap with priority areas all push it in the direction of HTTP or LSP, not a vendor SDK. That is a healthy sign even if the immediate migration is annoying.

If you are starting a new MCP server this week, target the 2026-07-28 RC directly. If you have an existing one, start with deleting your Mcp-Session-Id routing rules and your shared session store. Most teams will find that the migration shrinks their MCP stack rather than grows it, which is the right direction for a protocol that is supposed to be the lowest-friction way to give an LLM access to tools.

For more on the new spec, see the official 2026-07-28 Release Candidate announcement, the 2026 MCP Roadmap, and the current 2025-11-25 specification for the baseline you are migrating from.

Frequently Asked Questions

What is the MCP 2026-07-28 spec release candidate?

The MCP 2026-07-28 spec is the next major version of the Model Context Protocol, locked as a release candidate on May 21, 2026 and due to be finalized on July 28, 2026. It introduces a stateless protocol core, an Extensions framework, the Tasks extension for long-running work, MCP Apps for server-rendered UIs, and OAuth-aligned authorization.

Is this the same thing as MCP 2.0?

No. MCP uses date-stamped spec versions instead of semver, so there is no official "MCP 2.0" label. The new spec is identified as the 2026-07-28 specification version. The MCP team describes it as the largest revision of the protocol since launch, but the version string stays in the date format.

Why is MCP going stateless?

Stateful MCP sessions fight horizontal scaling. The old 2025-11-25 spec required every connection to carry an Mcp-Session-Id header so requests routed back to the same instance, which forced sticky sessions, shared Redis state, or session-aware routing at the gateway. The 2026-07-28 stateless core removes the need for all three workarounds.

What breaks if you upgrade an existing MCP server?

Anything that depends on the initialize handshake or the Mcp-Session-Id header. Routing rules in NGINX, client-side session keys, and retry logic that catches the old -32002 error code (now the JSON-RPC standard -32602) all need updates. Tier 1 SDKs from Anthropic should absorb most of these changes within the ten-week validation window.

Rabinarayan Patra

SDE II at Amazon. Previously at ThoughtClan Technologies building systems that processed 700M+ daily transactions. I write about Java, Spring Boot, microservices, and the things I figure out along the way.
