April 14, 2026
· 12 min read

MCP Over gRPC: Why Google Is Rewiring the Agent Transport Layer
MCP shipped with JSON-RPC over HTTP. That works beautifully for demos and small integrations — but it crashes into enterprises that run gRPC everywhere. Google is contributing a gRPC transport to MCP, with pluggable transports landing in the SDK. Here's what actually changes, what doesn't, and when the switch is worth making.

Enterprise MCP deployments are hitting a wall. Not the protocol — the transport. When your backend is a forest of gRPC microservices and MCP walks in speaking JSON over HTTP, something has to bend. On January 14, 2026, Google announced what's bending: a gRPC transport package for MCP, landing alongside agreed-on pluggable transport support in the MCP SDK.
TL;DR
- MCP's default transport is JSON-RPC over HTTP (plus SSE for server → client push). Human-readable, easy to debug, works everywhere — and a misfit for enterprises standardized on gRPC.
- Google is contributing a gRPC transport package to the MCP SDK, built on pluggable transport interfaces that MCP maintainers have agreed to support.
- The MCP protocol itself doesn't change. Tools, resources, prompts, sampling — same semantics. Only the layer underneath swaps: protobuf bytes instead of JSON text, HTTP/2 bidirectional streams instead of HTTP + SSE.
- This is mostly about fit, not raw speed. If you're already running gRPC, MCP-over-gRPC means existing service mesh, mTLS, OpenTelemetry traces, and IAM work for agent traffic too.
- It's not a magic bullet. For low-frequency agent sessions, the wire-format win is negligible — LLM inference dominates. Where it matters: high-QPS workflows, real-time resource watching, large production deployments.
- gRPC gives structure, not meaning. MCP's description layer remains the only place an agent learns what a tool does and when to use it.
Why the transport became the bottleneck
MCP has been a genuine win for agent infrastructure. One standard, one schema for describing tools, one way for agents to discover capabilities. Before MCP, every integration was a bespoke connector. Now services expose an MCP server and any compliant agent can use them.
But MCP was designed transport-first for a specific audience: AI developers who wanted to ship fast. JSON-RPC over HTTP was the obvious choice — text-based, browser-debuggable, works with curl, and fits naturally into the LLM-native payload shape where descriptions and parameters are natural language.
That choice starts paying a penalty the moment MCP leaves the demo environment and lands in an enterprise backend. Per Google's announcement, organizations with existing gRPC tooling currently have to bridge that tooling to JSON-RPC — typically by deploying transcoding gateways that sit between MCP JSON requests and the gRPC services underneath.
A transcoding gateway is a translation layer you didn't want, running on servers you have to pay for, introducing a failure point you can't debug without two sets of tools. None of the alternatives are better: rewrite your gRPC services to also speak JSON-RPC, or maintain two parallel implementations forever.
This wasn't a theoretical complaint. A July 2025 GitHub issue on the MCP spec repo (#966) asked for gRPC to be adopted as a standard transport, calling out high JSON serialization overhead, inefficient long-polling for resource watches, and lack of type safety in the API contract. It picked up community traction, and earlier discussions going back to April 2025 had already made the same case.
Spotify, standardized on gRPC in its backend, went ahead and built experimental MCP-over-gRPC support internally rather than wait for the ecosystem to catch up. Per Stefan Särne, senior staff engineer and tech lead for developer experience at Spotify, the benefits showed up as developer familiarity, less work building MCP servers, and statically typed APIs.
Refresher: MCP is actually three layers
To see why swapping transport is clean (and why it wasn't obvious at first), separate MCP into its three layers:
Layer 1 — MCP semantics. What agents and tools actually say to each other. List your tools. Call search_customers with these parameters. Subscribe to this resource. This is the valuable part.
Layer 2 — Message format. How each message is structured on the wire. MCP uses JSON-RPC 2.0, which gives every message a version, id, method name, and params (for requests) or result/error (for responses).
Layer 3 — Transport. How those structured messages get from client to server and back. MCP's spec ships with HTTP (plus SSE for server → client push) and stdio for local processes.
Google's contribution swaps Layer 3 without touching Layer 1. That's the whole game.
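The seam between the layers is easiest to see in code. Here's a minimal sketch of what a pluggable Layer-3 interface could look like; the names (`Transport`, `send`) are hypothetical and the real SDK interface will differ, but the layering is the point: the call site speaks only Layer-1 semantics.

```python
import json
from abc import ABC, abstractmethod

# Hypothetical sketch of the Layer-3 seam; the real SDK's pluggable
# transport interface will have different names and a richer surface.
class Transport(ABC):
    @abstractmethod
    def send(self, method: str, params: dict) -> bytes:
        """Frame a Layer-1 request (method + params) and return wire bytes."""

class JsonRpcTransport(Transport):
    """The default: a JSON-RPC 2.0 envelope (Layer 2) as JSON text (Layer 3)."""
    def __init__(self) -> None:
        self._next_id = 0

    def send(self, method: str, params: dict) -> bytes:
        self._next_id += 1
        envelope = {"jsonrpc": "2.0", "id": self._next_id,
                    "method": method, "params": params}
        return json.dumps(envelope).encode("utf-8")

# The Layer-1 call site never mentions JSON, HTTP, or protobuf -- a gRPC
# transport would slot in here without touching this line.
wire = JsonRpcTransport().send(
    "tools/call", {"name": "get_weather", "arguments": {"location": "Mumbai"}})
```

Swapping in a gRPC-backed `Transport` changes only what `send` does with the bytes; the `tools/call` semantics stay identical.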
How the current transport works (and where it breaks)
A standard MCP tool call over the default transport looks like this on the wire:
POST /mcp HTTP/1.1
Content-Type: application/json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "location": "Mumbai" }
  }
}

And the response:
HTTP/1.1 200 OK
Content-Type: application/json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [{ "type": "text", "text": "28°C, light rain." }]
  }
}

Clean, readable, obvious. You can curl it. You can log it. An intern can read it. For server-to-client push — subscribing to resource changes, streaming tool progress — the default transport layers Server-Sent Events (SSE) on top of the same HTTP connection. The connection stays open and the server pushes event lines down it.
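The SSE framing is simple enough to parse in a dozen lines, which is part of its appeal. A minimal sketch (the notification payloads below are illustrative, not verbatim MCP messages):

```python
def parse_sse(stream: str) -> list[str]:
    """Minimal parser for SSE framing: 'data:' lines accumulate into one
    event, and a blank line dispatches it."""
    events: list[str] = []
    data: list[str] = []
    for line in stream.splitlines():
        if line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            events.append("\n".join(data))
            data = []
    if data:  # stream ended mid-event
        events.append("\n".join(data))
    return events

# Two pushed events as they'd appear on the long-lived HTTP connection.
raw = (
    'data: {"method": "notifications/resources/updated"}\n'
    "\n"
    'data: {"method": "notifications/progress"}\n'
    "\n"
)
events = parse_sse(raw)
```

That simplicity is also the limitation: the channel only flows server to client, which is what the friction list below keeps running into.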
This works. It's also where the enterprise friction shows up:
- Text on the wire is bigger. Every field name ("jsonrpc", "method", "params", "arguments") ships as literal bytes on every request. For one call, nobody notices. For hundreds of agents making thousands of calls per second with open resource subscriptions, the serialization tax is real. Google reports that protobuf messages can be up to 10x smaller than the equivalent JSON.
- No schema enforcement at the transport layer. JSON-RPC validates shape — did you send an object with a method field? It does not validate that arguments.location is a string, required, and non-null. You find out at application-level validation, or at runtime in production.
- SSE is one-way. The server pushes events to the client. For truly bidirectional streaming — client and server both sending continuous data on a single connection — you need application-level connection synchronization or a second channel.
- It doesn't match what the rest of the backend speaks. This is the big one. If your service mesh routes gRPC, your observability understands gRPC metadata, and your retry, deadline, and circuit-breaker policies are gRPC-native, then every MCP endpoint is an outlier. Two protocol stacks, two sets of tools, two mental models — for the same underlying service.
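The serialization tax is easy to measure for yourself. Minify the exact request from the example above and count how many of its bytes are field names rather than payload:

```python
import json

# The tools/call request from the example above, minified as a client sends it.
request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"location": "Mumbai"}},
}
wire = json.dumps(request, separators=(",", ":")).encode("utf-8")

# Bytes spent on field names alone (each key also ships its two quote chars).
keys = ["jsonrpc", "id", "method", "params", "name", "arguments", "location"]
key_bytes = sum(len(k) + 2 for k in keys)

overhead = key_bytes / len(wire)  # roughly half the message is key names
```

On this tiny call, about half the wire bytes are envelope and key names. Protobuf replaces the key names with small field tags, which is where the size claims come from; the exact ratio for your workload depends on payload shape.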
What gRPC transport actually changes
The pluggable transport swap replaces Layer 3 entirely while leaving Layer 1 semantics intact. Here's what shifts:
| Dimension | JSON-RPC over HTTP + SSE | gRPC Transport |
|---|---|---|
| Wire format | JSON (text) | Protocol Buffers (binary) |
| Transport protocol | HTTP/1.1 | HTTP/2 |
| Streaming | One-way via SSE | Native full-duplex bidirectional |
| Flow control | Application-level | Built-in backpressure |
| Schema enforcement | Application-layer validation | Strict typing at serialization |
| Error semantics | JSON-RPC error codes | Standard gRPC codes (UNAVAILABLE, PERMISSION_DENIED, …) |
| Auth | Bearer tokens over HTTP | mTLS, JWT/OAuth, method-level authz |
| Observability | Custom instrumentation | Native OpenTelemetry integration |
| Debuggability | curl, browser dev tools | grpcurl, BloomRPC — tooling required |
Two things stand out beyond the wire-format improvements:
HTTP/2 full-duplex streaming is native. A single persistent connection carries bidirectional traffic without the SSE workaround. An agent can subscribe to a resource, the server can push updates, the agent can send new queries on the same stream, and the server can send partial progress back — all concurrently, all over one socket. Google notes this opens the door to truly interactive, real-time agentic workflows without application-level connection synchronization.
Method-level authorization is baked in. You can write a policy that says "this agent identity is allowed to call ReadFile but not DeleteFile" and enforce it at the gRPC layer, with the authorization decision happening before the MCP handler ever runs. That matters enormously for agentic systems, which have a tendency to exhibit what Google politely calls excessive agency — doing more than you asked.
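The authorization decision itself is a small, boring function, which is exactly why it belongs below the MCP handler. A sketch of the policy check a gRPC server interceptor would apply before dispatch (the identities, service path, and method names here are invented for illustration; in production the table would come from your IAM system):

```python
# Hypothetical policy table: agent identity -> allowed full method names.
# The /mcp.v1.Mcp/... paths are made up for this sketch.
POLICY = {
    "ci-docs-agent": {"/mcp.v1.Mcp/ListTools", "/mcp.v1.Mcp/ReadFile"},
    "release-agent": {"/mcp.v1.Mcp/ListTools", "/mcp.v1.Mcp/ReadFile",
                      "/mcp.v1.Mcp/DeleteFile"},
}

def authorize(agent_identity: str, full_method: str) -> bool:
    """Runs before the MCP handler ever sees the request. Unknown identities
    and unlisted methods are denied by default: least privilege."""
    return full_method in POLICY.get(agent_identity, set())
```

The deny-by-default shape is the safeguard against excessive agency: an agent that decides to "clean up" files simply cannot reach `DeleteFile` unless its identity was explicitly granted it.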
The sequence-diagram view
Breaking it down:
- SSE pattern: request-response for tool calls, a separate long-lived GET for server events. Two channels, one-way push.
- gRPC pattern: one bidirectional HTTP/2 stream carries tool calls, subscriptions, results, and pushes in parallel. Flow control is native. Deadlines and cancellations propagate across the stream.
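What "one stream, many conversations" means concretely: every frame carries the id of the logical call it belongs to, so either side can demultiplex interleaved traffic. A schematic sketch (the frame kinds and fields are illustrative, not the actual gRPC transport's protobuf schema):

```python
# Frames as they might interleave on a single bidirectional stream:
# a tool call (id 1) and a resource subscription (id 2) proceed in parallel.
frames = [
    {"id": 1, "kind": "request",   "method": "tools/call"},
    {"id": 2, "kind": "subscribe", "uri": "metrics://qps"},
    {"id": 2, "kind": "update",    "value": 1200},  # server push
    {"id": 1, "kind": "progress",  "pct": 50},      # partial result
    {"id": 2, "kind": "update",    "value": 1350},
    {"id": 1, "kind": "result",    "content": "done"},
]

def demux(frames: list[dict]) -> dict[int, list[dict]]:
    """Group interleaved frames back into per-call conversations."""
    by_call: dict[int, list[dict]] = {}
    for frame in frames:
        by_call.setdefault(frame["id"], []).append(frame)
    return by_call

conversations = demux(frames)
```

Under the SSE pattern, the id-2 pushes would ride a second channel and you'd write this bookkeeping yourself; HTTP/2 streams give you the multiplexing, flow control, and cancellation for free.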
Performance reality check
Here's where I'll be honest with you: the performance framing gets oversold.
If your use case is a user chatting with an agent that makes five tool calls per conversation, the difference between JSON-RPC and protobuf on the wire is not going to make your app feel faster. LLM inference dominates end-to-end latency by at least one — often two — orders of magnitude. You can swap protocols all day; the model is still the bottleneck.
Where the gRPC transport actually pays off:
- Sustained, high-QPS agent workloads. Hundreds of agents issuing thousands of tool calls per second. Now serialization overhead multiplies across every request, and binary encoding compounds into real resource savings.
- Real-time resource watching. Agents subscribed to streaming data feeds where HTTP/2 flow control and backpressure prevent the agent from getting buried in updates it can't process.
- Multi-hop agent workflows. A single user prompt fans out into a tree of tool calls across microservices, and you genuinely need distributed traces spanning the whole call graph.
- Environments already running gRPC. The real reason. Wire-format savings are a bonus; operational consistency is the main prize.
| Scenario | JSON-RPC/HTTP | gRPC Transport | Meaningful Win? |
|---|---|---|---|
| Prototype, local dev | ✅ | ✅ | No |
| Chat agent, ~5 calls/session | ✅ | ✅ | Marginal |
| Existing gRPC backend | ⚠️ transcoding tax | ✅ native fit | Yes |
| High-QPS agentic workflows | ⚠️ overhead compounds | ✅ | Yes |
| Real-time resource streams | ⚠️ SSE workarounds | ✅ bidirectional | Yes |
| Public, third-party MCP servers | ✅ | ⚠️ tooling friction | No |
The semantic layer isn't going anywhere
Here's the trap someone is going to walk into: "We already expose our services over gRPC. Agents can just call them directly now, right?"
No. And this is the subtle point that catches people out.
gRPC gives you structure: method names, parameter types, return types, standard error codes. That's what machines and type checkers need. What LLMs need is semantics: what does this method actually do? When should I use it? What does this field mean? What's the relationship between this tool and that one?
MCP was designed to carry that semantic layer — tool descriptions, usage hints, parameter semantics, resource annotations. A raw gRPC service has none of it. You can expose the service, but an agent will not know when to call it or what the outputs mean.
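Put the two views side by side and the gap is obvious. A sketch for one tool (the gRPC view is schematic, not generated protobuf code; the MCP view follows the shape of a `tools/list` result, with illustrative description text):

```python
# What gRPC gives you: a typed signature. Everything a compiler needs,
# nothing an LLM can plan with. (Schematic, not generated stub code.)
grpc_view = {
    "method": "GetWeather",
    "request": {"location": "string"},
    "response": {"report": "string"},
}

# The MCP description layer for the same tool: shape follows a tools/list
# entry, the prose is what the agent actually reasons over.
mcp_view = {
    "name": "get_weather",
    "description": ("Current weather for a single city. Use when the user "
                    "asks about present conditions, not forecasts."),
    "inputSchema": {
        "type": "object",
        "properties": {
            "location": {"type": "string",
                         "description": "City name, e.g. 'Mumbai'"},
        },
        "required": ["location"],
    },
}
```

Everything in `mcp_view` that is absent from `grpc_view` — the when, the why, the meaning of each field — is the part no transport swap can supply.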
💡 The gRPC transport swaps how MCP messages travel. It does not eliminate the MCP description layer. If anything, it makes that layer more important — it is now the only place the agent learns what a tool means.
The strategic subtext
This isn't just a transport patch. Look at the timing:
- December 10, 2025: Google announces fully managed remote MCP servers for BigQuery, Maps, Compute Engine, and Kubernetes Engine. Enterprise-grade endpoints, IAM-integrated, Model Armor layered on top.
- December 2025: Anthropic donates the MCP spec to the Linux Foundation under the new Agentic AI Foundation (AAIF), making the protocol officially vendor-neutral. AWS and Microsoft are Platinum members.
- January 14, 2026: Google announces the gRPC transport contribution and MCP maintainers commit to pluggable transports.
- February 2026: Managed MCP servers for Google Cloud databases — AlloyDB, Cloud SQL, Spanner — go live.
Read as a sequence, Google is assembling a pipeline where enterprises running gRPC backends can slot MCP directly into their existing infrastructure and point their agents at Google Cloud's managed MCP endpoints with minimum protocol friction. It's a wedge against AWS Bedrock AgentCore and Azure's MCP integration paths.
It's also a milestone for MCP itself. Going from "JSON-RPC over HTTP" to "pluggable transport interfaces, with JSON-RPC as one option and gRPC as another" is a meaningful architectural maturation. The protocol is shedding its transport opinions.
Production checklist
If you're seriously evaluating MCP-over-gRPC:
- Check the SDK state. The gRPC transport is in active development against the Python MCP SDK via an open pull request. Pluggable transports have been agreed on; the package itself is being contributed. For anything customer-facing, wait for the official release.
- Don't throw out JSON-RPC. It remains the default and will continue to be. Most public third-party MCP servers and clients ship JSON-RPC first. If you're building an MCP server for external consumption, JSON-RPC is still the path of least friction.
- Match the transport to the audience. Internal services, gRPC-first infra, high-QPS: gRPC. Public-facing, third-party integration, low-volume: JSON-RPC.
- Plan for both. A well-designed MCP server shouldn't hard-code its transport. Pluggable transports mean you should be able to expose the same semantic server over both — and you probably will, sooner than you think.
- Don't skip the description layer. Repeating this: typed schemas are not descriptions. Write good tool descriptions, annotate parameters, give agents the context they need.
- Lean on method-level authorization. Per-method gRPC authz on MCP tools is one of the biggest wins for production agent safety. Use it to enforce least-privilege from day one.
Conclusion
I've been following the MCP transport conversation for months, and this feels like the point where the protocol starts to grow up. Not because gRPC is inherently better than JSON-RPC — it isn't, for every use case — but because MCP is finally admitting that different deployments have different infrastructure constraints, and the transport should flex to fit them instead of the other way around.
For most teams today, the honest answer is: stay on JSON-RPC over HTTP. It's fine. It works. If you're in a gRPC-native backend and you've been quietly dreading the transcoding gateway project, this is your out. And if you're architecting a new MCP server for production, start with transport-agnostic abstractions so you're not rewriting the moment your deployment environment changes.
The agentic AI infrastructure stack is maturing fast. Pluggable transports are one of those moves that look small from the outside and change what's possible on the inside. Worth tracking the SDK PR.
FAQ
Is gRPC replacing JSON-RPC as the default MCP transport?
No. JSON-RPC over HTTP stays as the default transport in MCP. gRPC is being added as a first-class alternative through the pluggable transport interfaces that MCP maintainers have agreed to support. You pick the transport that fits your infrastructure.
When is MCP-over-gRPC actually worth the switch?
When your backend is already standardized on gRPC, so MCP traffic can use your existing service mesh, tracing, auth, and tooling. Also for high-QPS agentic workloads with sustained streaming. For low-volume chat agents, the wire-format savings are marginal because LLM inference dominates end-to-end latency.
Do I need to rewrite my MCP server to support gRPC?
Not if the SDK's pluggable transport abstraction is used correctly. The MCP protocol semantics — tools, resources, prompts — are transport-agnostic. Once the gRPC transport package ships, the same server should be exposable over both JSON-RPC and gRPC by swapping the transport layer.
Does gRPC solve the problem of making tools discoverable to LLMs?
No, and this catches people out. gRPC gives structure: method names, parameter types, error codes. LLMs need semantic descriptions — what a tool does, when to call it, what parameters mean. That's the MCP description layer, which sits on top of whatever transport you pick. You still need it.
How much faster is gRPC than JSON-RPC for MCP traffic?
Google cites binary Protocol Buffer messages as up to 10x smaller than equivalent JSON, with lower serialization overhead and reduced bandwidth. Real-world end-to-end gains depend on workload. For LLM-bound agents making a handful of tool calls, the difference is imperceptible. For high-QPS agent traffic with streaming subscriptions, it compounds.
Is the gRPC transport available today?
It is in active development as of early 2026. MCP maintainers have committed to pluggable transports, and Google Cloud is contributing the gRPC transport package. Initial work is against the Python MCP SDK. Production use should wait for the official release.
What about HTTP/2 bidirectional streaming — does that replace Server-Sent Events?
Yes. gRPC's native full-duplex bidirectional streaming over HTTP/2 means agents and tools can exchange data on a single persistent connection without SSE workarounds. For resource watching and real-time agent-tool interaction, this is a meaningful step up from HTTP + SSE.