April 26, 2026 · 18 min read

WebSockets Are an HTTP Request That Stops Being HTTP
WebSockets disguise themselves as HTTP, then drop the disguise the moment they're past the firewall. That trick is the entire reason real-time works on the web — and the entire reason WebSockets break your load balancer, your thread pool, and your deploys. This post walks through the four promises of HTTP, how SSE bends one of them, how WebSockets break three, and what production teams have to rebuild because of it.

TL;DR
- WebSockets are a smuggling operation. They look like an HTTP GET to every firewall, proxy, and load balancer on the path — and stop being HTTP the moment the server returns `101 Switching Protocols`.
- HTTP makes four promises to its infrastructure: stateless, short-lived, client-initiated, and infrastructure-friendly. Real-time features need to break some of them.
- SSE bends one promise (short-lived) and gets server push, automatic reconnection, and `Last-Event-ID` resume — for free, with no special server.
- WebSockets break three promises at once (short-lived, stateless, client-initiated) and force you to rebuild concurrency, load balancing, NAT keepalives, and reconnection yourself.
- Pick by direction: one-way → SSE. Genuinely bidirectional → WebSockets. One-shot → plain HTTP.
Why WebSockets feel weird
Most protocols are clean abstractions. WebSockets aren't — and the reason isn't bad design, it's that the web they had to ship into wasn't designed for them.
HTTP works on a single rule. Client asks. Server answers. Exchange ends. That rule comes with four promises baked into every piece of infrastructure between your browser and your origin:
- Stateless — the server remembers nothing between requests.
- Short-lived — connections exist just long enough to answer, then close.
- Client-initiated — the server never speaks first.
- Infrastructure-friendly — every CDN, proxy, firewall, and load balancer is built around the first three.
These promises are why HTTP is cheap. You can put 50 servers behind a load balancer and it doesn't matter which one gets a request. You can cache responses at the edge. A request can pass through a dozen middleboxes and none of them need to understand your application — they only need to understand HTTP, which is the same everywhere.
But notifications, live prices, chat, and multiplayer games need the server to say something the client didn't ask for. Promise three gets in the way. The naive workaround — polling — keeps all four promises but pays a full round trip every two seconds for updates that mostly don't exist.
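For scale, here is a minimal polling sketch, assuming a hypothetical `/api/updates` endpoint that usually returns an empty array:

```js
// Naive polling: one full HTTP round trip every 2 seconds,
// whether or not anything actually changed.
setInterval(async () => {
  try {
    const res = await fetch("/api/updates");
    const updates = await res.json();
    if (updates.length > 0) console.log("new:", updates);
  } catch {
    // network blip; the next tick retries anyway
  }
}, 2000);
```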
Two protocols were designed to fix this. One bent HTTP's rules. The other broke them.
Server-Sent Events: bend one promise, keep three
SSE is built on a single observation: of the four promises, you can bend short-lived without breaking any of the others. Do that, and you get real-time server push almost for free.
The client makes a normal HTTP GET. The server accepts it, but instead of responding and closing, it keeps the connection open and writes events as they happen. The response body just never ends.
```js
// SSE server (Node, no framework)
import http from "node:http";

http.createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
  let id = 0;
  const tick = setInterval(() => {
    res.write(`id: ${++id}\n`);
    res.write(`event: price\n`);
    res.write(`data: ${JSON.stringify({ btc: Math.random() * 100000 })}\n\n`);
  }, 1000);
  req.on("close", () => clearInterval(tick));
}).listen(3000);
```
That's the entire protocol. No upgrade step, no new connection type. From the infrastructure's perspective it's a regular HTTP response that happens to be very, very long. The other three promises stay intact — the client still initiated, the connection still rides HTTP through every proxy that speaks HTTP, and the server can be any backend in the fleet.
The client side is built into the browser:
```js
// SSE client — no library, browser-native
const events = new EventSource("/prices");
events.addEventListener("price", (e) => {
  const { btc } = JSON.parse(e.data);
  console.log("BTC:", btc);
});
// When the connection drops, the browser automatically reconnects
// and sends Last-Event-ID so the server can resume.
```
Reconnection is built in. When the connection drops, the browser reconnects on its own and sends `Last-Event-ID` so the server can pick up exactly where it left off — that's part of the spec, not your code. Server-Sent Events have been supported in Firefox 6+, Safari 5+, and Chrome 6+ since 2010, but Internet Explorer never supported them and Edge only joined at version 79 — which is why the industry quietly skipped SSE through the 2010s, when IE compatibility still mattered.
💡 Tip: Most "real-time" features are one-directional — notifications, live dashboards, log streaming, AI token streaming, progress updates. That's SSE's entire sweet spot.
There's one practical limitation: the browser's `EventSource` API doesn't support custom request headers, so you can't attach an `Authorization: Bearer ...` token the usual way. You authenticate via cookies or a token in the URL. If you genuinely need custom headers, the microsoft/fetch-event-source polyfill adds them back — at the cost of reimplementing the browser's retry logic yourself. This is the main reason teams bail on SSE and reach for WebSockets, but it's a constraint on auth flow, not on what real-time features you can build.
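In practice that means one of two patterns. A sketch, assuming a hypothetical `/stream` endpoint and a short-lived `token` you obtained earlier:

```js
// Option 1: cookie auth. The browser attaches cookies automatically;
// pass withCredentials for cross-origin streams.
const viaCookie = new EventSource("/stream", { withCredentials: true });

// Option 2: short-lived token in the URL. Works everywhere, but query
// strings tend to land in server logs, so keep the token short-lived.
const token = "..."; // assume this came from your auth endpoint
const viaToken = new EventSource(`/stream?token=${encodeURIComponent(token)}`);
```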
If one promise gently bent is enough for your feature, you're done. If both sides need to talk, you have to break promises — not bend them.
WebSockets: break three promises at once
For chat, multiplayer games, collaborative editors — anything where both sides send messages independently — three of the four HTTP promises have to go:
- Long-lived, not short-lived.
- Stateful, because the server has to remember you.
- Server-initiated, because the server has to speak first.
But we still need promise four. We still want the connection to travel through the actual infrastructure of the actual internet. Invent a new protocol on a new port and firewalls block it. Proxies get confused. Corporate networks reject it.
So WebSockets don't ask. They smuggle.
The protocol disguises itself as an HTTP request, passes through every middlebox that thinks it knows what HTTP looks like, and only after it's safely on the other side does it drop the disguise.
The handshake: how the smuggle works
Three steps. That's the whole trick.
Step 1 — the disguise
The client sends what looks, to anything inspecting it, like a perfectly ordinary HTTP GET. Any firewall that understands HTTP understands this. Any proxy forwards it without blinking.
```http
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
```
Tucked inside is a polite request: if you also speak WebSockets, I'd like to stop speaking HTTP after this. If the server doesn't know about WebSockets, it ignores the `Upgrade` header — the request is just a normal GET. Nothing breaks.
The `Sec-WebSocket-Key` is a random 16-byte nonce, base64-encoded. It's a challenge.
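Generating one is a one-liner. A sketch of what a client library does under the hood (Node shown; the browser's `WebSocket` does this internally):

```js
import crypto from "node:crypto";

// 16 random bytes, base64-encoded: a fresh challenge for every handshake.
const key = crypto.randomBytes(16).toString("base64");
// e.g. "dGhlIHNhbXBsZSBub25jZQ==" is what this looks like on the wire
```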
Step 2 — 101 Switching Protocols, plus a sanity check
The server's response, from the infrastructure's point of view, is also a perfectly normal HTTP response. Status `101 Switching Protocols` is part of the WebSocket opening handshake, and `Sec-WebSocket-Accept` is calculated from the value of `Sec-WebSocket-Key` in the corresponding request:

```http
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```
The `Sec-WebSocket-Accept` value is computed by appending a fixed string to the client's key, hashing with SHA-1, and base64-encoding the result:
```js
import crypto from "node:crypto";

// The "magic GUID" — defined verbatim in RFC 6455
const MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

function deriveAccept(clientKey) {
  return crypto
    .createHash("sha1")
    .update(clientKey + MAGIC)
    .digest("base64");
}

deriveAccept("dGhlIHNhbXBsZSBub25jZQ==");
// => "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```
Breaking it down:
- `clientKey` — the random base64 nonce sent by the browser.
- `MAGIC` — a static GUID hardcoded into RFC 6455. In the Ratchet WebSocket library and others, this constant is defined as `GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'`.
- `sha1(...)` then `base64(...)` — the result is what the server returns as `Sec-WebSocket-Accept`.
This isn't encryption. Both values are public. The GUID is written right in the spec. As security, it would be laughable. So what's it for?
Important: The magic GUID isn't security. It's a proof-of-understanding. It defends against two failures:
- A client sends `Upgrade: websocket` to some misconfigured server (or a dumb proxy that just echoes headers it doesn't recognize). Without the challenge, the client can't tell a real WebSocket server from a pretender that copies headers around.
- A caching layer — a CDN, an HTTP proxy — could grab a valid handshake response and serve it to a different client later. Each client generates a fresh random key, so only a live server talking to this client right now can compute the matching answer. Yesterday's cached response won't match today's key.
For a protocol designed to travel through infrastructure it doesn't trust, that's the minimum sanity check.
Step 3 — the disguise comes off
The moment the 101 finishes sending, both sides stop speaking HTTP. The TCP connection is still open. The firewalls and proxies along the way still see an open connection they already approved. But what flows over it is now an entirely different protocol — a compact binary framing format with text frames, binary frames, continuation frames, and three control frames: ping, pong, close.
None of it looks like HTTP. None of it would survive on its own. But because the connection began as HTTP, the bytes get where they need to go.
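To make "none of it looks like HTTP" concrete, here is a minimal sketch of the simplest possible frame: a server-to-client text frame. (The function name is mine; real frames also need extended lengths past 125 bytes and client-to-server masking, which this skips.)

```js
// Encode a short server→client text frame per RFC 6455.
// Assumes payload < 126 bytes; server frames are sent unmasked.
function encodeTextFrame(str) {
  const payload = Buffer.from(str, "utf8");
  const header = Buffer.from([
    0x81,           // 1000_0001: FIN=1 (final fragment), opcode=0x1 (text)
    payload.length, // MASK bit 0 + 7-bit payload length
  ]);
  return Buffer.concat([header, payload]);
}

encodeTextFrame("hi"); // => <Buffer 81 02 68 69> — four bytes, zero HTTP
```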
That's why WebSockets run over the standard HTTP ports, 80 and 443, and pass through HTTP proxies and intermediaries — the protocol is designed to operate inside the existing HTTP infrastructure. There is no "WebSocket port." There couldn't be. The whole point is to travel on trusted infrastructure.
The infrastructure catches on
Clever design, but every deception has a cost. Three of HTTP's promises are no longer being kept, and every piece of software between the browser and the server was built around those promises.
The web server's concurrency model
A traditional HTTP server has a pool of worker threads. Request comes in, thread handles it, response sent, thread returns to the pool. Each thread is busy for milliseconds. That's the whole reason small thread pools serve huge traffic — the workers cycle fast.
Picture it as a restaurant. Waiters take an order, bring the food, move on. Each waiter serves 20 tables a shift. It works because each interaction is short.
Now put a WebSocket connection on every table. Every customer who sits down gets a dedicated waiter who stays at their table until they leave, whether or not anyone's talking. For 1,000 customers, you need 1,000 waiters. The restaurant's economics fall apart — not because customers got more demanding, but because the waiters never come back to the pool.
This is why WebSockets feel native in Node, Go, Elixir, or recent Java (virtual threads), and awkward in classic thread-per-request stacks. When a framework claims "WebSocket support," what that really means is our concurrency model doesn't assume the short-lived promise anymore.
Load balancers and stickiness
HTTP statelessness is what makes load balancers easy. Any request goes to any backend. If one dies, the next request goes somewhere else.
A WebSocket, once the handshake completes, is anchored. The TCP connection lives on exactly one backend, and whatever in-memory state that backend holds for the client lives there with it — nowhere else.
You don't technically need sticky sessions for the first connection. The load balancer could route the handshake anywhere. But that connection will drop and the client will reconnect — constantly, in production. You want the reconnection landing on a backend that already knows who the client is. That's what sticky sessions do.
And if any backend needs to send a message to a client connected elsewhere, you need a shared message bus. Redis Pub/Sub is the common pick, because the backend with the message isn't necessarily the one holding the connection.
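A sketch of that fan-out with the node-redis client; the channel name, message shape, local `connections` map, and `authenticate()` helper are all inventions of this example:

```js
import { WebSocketServer } from "ws";
import { createClient } from "redis";

const wss = new WebSocketServer({ port: 8080 });
const connections = new Map(); // userId -> socket, for connections anchored HERE

const pub = createClient();  // defaults to localhost:6379
const sub = pub.duplicate(); // a client in subscriber mode can't publish
await pub.connect();
await sub.connect();

wss.on("connection", (ws, req) => {
  const userId = authenticate(req); // hypothetical: derive identity from the handshake
  connections.set(userId, ws);
  ws.on("close", () => connections.delete(userId));
});

// Every backend subscribes; only the one holding the socket delivers.
await sub.subscribe("messages", (raw) => {
  const { to, body } = JSON.parse(raw);
  connections.get(to)?.send(JSON.stringify(body));
});

// Any backend, anywhere in the fleet, can now reach any client:
await pub.publish("messages", JSON.stringify({ to: "user-42", body: { hi: true } }));
```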
None of this is really about WebSockets. It's every layer of infrastructure designed around HTTP's promises having to be rebuilt or worked around once those promises stop being true.
NAT gateways and the silent killer
NAT gateways and routers sit between billions of home networks and the internet, keeping tables of active TCP connections. Because HTTP connections are short-lived, those tables stay small and cheap. To keep them that way, most NAT gateways silently evict connections that have been idle for a few minutes.
Cellular carriers are far more aggressive. Cellular NAT gateways drop idle mappings in as little as 30 seconds, and network transitions like Wi-Fi to cellular or tower handoffs change the client's IP, invalidating the TCP connection entirely. An idle WebSocket gets quietly killed by your user's own carrier, and you don't find out until the first frame fails to send — neither side gets a FIN or RST.
The fix is heartbeats. Ping/pong control frames sent every 20–30 seconds to stay under the shortest NAT timeout. A common rule of thumb is to set your heartbeat interval to 75% of your shortest proxy timeout — so if Nginx or AWS ALB defaults to 60 seconds, send heartbeats every 45 seconds:
```js
// Heartbeat loop on a Node ws server
import { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (ws) => {
  ws.isAlive = true;
  ws.on("pong", () => { ws.isAlive = true; });
});

// Every 30s, ping every client. Terminate any that didn't pong since last tick.
const interval = setInterval(() => {
  for (const ws of wss.clients) {
    if (!ws.isAlive) { ws.terminate(); continue; } // `return` here would skip the remaining clients
    ws.isAlive = false;
    ws.ping();
  }
}, 30_000);

wss.on("close", () => clearInterval(interval));
```
⚠️ Warning: The browser's `WebSocket` API does not expose `ping()`. Browser clients respond to server pings automatically (the RFC requires it), but they can't send their own. If your client needs to detect a dead server, build an application-level heartbeat using regular text messages, like the sketch below.
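A minimal sketch of that application-level heartbeat; the `"ping"`/`"pong"` message strings and the three-miss threshold are conventions of this example, not part of any spec (the server has to echo `"pong"` as an ordinary text message):

```js
// Browser-side dead-server detection with plain text messages.
const ws = new WebSocket("wss://example.com/chat");
let missed = 0;

const beat = setInterval(() => {
  if (missed >= 3) {
    // Three silent intervals: treat the link as dead and force a close,
    // which hands control to your reconnect logic.
    clearInterval(beat);
    ws.close();
    return;
  }
  if (ws.readyState === WebSocket.OPEN) {
    missed++;
    ws.send("ping"); // application message, not an RFC 6455 ping frame
  }
}, 25_000);

ws.addEventListener("message", (e) => {
  if (e.data === "pong") missed = 0; // any echo proves the path is alive
});
ws.addEventListener("close", () => clearInterval(beat));
```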
Reconnection is your problem
When an HTTP request fails, you've got layers of infrastructure catching the failure. CDNs serve stale content. Load balancer health checks remove dead backends. Application code retries with backoff. Worst case, the user hits refresh — and because HTTP is stateless and repeatable, running the request again just works.
When a WebSocket drops, almost none of that activates. Stateless infrastructure can't retry a stateful connection. A CDN can't cache a bidirectional channel. A load balancer can route the next connection, but whatever state both sides were keeping is gone. The browser won't rebuild it for you. You have to.
Networks drop. Servers restart during deploys. Users close laptops, go through tunnels, switch from Wi-Fi to cellular. WebSocket disconnects in production aren't rare — they're routine.
The thundering herd
A team runs a service with a few thousand WebSocket clients. They deploy normally — rolling restart, 90-second gap. The new version comes up and immediately falls over. Roll back. Deploy again next week. Same thing.
The deploy itself wasn't the problem. Every client that got disconnected during the restart reconnected the instant the service came back. Two thousand clients trying to re-establish in the same half-second is several times larger than the service's normal peak traffic.
It's called a thundering herd, and WebSockets make it routine: HTTP never had thousands of simultaneous long-lived connections to drop in the first place.
The fix is three ideas in maybe 40 lines of code. Every WebSocket project eventually writes a version of it:
```js
// Reconnect with exponential backoff + jitter
function connect(url) {
  let attempt = 0;
  const open = () => {
    const ws = new WebSocket(url);
    ws.addEventListener("open", () => {
      // Reset only on STABLE connection, not the moment open fires.
      // A flaky 200ms-then-drop connection should NOT reset attempt to 0.
      setTimeout(() => { if (ws.readyState === WebSocket.OPEN) attempt = 0; }, 5_000);
    });
    ws.addEventListener("close", () => {
      const base = Math.min(30_000, 500 * 2 ** attempt); // 0.5s, 1s, 2s, 4s ... cap 30s
      const jitter = Math.random() * base * 0.3;         // up to 30% random spread
      const delay = base + jitter;
      attempt++;
      setTimeout(open, delay);
    });
    return ws;
  };
  return open();
}
```
Breaking it down:
- Exponential backoff — delay doubles each failed attempt. A client that can't reconnect isn't hammering the server every second.
- Jitter — small random addition so thousands of clients that all failed at the same moment don't all retry in perfect lockstep.
- Reset only on stability — wait until the connection has held for ~5s before resetting `attempt`. Reset the moment `open` fires and a flaky connect-then-drop loop retries at the base 0.5s forever, defeating the backoff; never reset and a recovered client is still stuck waiting 30s between attempts.
And remember — SSE gave you all of this for free. Automatic reconnection with a server-configurable retry interval, built into the browser, with `Last-Event-ID` resume. Because SSE only bent one promise, the browser could keep doing its normal retry behavior. WebSockets broke too many. The browser can reopen the connection, but it has no way to rebuild the state both sides were keeping, so it doesn't try.
HTTP vs SSE vs WebSockets
| Capability | HTTP | SSE | WebSockets |
|---|---|---|---|
| Direction | Request/response | Server → client only | Full duplex |
| Connection lifetime | Short (ms) | Long-lived (one stream) | Long-lived (hours) |
| Browser API | `fetch` | `EventSource` (built-in) | `WebSocket` (built-in) |
| Auto-reconnect | N/A | ✅ Built-in + `Last-Event-ID` | ❌ Roll your own |
| Custom auth headers | ✅ | ❌ (cookies / URL token only) | ❌ During handshake |
| Works through any HTTP infra | ✅ | ✅ | ⚠️ Most, but breaks naive proxies |
| Needs sticky load balancing | ❌ | ❌ | ✅ |
| Needs special concurrency model | ❌ | ⚠️ Many idle connections | ✅ Definitely |
| Needs heartbeats for NAT | ❌ | ⚠️ Some networks | ✅ Always |
| HTTP/2 multiplexing benefit | ✅ | ✅ | ⚠️ Needs RFC 8441 |
| Best for | One-shot ops | Notifications, streams, AI tokens | Chat, games, collab |
When to use, when not to
Use plain HTTP when: the interaction is request → response. Form submits, REST APIs, GraphQL queries, file uploads.
Use SSE when: the server pushes, the client mostly listens. Live dashboards, log tails, notifications, progress updates, AI token streaming (this is what most LLM APIs use), price tickers, deployment status.
Use WebSockets when: both sides genuinely need to push messages independently with low latency. Chat with typing indicators, multiplayer games, collaborative editors (Figma, Google Docs cursor sync), live trading interfaces, screen sharing signaling.
Don't reach for WebSockets just because they sound modern. If your "real-time" feature is one-directional, SSE is half the work and a tenth of the operational pain.
Production checklist
When you do ship WebSockets, you're shipping infrastructure that has to behave correctly across deploys, mobile networks, and rolling failures. Run through this:
- Choose a non-blocking concurrency model — Node, Go, Elixir, or Java virtual threads. Thread-per-connection on a classic stack will kill you at four-digit concurrency.
- Enable sticky sessions on your load balancer (cookie or hash-based) so reconnects land on a backend that remembers the client.
- Wire a shared message bus — Redis Pub/Sub, NATS, or your provider's equivalent — so any backend can deliver to any connection.
- Run application-level heartbeats every 20–30 seconds. HTTP/1.1 infrastructure typically closes idle connections after 30–120 seconds, so a proxy may silently terminate a WebSocket that goes quiet for even half a minute. Heartbeats are non-optional.
- Implement client-side dead-connection detection. If three consecutive heartbeats get no reply, close the socket and reconnect. The browser won't fire `onclose` for silent NAT drops.
- Reconnect with exponential backoff + jitter. Reset `attempt` only after the connection has been stable for several seconds — never the moment `open` fires.
- Plan deploys with the thundering herd in mind. Stagger restarts, run drain logic that closes sockets gracefully with close code `1012` (Service Restart), or ramp clients back via a server-pushed reconnect delay — see the drain sketch after this list.
- Monitor zombie connections. Track `ws.isAlive` flags, missed pongs, and resource counts. A server holding 10,000 dead connections is leaking memory and file descriptors.
- Always use `wss://`. Plain `ws://` is more likely to be blocked by corporate proxies and can't survive a TLS-terminating intermediary.
- Consider managed real-time (Pusher, Ably, Supabase Realtime, AWS API Gateway WebSockets) before hand-rolling. The infra rebuild above is real engineering work, and "we built our own WebSocket fleet" is rarely a moat.
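A sketch of that graceful drain on a Node `ws` server; the 30-second spread window and the shutdown flow are choices of this example, not a standard:

```js
import { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 8080 });

// On SIGTERM, spread close frames across a window instead of dropping
// every socket at once, so the reconnect wave arrives staggered too.
process.on("SIGTERM", () => {
  const clients = [...wss.clients];
  const windowMs = 30_000;
  clients.forEach((ws, i) => {
    setTimeout(() => {
      ws.close(1012, "service restart"); // 1012: IANA-registered Service Restart code
    }, (i / clients.length) * windowMs);
  });
  // Stop accepting new connections once the drain window has passed.
  setTimeout(() => wss.close(), windowMs + 5_000);
});
```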
Conclusion
The web evolves under a constraint: the existing infrastructure can't be upgraded. There's too much of it, and most of it isn't owned by anyone who could change it. Network engineers call this ossification — the protocols at the bottom of the stack are frozen because everything built on top assumes they never change.
The same pattern runs through every major shift in modern networking. HTTP/2 multiplexes streams over a single TCP connection partly because middleboxes had locked in HTTP/1's one-connection-per-request model. QUIC was built on UDP specifically because TCP had become unmodifiable — those same middleboxes inspect and rewrite TCP headers in ways that break any change to the protocol.
WebSockets are what that negotiation looks like when you want bidirectional, real-time, persistent connections on a web that was never designed for any of the three. The handshake is the disguise. The 101 is the moment the disguise comes off. Everything painful about running WebSockets in production — sticky load balancing, thread-pool exhaustion, NAT timeouts, the thundering herd — is the infrastructure catching on, one layer at a time, that the four promises don't hold here anymore.
Pick by direction first. If your feature is one-way, ship SSE today and skip the entire pile of operational work below it. If both sides really need to talk, ship WebSockets — but ship the heartbeats, the sticky sessions, the Redis bus, and the backoff-with-jitter from day one, because every one of them is a promise the old web is no longer keeping for you.
FAQ
Are WebSockets a different protocol from HTTP?
Yes. WebSockets begin life as an HTTP/1.1 GET request with an `Upgrade: websocket` header, but the moment the server responds with `HTTP/1.1 101 Switching Protocols`, both sides stop speaking HTTP. The TCP connection stays open and a binary framing protocol (defined by RFC 6455) flows over it. So WebSockets travel through HTTP infrastructure but are not HTTP after the handshake.
Why do WebSockets need sticky sessions on a load balancer?
After the handshake, the TCP connection is anchored to one specific backend, and any in-memory state for that client lives on that backend only. When the connection drops and the client reconnects (which happens constantly in production), you want the new connection to land on a backend that already remembers the client. Sticky sessions handle that. For cross-backend message delivery, you also need a shared bus like Redis Pub/Sub.
Should I use Server-Sent Events instead of WebSockets?
If your real-time feature is one-directional — notifications, dashboards, log streams, progress bars, AI token streaming — SSE is almost always the better pick. It travels through normal HTTP infrastructure, the browser handles reconnection for you, and there's a `Last-Event-ID` mechanism for resuming where you left off. Reach for WebSockets only when the client genuinely needs to push messages too (chat, multiplayer games, collaborative editors).
Why do WebSocket connections silently drop on mobile?
Cellular carrier-grade NAT gateways drop idle TCP mappings in as little as 30 seconds, and the OS suspends background TCP connections aggressively on iOS and Android. Neither side gets a FIN or RST — the connection just stops working. The fix is application-level ping/pong heartbeats every 20–30 seconds plus client-side dead-connection detection.
What is the magic GUID in the WebSocket handshake?
It's the fixed string `258EAFA5-E914-47DA-95CA-C5AB0DC85B11`, defined in RFC 6455. The server appends it to the client's `Sec-WebSocket-Key`, SHA-1 hashes the result, and base64-encodes it as `Sec-WebSocket-Accept`. It isn't security — both values are public. It's a sanity check that proves the server actually understands WebSockets and is responding to this specific handshake right now, not echoing headers blindly or replaying a cached response.
Why do WebSocket servers need different concurrency models?
The classic thread-per-request model assumes requests are short-lived. WebSocket connections live for hours, so a thread pinned to each connection means thousands of pinned threads, each holding ~1 MB of stack memory. At 10,000 concurrent connections that's roughly 10 GB of thread stacks. Event-loop, goroutine, or virtual-thread models handle idle connections cheaply because they don't tie a kernel thread to each one.
What is a WebSocket thundering herd?
When a server restarts during a deploy, every connected client reconnects the instant the service comes back. Thousands of clients hitting the load balancer in the same half-second can produce traffic several times larger than normal peak. The fix is exponential backoff plus random jitter on every reconnect, with a reset to base delay only on a stable connection — not the moment the socket opens.