June 1, 2026 · 9 min read
5 WebSocket Design Decisions Most Developers Never Learn
Millions of developers ship chat apps, live dashboards, and multiplayer games on WebSockets without understanding the protocol's hidden design choices. This deep-dive unpacks five of them — the magic GUID handshake, TCP slow-start exploitation, client-side masking, firewall-friendly ports, and why HTTP/2 sometimes wins. You'll walk away able to explain the protocol at a level that holds up in any system design interview.

You use WebSockets constantly. Chat apps, live dashboards, multiplayer games, collaborative editors — they all run on this protocol. But most developers who ship WebSocket code every day have never asked why it works the way it does.
Why does the handshake involve a weird magic string? Why does every message from your browser get masked with XOR? And here's the one that surprises people: WebSockets piggyback on HTTP in a way that tricks TCP into performing better. Let's unpack five design decisions that most developers never learn.
TL;DR
- The `Sec-WebSocket-Key` → `Sec-WebSocket-Accept` handshake proves the server actually speaks WebSocket, which defeats proxies that might fake a 101 response.
- The HTTP upgrade request isn't just ceremony; those bytes warm up TCP's congestion window so real frames don't pay the slow-start penalty.
- Client-to-server frames are masked with XOR to prevent cache-poisoning attacks; server-to-client frames aren't, because servers are trusted.
- WebSockets run on ports 80 and 443 so they slip through corporate firewalls disguised as web traffic.
- WebSockets have no built-in multiplexing — for one-way server push, HTTP/2 + Server-Sent Events is often the better tool.
Why this matters
Rewind to 2005. You're building a chat application. How do you push a message from the server to the browser when a friend sends it?
HTTP is request-response. The browser asks, the server answers — that's the whole model. The server can't just call the browser. So early developers invented hacks:
- Polling — ask the server every few seconds: "Got anything? Got anything?" It works, but it burns bandwidth and server resources even when nothing is happening.
- Long polling — slightly better. The client asks, and the server holds the connection open until it has something to say. The client then immediately reconnects and waits again. But you're still constantly tearing down and rebuilding HTTP connections.
Then WebSockets arrived with a simple idea: keep one TCP connection open and let both sides send messages whenever they want. One connection, full-duplex, no more request-response dance.
The clever part isn't the open connection itself — it's how WebSockets establish it.
How the connection starts: HTTP in disguise
WebSockets don't create a special connection from scratch. They start as a regular HTTP GET request that politely asks to switch protocols.
GET /chat HTTP/1.1
Host: chat.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
The server replies:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Breaking it down:
- `Upgrade: websocket` + `Connection: Upgrade`: tells the server (and any intermediaries) the client wants to change protocols.
- `101 Switching Protocols`: from this point forward, both sides stop speaking HTTP and start speaking WebSocket.
- The same TCP connection stays open. No new handshake, no new socket.
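To make the request side of that exchange concrete, here is a minimal sketch of how a server might decide whether an incoming request is a well-formed upgrade. The function name `isWebSocketUpgrade` is hypothetical, and the sketch assumes `headers` is a plain object with lower-cased names (the shape Node's `http` module hands you). It checks only the headers shown in the example request, not everything RFC 6455 requires.

```javascript
// Hypothetical helper: is this parsed HTTP request a WebSocket upgrade?
// Assumes `headers` uses lower-cased header names.
function isWebSocketUpgrade(method, headers) {
  if (method !== 'GET') return false;

  // Upgrade must name "websocket" (compared case-insensitively).
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  if (upgrade !== 'websocket') return false;

  // Connection is a comma-separated token list that must include "upgrade".
  const connectionTokens = (headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  if (!connectionTokens.includes('upgrade')) return false;

  // The key must decode to exactly 16 bytes. Buffer's base64 decoder is
  // lenient about malformed input, which is acceptable for a sketch.
  const key = headers['sec-websocket-key'] || '';
  if (Buffer.from(key, 'base64').length !== 16) return false;

  // This sketch only accepts protocol version 13.
  return headers['sec-websocket-version'] === '13';
}
```

Feeding it the headers from the example request above returns `true`; dropping the `Upgrade` header or sending an empty key returns `false`.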
Simple on the surface. But hidden inside this exchange are five genuinely clever decisions.
1. The handshake key: proving the server really speaks WebSocket
Look at that Sec-WebSocket-Key again. The client sends a random base64 string, and the server sends back a different string in Sec-WebSocket-Accept. Why the round trip?
The server takes the client's key, concatenates it with a fixed magic GUID, hashes it with SHA-1, and base64-encodes the result.
const crypto = require('crypto');
const GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';
function computeAccept(clientKey) {
  return crypto
    .createHash('sha1')
    .update(clientKey + GUID) // concatenate, then SHA-1
    .digest('base64'); // base64-encode the 20-byte hash
}
// computeAccept('dGhlIHNhbXBsZSBub25jZQ==')
// → 's3pPLMBiTxaQ9kYGzzhZRbK+xOo='
The point of this dance is to confirm that the server actually understands WebSocket and is not just a generic HTTP server that happens to return 101 by accident. A plain HTTP server or a transparent proxy wouldn't know to perform this specific GUID-and-SHA-1 computation, so its response would never produce the value the client expects.
This is exactly what protects you from caching intermediaries. Without the GUID check, a proxy could receive the upgrade request, cache the response, and replay it later. Imagine a transparent proxy — your ISP's, your corporate network's — that caches a 101 response for /chat. A second user requesting the same URL gets the cached 101 and thinks it has a WebSocket connection, but it's really talking to the proxy. With the handshake, the cached Sec-WebSocket-Accept won't match the new client's key, and the connection fails safely.
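The client's half of that check can be sketched the same way. This is an illustration rather than how any particular browser implements it; `generateKey` and `isValidAccept` are hypothetical names. The client creates a fresh 16-byte random key for each connection, then recomputes the expected accept value and compares it with what the server sent.

```javascript
const crypto = require('crypto');

const GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Generates a fresh handshake key: 16 random bytes, base64-encoded.
function generateKey() {
  return crypto.randomBytes(16).toString('base64');
}

// Recomputes the expected accept value and compares it with the server's.
function isValidAccept(clientKey, serverAccept) {
  const expected = crypto
    .createHash('sha1')
    .update(clientKey + GUID)
    .digest('base64');
  return expected === serverAccept;
}
```

A replayed 101 response carries an accept value derived from some earlier client's key, so `isValidAccept` returns `false` for the new client's key and the connection is abandoned.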
Important: The magic GUID `258EAFA5-E914-47DA-95CA-C5AB0DC85B11` has no cryptographic significance. It is a fixed constant from RFC 6455, section 4.2.2. It's the process that matters, not the value, and this is not authentication or encryption.
2. The upgrade request secretly warms up TCP
This one is subtle but beautiful. TCP has a feature called slow start.
When you open a new TCP connection, it doesn't blast data at full speed. It sends a little, waits for acknowledgements, then gradually increases the send rate. This is smart congestion control — but it means brand-new connections are slow at first. You have to warm up the pipe.
Here's the clever part. The HTTP upgrade request isn't just protocol formality. The bytes exchanged during the handshake give TCP acknowledged data to work with, and acknowledged data is what grows the congestion window. By the time you send your first real WebSocket frame, the connection has already completed a data-carrying round trip, so you're not starting from a completely cold window. The handshake is only a few hundred bytes, so the head start is modest, but it costs nothing extra.
If WebSockets had invented their own protocol on a custom port with a custom handshake, that handshake would be extra round trips spent purely on setup. By reusing HTTP, the cost is amortized into work you were doing anyway.
💡 Tip: This is why the HTTP request is the "bait." It looks like overhead, but it's secretly priming your TCP pipe so application data flows fast from the first frame.
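To get a feel for why the starting window matters, here is a deliberately idealized model of slow start. It assumes the window doubles every round trip, with no packet loss and no receiver limit; real TCP stacks are more nuanced, and how far a small handshake actually grows the window depends on the stack. The function name and the window sizes used below are illustrative assumptions.

```javascript
// Idealized slow start: the congestion window (in segments) doubles every
// round trip, with no loss and no receiver-window limit. Returns how many
// round trips it takes to deliver `totalSegments`.
function roundTripsToSend(totalSegments, initialWindow) {
  let windowSize = initialWindow;
  let sent = 0;
  let roundTrips = 0;
  while (sent < totalSegments) {
    sent += windowSize; // send a full window this round trip
    windowSize *= 2;    // every ACKed segment grows the window by one
    roundTrips += 1;
  }
  return roundTrips;
}

console.log(roundTripsToSend(100, 10)); // prints 4
console.log(roundTripsToSend(100, 20)); // prints 3
```

In this toy model, a connection that starts a 100-segment burst with a window of 20 instead of 10 finishes one round trip sooner. That is the kind of saving a warmed-up connection buys.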
3. Client masking: defeating cache poisoning
Here's something that looks strange at first. Every frame sent from client to server must be masked — XORed with a random 4-byte key. But frames going server to client are never masked. Why the asymmetry?
Picture a malicious web page. It opens a WebSocket connection to a server the attacker controls and carefully crafts its frames to look exactly like a plain HTTP request, say a GET for jquery.js on a popular host. If a transparent proxy in the middle doesn't understand WebSocket, it might read those bytes as a new request, treat whatever the attacker's server sends back as the response, and think: "this looks like jquery.js, let me cache it." The next user who requests jQuery through that proxy gets the attacker's malicious code instead. That's cache poisoning.
Client-to-server masking XORs the payload with a random 32-bit key to prevent these cache-poisoning attacks on intermediary proxies. Because the client picks a fresh random key per frame, the attacker can't predict what actually hits the wire and can't forge bytes that resemble a valid HTTP request.
Client frame (masked): 81 85 37 FA 21 3D 7F 9F 4D 51 58 ...
                       │  │  └─────────┘ └────────────┘
                       │  │   mask key   masked payload
                       │  └── MASK bit + payload length (5 = "Hello")
                       └── FIN + opcode (text frame)
So why don't servers mask? Masking can have a significant performance impact. The attack only works against untrusted, sandboxed browser clients whose traffic an attacker can shape. Servers are trusted endpoints with no reason to poison their own infrastructure, so the spec only mandates masking in the direction where the threat actually exists.
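The masked "Hello" frame shown earlier can be reproduced in a few lines. This is a sketch limited to short text frames (payloads under 126 bytes), and `applyMask` and `buildMaskedTextFrame` are hypothetical names. The mask key is hard-coded here only so the output matches the example bytes; a real client would draw a fresh key per frame, for instance with `crypto.randomBytes(4)`.

```javascript
// XORs each payload byte with the mask key byte at position i mod 4.
// Applying it twice with the same key restores the original bytes.
function applyMask(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}

// Builds a single masked text frame for payloads under 126 bytes.
function buildMaskedTextFrame(text, maskKey) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) {
    throw new RangeError('this sketch only handles short frames');
  }
  const header = Buffer.from([
    0x81,                  // FIN = 1, opcode = 0x1 (text)
    0x80 | payload.length, // MASK = 1, 7-bit payload length
  ]);
  return Buffer.concat([header, maskKey, applyMask(payload, maskKey)]);
}

const key = Buffer.from([0x37, 0xfa, 0x21, 0x3d]); // fixed only for the demo
console.log(buildMaskedTextFrame('Hello', key).toString('hex'));
// prints "818537fa213d7f9f4d5158"
```

Because XOR is its own inverse, the server unmasks by running the same `applyMask` over the payload with the key it read from the frame header.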
⚠️ Warning: Masking is not encryption. It does not hide your data from anyone deliberately reading the stream. Use `wss://` (TLS) for confidentiality.
4. Ports 80 and 443: disguised for firewall traversal
This is quick but important. WebSockets run on port 80 (ws://) and port 443 (wss://) — the same ports as HTTP and HTTPS. That's no accident.
Corporate firewalls typically block everything except web traffic. If WebSockets used port 8080 or some custom port, they'd be blocked in half the environments where people actually need them. And because the connection starts as HTTP, even deep-packet-inspection firewalls that examine content see a perfectly normal HTTP request go by. By the time it upgrades, the connection is already established.
The design goal, straight from the protocol's overview, is to first use HTTP to traverse network intermediaries and then use the established end-to-end underlying TCP/SSL channel for bidirectional application communication.
This is pragmatic protocol design. WebSockets work in the real world because they disguise themselves as HTTP until they're safely through the door.
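If you ever need to know which port a given WebSocket URL will actually use, a small helper can spell out the defaults. This sketch relies on the standard `URL` class, which leaves `port` empty when the URL uses its scheme's default; `effectivePort` is a hypothetical name.

```javascript
// Returns the TCP port a WebSocket URL will connect to.
// The URL parser leaves `port` empty when the URL uses the scheme's
// default, so 80 or 443 is filled in here.
function effectivePort(wsUrl) {
  const url = new URL(wsUrl);
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
    throw new TypeError(`not a WebSocket URL: ${wsUrl}`);
  }
  if (url.port !== '') return Number(url.port);
  return url.protocol === 'wss:' ? 443 : 80;
}

console.log(effectivePort('ws://chat.example.com/chat'));  // prints 80
console.log(effectivePort('wss://chat.example.com/chat')); // prints 443
```

An explicit port such as `wss://chat.example.com:8443/chat` is returned as-is, which is exactly the case that restrictive firewalls tend to block.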
5. No multiplexing: where WebSockets show their age
Say your app has three real-time features: chat, notifications, and live stock prices. With classic WebSockets, you might open three separate connections — three TCP sockets, three handshakes, three slow starts, three chunks of server memory.
HTTP/2 changed the math. It has multiplexing built in: one TCP connection carries many independent streams, so several Server-Sent Events (SSE) feeds can share a single connection. For one-directional server-to-client updates, that's often simpler and more efficient than juggling WebSocket connections.
| Capability | WebSockets | HTTP/2 + SSE |
|---|---|---|
| True bidirectional | ✅ Native full-duplex | ❌ Server push is one-way |
| Multiplexing | ❌ One stream per connection¹ | ✅ Many streams, one connection |
| Server → client push | ✅ | ✅ (SSE) |
| Client → server push | ✅ | ❌ Needs separate request |
| Best for | Gaming, collaborative editing | Live feeds, notifications, dashboards |
¹ RFC 8441 later defined WebSocket bootstrapping over HTTP/2 via Extended CONNECT, letting multiple WebSocket connections share one TCP connection — but classic RFC 6455 WebSockets over HTTP/1.1 don't multiplex.
WebSockets still win whenever both sides need to send unpredictably — multiplayer games, collaborative editors, anything genuinely bidirectional. But for "server tells client when something changed," reach for HTTP/2 and SSE first.
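Part of SSE's appeal is how little there is to it on the wire: plain text fields, one per line, with a blank line ending each event. Here is a minimal sketch of a formatter, where `formatSseEvent` and the sample event are hypothetical. It assumes the `event` and `id` values contain no newlines.

```javascript
// Formats one Server-Sent Events message. Each field is "name: value" on
// its own line, and a blank line terminates the event.
function formatSseEvent({ event, id, data }) {
  const lines = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  // Multi-line data is sent as one "data:" line per line of text.
  for (const line of String(data).split('\n')) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}

console.log(JSON.stringify(formatSseEvent({ event: 'price', id: 7, data: 'AAPL 189.20' })));
// prints "event: price\nid: 7\ndata: AAPL 189.20\n\n"
```

A server would write strings like this to a response whose `Content-Type` is `text/event-stream`, and the browser would read them through the `EventSource` API.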
When to use WebSockets — and when not to
Reach for WebSockets when:
- You need genuine bidirectional, low-latency messaging (chat, multiplayer, live collaboration).
- Both client and server send messages at unpredictable times.
- You're behind restrictive firewalls and need `wss://` on port 443 to get through.
Reach for something else when:
- You only need server-to-client updates → HTTP/2 + Server-Sent Events is simpler and reconnects automatically.
- You have several independent real-time feeds and want to avoid connection sprawl → HTTP/2 multiplexing or WebSocket-over-HTTP/2 (RFC 8441).
- The data is request-response shaped → plain HTTP is fine; don't over-engineer.
Conclusion
I keep coming back to WebSockets in interviews and design reviews because they're a masterclass in pragmatic engineering. Nothing about the protocol is arbitrary: the handshake key proves the server speaks WebSocket and defeats proxy replays; the HTTP upgrade quietly warms up TCP's congestion window; client masking shuts down cache poisoning; ports 80 and 443 buy firewall traversal for free; and the honest admission that there's no built-in multiplexing tells you exactly when to reach for HTTP/2 instead.
Next time you wire up a real-time feature, pause before defaulting to a WebSocket. If it's one-way server push, SSE will probably serve you better. If it's truly bidirectional, you now know precisely why the protocol bends over backwards to look like HTTP — and you can explain every byte of that handshake to anyone who asks.
FAQ
What is the WebSocket magic GUID string?
It's the fixed constant `258EAFA5-E914-47DA-95CA-C5AB0DC85B11` defined in RFC 6455. The server concatenates it with the client's `Sec-WebSocket-Key`, hashes the result with SHA-1, and base64-encodes it to produce `Sec-WebSocket-Accept`. It has no cryptographic significance — it just proves the server actually understands WebSocket rather than being a plain HTTP server or proxy that returned 101 by accident.
Why are WebSocket frames masked from client to server?
Client-to-server frames are XORed with a random 32-bit key to prevent cache-poisoning attacks against intermediary proxies. Without masking, a malicious page could craft frames that look like a valid HTTP request, and a confused proxy might cache the attacker-supplied response and serve it to other users.
Why don't servers mask WebSocket frames?
The masking attack only works against unpredictable, attacker-controlled clients running in a browser sandbox. Servers are trusted endpoints and have no incentive to poison their own infrastructure's caches, so the spec only mandates masking in the client-to-server direction.
Do WebSockets use the same ports as HTTP?
Yes. `ws://` defaults to port 80 and `wss://` defaults to port 443 — the same ports as HTTP and HTTPS. This is deliberate: corporate firewalls typically block everything except web traffic, so reusing these ports lets WebSockets traverse restrictive networks.
Is HTTP/2 a replacement for WebSockets?
Not entirely. HTTP/2 multiplexing plus Server-Sent Events is often simpler and more efficient for server-to-client push. But for true bidirectional communication where both sides send unpredictably — gaming, collaborative editing — WebSockets still win. RFC 8441 also lets WebSockets bootstrap over a single HTTP/2 connection.
What's the difference between long polling and WebSockets?
Long polling keeps an HTTP request open until the server has data, then the client immediately reconnects — you keep tearing down and rebuilding connections. WebSockets open one TCP connection and keep it open for full-duplex communication, eliminating the request-response cycle entirely.