15 Data Structures Every Developer Should Actually Understand
April 21, 2026 · 20 min read
From arrays to LRU caches, these 15 data structures sit underneath your database, your browser, your OS scheduler, and your favorite frameworks. This post walks each one — how it works, what it costs, where it shows up in production — with code examples and a big-O cheat sheet at the end.

TL;DR
- Arrays are fast to read, slow to change. Linked lists are the opposite.
- Stacks serve the newest item; queues serve the oldest; deques do both.
- Hash maps and hash sets trade memory for constant-time lookups.
- Trees model hierarchy; BSTs cut search space in half at every step; heaps always give you the most urgent item in O(1).
- Graphs model relationships; tries model prefixes; disjoint sets track connected groups.
- Bloom filters lie occasionally to save huge amounts of memory. LRU caches combine a hash map and a doubly linked list to be O(1) in both directions.
If you build software for a living, these are the fifteen shapes your data keeps trying to fold itself into. Let's walk through each one.
Why this matters
A data structure isn't a library or a language feature. It's a decision about how your data sits in memory and which operations on it stay cheap. Every choice you make — should this be a list, a dict, a set, a Map, an array of objects, a graph of references — is a data structure decision, whether you name it or not.
Get it right and a query that was 1,800 ms becomes 6 ms. Get it wrong and no amount of caching, indexing, or horizontal scaling will save you, because the shape of the data is fighting you at every step.
1. Arrays — the foundation
An array is a fixed-size, ordered collection of items where every item sits at a numbered index. Think seats in a theater: seat 47 is seat 47, you don't need to check seats 1 through 46 first.
The magic is that arrays are contiguous in memory. The computer knows the starting address and the size of each element, so reaching index i is a single calculation:
```
// Conceptually, what the CPU does on arr[i]
address = base_address + i * element_size
```

That's why array access is O(1) — the same speed whether you're reading index 4 or index 40,000.
The trade-off: expanding an array means allocating a new block and copying every element. Inserting in the middle means shifting every later element by one. Reads are cheap, structural changes are not.
Where you'll see them: image pixels, audio samples, video frames, numerical matrices, anywhere position is fixed and known in advance.
💡 Tip: Most "dynamic arrays" (Python's list, JavaScript's Array, Java's ArrayList, Go's slice) are arrays with a resize strategy that grows capacity by a constant factor when full. Appends are amortized O(1), not true O(1).
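To make the resize strategy concrete, here is a toy dynamic array that doubles when full — an illustrative sketch, not how any real runtime implements it (growth factors vary by language):

```python
# A toy dynamic array. Appending is O(1) except when the backing array
# is full, at which point we pay an O(n) copy — rarely, so the cost
# averages out (amortized O(1)).
class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.length = 0
        self.data = [None] * self.capacity

    def append(self, value):
        if self.length == self.capacity:
            self._resize(2 * self.capacity)  # the occasional O(n) copy
        self.data[self.length] = value
        self.length += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self.length):         # copy every element over
            new_data[i] = self.data[i]
        self.data = new_data
        self.capacity = new_capacity

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return self.data[i]
```

After ten appends the capacity has jumped 1 → 2 → 4 → 8 → 16: four copies total, not ten.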
2. Linked lists — flexibility over speed
A linked list is a chain of nodes. Each node holds a value and a pointer to the next node. Unlike arrays, nodes are scattered across memory — they stay connected only through those pointers.
Insertion and removal are O(1) if you have a pointer to the node — you just rewire two pointers. No shifting, no reallocating.
```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Insert after a given node — O(1)
def insert_after(node, value):
    new_node = Node(value)
    new_node.next = node.next
    node.next = new_node
```

The catch: there's no index. To reach the 10,000th element you walk 10,000 pointers. Access is O(n), and CPU cache performance is terrible because the nodes aren't next to each other in memory.
Real-world use: implementation detail inside LRU caches, undo stacks, music playlists, and the internal buckets of many hash map implementations.
3. Stacks — last in, first out
A stack only lets you touch the top. Push adds, pop removes. That's it.
Push and pop are both O(1). The restriction (top only, always) is exactly what makes it fast and easy to reason about.
Where stacks quietly run your world:
- Browser back button — each visited page gets pushed; "back" pops the last one.
- Function call stack — every function call pushes a frame; return pops it. This is why infinite recursion gives you a stack overflow.
- Undo in every editor you use — each action is pushed, Ctrl+Z pops the most recent.
- Expression evaluation — compilers use stacks to parse and evaluate nested expressions.
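That last bullet is worth making concrete. A bracket matcher is about the smallest slice of stack-based parsing there is (a minimal sketch, not taken from any real parser):

```python
# Matching nested brackets with a stack: push every opener, and each
# closer must match the most recent unmatched opener (LIFO order).
def brackets_balanced(text):
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)                  # push the opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                  # closer with no matching opener
    return not stack                          # leftover openers also fail
```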
```python
# A stack is just a list where you only use append() and pop()
stack = []
stack.append("page1")  # push
stack.append("page2")
stack.append("page3")
stack.pop()  # returns "page3" — last in, first out
```

4. Queues — first in, first out
A queue is the opposite of a stack. You join at the back (enqueue) and you're served from the front (dequeue). Like a line at a coffee shop.
Both operations are O(1) when implemented with a linked list or a ring buffer.
Where queues show up:
- Print spoolers — documents print in the order they were submitted.
- Web server request queues — requests are handled FIFO under most configurations.
- Message brokers (RabbitMQ, SQS, Kafka consumer groups) — the whole category is built on queue semantics.
- BFS (breadth-first search) in graphs — the traversal itself requires a queue.
- OS scheduler run queues — processes wait their turn for the CPU.
⚠️ Warning: A naive queue implemented as arr.pop(0) in Python or shift() in JavaScript is O(n), not O(1), because it shifts every remaining element. Use collections.deque in Python or a proper linked-list queue.
5. Deques — both ends live
A deque (double-ended queue, pronounced "deck") lets you add and remove from both ends. It's a shape-shifter: use only one end and it acts like a stack; add to one end and remove from the other and it acts like a queue.
All four operations — push/pop at each end — are O(1).
Where deques are genuinely the right tool:
- Sliding window algorithms — sliding-window maximum, rate limiters, real-time analytics. You add new data on one end and drop stale data on the other.
- Browser history with forward and back — a pure stack can't handle "forward" after several backs; a deque can.
- Undo + redo — same shape as history.
- Work-stealing schedulers (Go, Java ForkJoinPool, Tokio) — threads push to their own deque's tail and steal from other deques' heads.
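The sliding-window case from the first bullet is the canonical deque algorithm. Here is sliding-window maximum (a standard textbook sketch): the deque holds indices whose values are kept in decreasing order, so the front is always the window's max.

```python
from collections import deque

# Sliding-window maximum. New data enters at the back; stale indices
# fall off the front. Each index is pushed and popped at most once,
# so the whole pass is O(n).
def window_max(values, k):
    dq, result = deque(), []
    for i, v in enumerate(values):
        while dq and values[dq[-1]] <= v:
            dq.pop()                    # drop anything smaller than v
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()                # front slid out of the window
        if i >= k - 1:
            result.append(values[dq[0]])
    return result
```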
```python
from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)  # [0, 1, 2, 3]
d.append(4)      # [0, 1, 2, 3, 4]
d.popleft()      # 0
d.pop()          # 4
```

6. Hash maps — the workhorse of modern programming
A hash map stores key-value pairs and retrieves any value in O(1) average time using the key.
The trick is a hash function — a formula that turns any key into an integer, which points to a slot in an underlying array. Store "mom" → "+91 98xxxxxx" and the hash function sends "mom" to slot 4827. Later, searching for "mom" runs the same hash function, gets 4827 again, and jumps straight there.
It doesn't matter if you have 10 entries or 10 million — lookup is constant time. This is the single most important data structure in day-to-day programming.
Where you already use hash maps without thinking:
- Python's dict, JavaScript's Object/Map, Java's HashMap, Go's map, Rust's HashMap, PHP's associative arrays.
- Database hash indexes — the exact-match, unsorted counterpart to B-tree indexes.
- Redis is, at its core, a networked hash map.
- Caches of every kind.
The honest drawbacks:
- Memory overhead — typically 1.5–3x the size of the keys and values alone.
- Collisions (two keys hashing to the same slot) require a resolution strategy (chaining with linked lists, or open addressing with probing). Worst-case lookup degrades to O(n) with an adversarial input.
- No ordering. If you need sorted iteration, use a tree map or a sorted container instead.
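To make the collision story concrete, here is a toy hash map with separate chaining — an illustrative sketch only; real implementations add resizing, load-factor tuning, and randomized hashing:

```python
# Minimal hash map using separate chaining. Each slot holds a list of
# (key, value) pairs; keys that hash to the same slot share a bucket,
# and lookup walks that (usually short) chain.
class ChainedHashMap:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self._bucket(key):     # walk the chain
            if k == key:
                return v
        return default
```

With only two buckets, collisions are guaranteed, yet every key is still found; the chains just get longer, which is exactly the O(n) worst case the text describes.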
```python
# Python dict — a hash map with extra features
user = {"id": 42, "name": "Saurabh", "city": "Mumbai"}
user["name"]           # O(1)
user["email"] = "..."  # O(1)
"name" in user         # O(1)
```

7. Hash sets — membership, nothing more
A hash set is a hash map with no values, just keys. Its entire job is answering one question: "is this item in the set?" — in O(1) time.
Think of a nightclub guest list. The bouncer doesn't care about your table number or your favorite drink. Are you on the list? Yes, in. No, out.
```python
seen_users = set()

def mark_seen(user_id):
    if user_id in seen_users:   # O(1)
        return "already seen"
    seen_users.add(user_id)     # O(1)
    return "first time"
```

Where hash sets earn their keep:
- Deduplication — list(set(items)) is the classic Python trick.
- Visited tracking in graph traversal (BFS/DFS).
- Unique constraints — "is this username taken?", "have I processed this event ID?".
- Set algebra — intersections, unions, differences between two collections. set_a & set_b is O(min(len(a), len(b))).
8. Trees — hierarchy made explicit
A tree is a data structure with one root at the top, branching into child nodes, ending in leaves that have no children. Every node (except the root) has exactly one parent.
Where trees model reality:
- File systems — the canonical example.
- HTML DOM — every tag is a node, nested in its parent.
- AST (abstract syntax trees) — every compiler and every formatter builds one.
- Reddit-style comment threads — replies nest inside replies.
- Decision trees in ML, behavior trees in game AI, scene graphs in 3D engines.
A plain tree has no ordering rule — values go anywhere. That means searching it is O(n) in the worst case. Which is exactly the problem the next structure solves.
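A minimal sketch of why that is: with no ordering rule, the only way to find a value is to visit nodes until you hit it.

```python
# A plain (unordered) tree. Children carry no ordering rule, so search
# has nothing to prune — worst case it visits every node: O(n).
class TreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def contains(node, target):
    if node.value == target:
        return True
    return any(contains(child, target) for child in node.children)
```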
9. Binary search trees — divide and conquer, built-in
A binary search tree (BST) is a tree with one rule: for every node, everything in the left subtree is smaller, everything in the right subtree is larger.
That one rule turns search into binary search. Looking for 40? Start at 50 — smaller, go left. At 30 — larger, go right. At 40 — found. Each step throws away half the remaining tree.
Search, insert, and delete are O(log n) in a balanced BST. In a balanced tree of a million nodes, you need about 20 comparisons to find anything (because log₂(1,000,000) ≈ 20).
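A minimal (deliberately non-balancing) BST shows how little code the rule takes:

```python
# Minimal BST: smaller values go left, larger go right. No rebalancing,
# so this is the "lucky insertion order" version, not a production tree.
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def bst_insert(root, value):
    if root is None:
        return BSTNode(value)
    if value < root.value:
        root.left = bst_insert(root.left, value)
    elif value > root.value:
        root.right = bst_insert(root.right, value)
    return root                      # duplicates are ignored

def bst_search(root, value):
    while root is not None:          # each step discards one subtree
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False
```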
The trap: a plain BST is only balanced if the insertion order is lucky. Insert 1, 2, 3, 4, 5 in order and you get a straight line — effectively a linked list with O(n) operations.
That's why production systems use self-balancing BSTs:
| Variant | Where it's used |
|---|---|
| Red-Black tree | Java TreeMap, C++ std::map, Linux kernel process scheduler |
| AVL tree | Databases where reads massively outnumber writes |
| B-tree / B+ tree | PostgreSQL, MySQL InnoDB, MongoDB, almost every disk-based index |
💡 Tip: When you hear "database index," assume B+ tree unless told otherwise. It's the same BST idea, but with wide fan-out (hundreds of children per node) so each disk read covers more of the search space.
10. Heaps — always the most urgent item first
A heap is a tree that always keeps the highest-priority element at the root.
- Max-heap: root is the largest element.
- Min-heap: root is the smallest element.
Peek the top: O(1). Push or pop: O(log n) because the heap rebalances itself by swapping nodes up or down the tree. Building a heap from an unsorted array: O(n).
Where heaps are the answer:
- Priority queues — OS process schedulers, network packet schedulers, emergency-room triage simulations.
- Dijkstra's algorithm and A* — the shortest-path algorithms behind Google Maps pull the nearest unvisited node from a min-heap on every iteration.
- Top-K problems — "find the 100 most-viewed products" → keep a size-K min-heap, O(n log k).
- Heap sort — a guaranteed O(n log n) sort with O(1) extra space.
- Event simulation — the next event to happen is always at the top of a min-heap keyed by timestamp.
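The top-K pattern from the list above, sketched with the standard library's heapq:

```python
import heapq

# Top-K with a size-k min-heap: anything smaller than the heap's root
# can never be in the top k, so n items cost O(n log k) instead of
# sorting everything at O(n log n).
def top_k(items, k):
    heap = []
    for x in items:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)   # pop the smallest, push x
    return sorted(heap, reverse=True)
```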
```python
import heapq

tasks = []
heapq.heappush(tasks, (1, "critical bug"))    # priority 1
heapq.heappush(tasks, (5, "refactor"))        # priority 5
heapq.heappush(tasks, (2, "security patch"))  # priority 2
heapq.heappop(tasks)  # (1, 'critical bug') — always the lowest priority number
```

⚠️ Don't confuse with memory "heap." The word is also used for a region of process memory (as in "stack and heap"). Completely unrelated concept — same word, different universe.
11. Graphs — the shape of the real world
A graph is a set of nodes (vertices) connected by edges. That's it. And that's also why it's everywhere — any time things are related, you have a graph.
Graph flavors you'll encounter:
| Property | Meaning | Example |
|---|---|---|
| Directed | Edges have direction | Twitter follows (A → B ≠ B → A) |
| Undirected | Edges go both ways | Facebook friendship |
| Weighted | Edges have cost/distance | Road maps, network latency |
| Cyclic | Contains cycles | Most real-world graphs |
| Acyclic (DAG) | No cycles | Build dependencies, blockchain, Git commit history |
Two ways to store a graph:
```python
# Adjacency list — good when the graph is sparse
adj_list = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}

# Adjacency matrix — good when the graph is dense or you need O(1) edge lookups
# adj_matrix[i][j] == 1 means there's an edge from i to j
```

The two fundamental traversals:
- BFS (breadth-first search) — uses a queue, explores layer by layer. Shortest path in unweighted graphs.
- DFS (depth-first search) — uses a stack (or recursion), dives deep first. Cycle detection, topological sort, connected components.
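BFS on the adjacency list from the snippet above (repeated here so the example is self-contained) makes the queue's role visible:

```python
from collections import deque

# BFS shortest path on an unweighted graph. The queue explores layer by
# layer, so the first time we reach a node is also a shortest path to it.
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}

def bfs_shortest_path(adj, start, goal):
    queue = deque([[start]])
    visited = {start}                # a hash set doing its usual job
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in adj[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None                      # goal unreachable
```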
Where graphs show up in production: social networks, route planning (Google Maps, Waze), recommendation systems (PageRank started as a graph algorithm), package managers (npm, pip, apt — all DAGs), Git's commit history, Kubernetes resource dependencies, GraphQL schemas, knowledge graphs behind search engines.
12. Tries — prefix lookups at the speed of typing
A trie (pronounced "try", from retrieval) is a tree where each level represents one character. Words with a shared prefix share a path.
Type app and you've already walked to one node. Every word in the trie under that node is a candidate completion. Prefix lookup is O(k) where k is the length of the prefix — it does not depend on the total number of words.
A hash map can only answer exact-key questions. A trie answers "everything that starts with…" natively.
Where tries show up:
- Autocomplete — Google search, IDE code completion, phone contact search.
- Spell checkers — a trie of valid words is the lookup structure behind red squiggles.
- IP routing tables — longest-prefix match across billions of IP ranges is a radix trie (a compressed trie).
- Word search games — Boggle solvers prune impossible paths by checking the trie.
```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True

    def starts_with(self, prefix):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True
```

⚠️ Warning: Tries can be memory-hungry. A naive trie allocates a dict at every node. For large dictionaries, use a radix trie (path-compressed) or a DAWG (directed acyclic word graph) instead.
13. Disjoint set (Union-Find) — tracking groups that merge
A disjoint set (also called union-find) tracks which items belong to which group, even as groups keep merging.
Every element starts in its own group. Two operations:
- find(x) — which group does x belong to? (returns the group's "leader")
- union(x, y) — merge the groups containing x and y
With two optimizations — path compression (flatten the tree during find) and union by rank (always attach the smaller tree to the larger) — both operations run in effectively O(α(n)), where α is the inverse Ackermann function. In practice, that's less than 5 for any input the universe will ever see.
Where union-find shines:
- Kruskal's MST algorithm — detect whether adding an edge would form a cycle.
- Image processing — connected-component labeling: which pixels form the same object?
- Network connectivity — are these two nodes in the same cluster?
- Game level design — are these two rooms reachable from each other?
- Social dynamics — friend-of-friend group detection.
```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

14. Bloom filters — the data structure that lies on purpose
A bloom filter answers one question: "is this item in the set?" — but with a twist.
- If it says no, it is definitely not in the set. Truth, guaranteed.
- If it says yes, it is probably in the set. Small chance of a false positive.
It never produces a false negative. The mistakes only ever go in one direction.
Under the hood it's just a bit array plus a few hash functions. To add an item, you hash it k times and set those bits to 1. To check, you hash and verify all k bits are 1. If any bit is 0, the item is definitely absent. If all are 1, it might be present — or other insertions might have flipped those bits.
Why trade correctness for this? Space. A bloom filter for 1 million items with a 1% false-positive rate needs about 1.2 MB. A real hash set of 1 million 100-byte strings needs 100+ MB.
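The bit-array-plus-hashes mechanics fit in a few lines. This sketch derives its k hash positions from slices of one SHA-256 digest — an illustrative shortcut, not what production libraries do:

```python
import hashlib

# Minimal bloom filter: a bit array plus k hash positions per item.
# add() sets the bits; might_contain() checks them. A zero bit means
# "definitely absent"; all-ones means only "probably present".
class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        for i in range(self.k):                     # k 4-byte slices
            chunk = digest[i * 4:(i + 1) * 4]
            yield int.from_bytes(chunk, "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Note which assertions are even possible: added items always report present (no false negatives), while an absent item merely usually reports absent.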
Where bloom filters are standard production kit:
- Chrome Safe Browsing has historically used a bloom-filter-like local check to avoid shipping a multi-gigabyte list of malicious URLs to every browser.
- Cassandra, HBase, LevelDB, RocksDB use bloom filters on every SSTable to skip disk reads for keys that definitely aren't there.
- CDNs use them to avoid origin fetches for known-missing assets.
- Ethereum and many other blockchains use bloom filters on block headers so light clients can cheaply check if a block contains relevant events.
- Spell checkers use them as a fast first pass before an expensive dictionary lookup.
- Username availability checks often use bloom filters to reject obviously-taken names without hitting the database.
💡 Variants you'll hear about: Counting bloom filters (support deletion), Cuckoo filters (better false-positive rate, support deletion), and XOR filters (smaller than bloom for the same FP rate).
15. LRU cache — knowing what to forget
An LRU cache (Least Recently Used) is a fixed-size cache that evicts the item you haven't touched the longest when space runs out.
The genius is in the construction: it's a hash map combined with a doubly linked list.
- The hash map gives you O(1) lookup by key.
- The doubly linked list gives you O(1) reordering and eviction.
On every access, move the accessed node to the head. When the cache is full and a new item comes in, evict the tail. Both get and put are O(1).
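Written out by hand, that construction looks like this (a from-scratch sketch; Python's collections.OrderedDict, used in the next snippet, hides these pointers for you):

```python
# LRU cache from scratch: a dict for O(1) key lookup, plus a doubly
# linked list with sentinel head/tail nodes for O(1) recency reordering.
class LRU:
    class _Node:
        __slots__ = ("key", "value", "prev", "next")
        def __init__(self, key=None, value=None):
            self.key, self.value = key, value
            self.prev = self.next = None

    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}
        self.head = self._Node()     # sentinel: most-recent side
        self.tail = self._Node()     # sentinel: least-recent side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1
        node = self.map[key]
        self._unlink(node)
        self._push_front(node)       # touched, so now most recent
        return node.value

    def put(self, key, value):
        if key in self.map:
            self._unlink(self.map.pop(key))
        node = self._Node(key, value)
        self.map[key] = node
        self._push_front(node)
        if len(self.map) > self.capacity:
            lru = self.tail.prev     # least recently used sits at the tail
            self._unlink(lru)
            del self.map[lru.key]
```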
```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.cache = OrderedDict()
        self.capacity = capacity

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # mark as most-recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least-recently used
```

Where LRU caches run the world:
- CPU caches — L1, L2, L3 all use LRU-like policies.
- OS page caches — the kernel decides which memory pages to swap out.
- Database buffer pools — PostgreSQL, MySQL.
- Redis — the allkeys-lru eviction policy is exactly this.
- HTTP caches — browsers, CDNs, reverse proxies.
- Application-level memoization — @functools.lru_cache in Python.
Complexity cheat sheet
Keep this table close. It's the single most useful piece of reference for picking the right structure:
| Data structure | Access | Search | Insert | Delete | Space |
|---|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) | O(n) |
| Linked list | O(n) | O(n) | O(1)* | O(1)* | O(n) |
| Stack | O(n) | O(n) | O(1) | O(1) | O(n) |
| Queue | O(n) | O(n) | O(1) | O(1) | O(n) |
| Deque | O(n) | O(n) | O(1) | O(1) | O(n) |
| Hash map | — | O(1) avg | O(1) avg | O(1) avg | O(n) |
| Hash set | — | O(1) avg | O(1) avg | O(1) avg | O(n) |
| BST (balanced) | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| Heap | O(1) peek | O(n) | O(log n) | O(log n) | O(n) |
| Trie | — | O(k) | O(k) | O(k) | O(n·k) |
| Union-Find | — | O(α(n)) | O(α(n)) | — | O(n) |
| Bloom filter | — | O(k) | O(k) | — | O(m) |
| LRU cache | — | O(1) | O(1) | O(1) | O(capacity) |
* Given a pointer to the node. Finding the node is still O(n).
Decision guide: what to reach for
| Need | Reach for |
|---|---|
| Fast index-based access to fixed-size data | Array |
| Frequent insertion/removal at known positions | Linked list |
| LIFO order (most recent first) | Stack |
| FIFO order (oldest first) | Queue |
| Both ends live, sliding windows | Deque |
| Fast key-based lookup | Hash map |
| Membership / deduplication | Hash set |
| Hierarchical relationships | Tree |
| Sorted data with range queries | BST (or B-tree for disk) |
| "Give me the highest priority item, fast" | Heap |
| Relationships, networks, paths | Graph |
| Prefix search, autocomplete | Trie |
| "Are these in the same group?" under merges | Union-Find |
| Probabilistic membership with tiny memory | Bloom filter |
| Bounded cache with recency eviction | LRU cache |
Production best practices
- Pick structures based on the hot path, not the average case. A million reads and one insert? Optimize for reads. It's the opposite for write-heavy workloads.
- Prefer the built-ins. dict, set, list, heapq, collections.deque, OrderedDict in Python; Map, Set, Array in JavaScript. They're C-implemented, battle-tested, and fast.
- Measure before optimizing. Profile first. A hash map might look O(1) but have terrible cache locality for small n; an array might beat it at 10 elements.
- Watch out for worst-case behavior. Hash maps degrade with adversarial keys — use randomized hashes (Python, Rust, and Go do this by default). BSTs degrade with sorted input — use self-balancing variants.
- Fix the constant factor, not just the big-O. O(n log n) with a 10x constant can lose to O(n²) at small n. CPUs love arrays because of cache lines.
- Never reinvent a bloom filter, an LRU cache, or a union-find in production. Use a vetted library — redis-py-cluster for distributed LRU, pybloom-live for bloom filters, NetworkX for graphs.
- Pay attention to memory layout. An array of structs vs. a struct of arrays can be a 10x difference in real workloads, especially in numerical code.
- When in doubt, sketch the data. Draw the shape on paper. The right structure usually becomes obvious the moment you visualize what's actually going on.
When not to obsess over data structures
- For problems with fewer than a thousand items, the difference between O(n) and O(log n) is lost in noise. Readability wins.
- For I/O-bound code (database calls, HTTP requests), the data structure almost never matters — the network does.
- For one-off scripts, use the ugliest thing that works and move on.
The goal is knowing which tool to grab, not showing off which ones you know.
Conclusion
I've been writing code for nearly a decade, across Laravel, Node, Python, and FastAPI, and the single biggest separator between "it works" and "it scales" has almost always been a data structure decision somewhere deep in the stack. The frameworks change. The languages change. These fifteen shapes don't.
If you want to get better at this: the next time you feel tempted to write a nested loop, stop and ask whether a hash set would make it one loop. The next time you're about to .push and .shift from an array, ask if you actually need a deque. The next time a query is slow, think about whether it's scanning when it should be looking up.
The structures themselves are not the point. The habit of recognizing which one the problem is asking for — that's the skill worth building.
Start there, and your code will feel different within a month.
FAQ
Which data structure should I learn first?
Arrays, linked lists, hash maps, stacks, and queues. Everything else — trees, graphs, heaps, tries — either builds on these or compares against them. Master the first five and the rest become much easier.
Why is array access O(1) but linked list access O(n)?
Arrays sit in one contiguous block of memory, so the address of index i is literally base_address + i * element_size — one arithmetic jump. Linked lists scatter nodes across memory and connect them by pointers, so reaching index i means walking i pointers from the head.
What's the difference between a hash map and a hash set?
A hash map stores key-value pairs (e.g., username → user object). A hash set stores only keys (e.g., the set of taken usernames). Hash sets are what you use when you only care whether something exists, not what it points to.
When should I use a trie instead of a hash map for lookups?
Use a trie when you need prefix operations — autocomplete, spell check, 'find all words starting with app'. A hash map can only answer exact-key questions; a trie answers prefix questions in O(prefix length) time.
Why would anyone use a bloom filter if it can give false positives?
Because it uses 5–10x less memory than a real set and answers in constant time. You use it as a first-line filter: the bloom filter rules out the 99% of cases that definitely don't match, and you only hit the expensive source of truth for the 'maybe' cases. Chrome, Ethereum, and databases like Cassandra all use this pattern.
Is LRU the only cache eviction policy?
No — there's LFU (least frequently used), FIFO, random, ARC, and several ML-based policies. LRU wins most of the time because it's simple, O(1) for both read and write, and matches real access patterns surprisingly well. It's the default in Redis, operating system page caches, and most CDNs.
Do I really need to know this in the age of AI-generated code?
Yes, and arguably more than before. AI assistants will happily suggest the wrong data structure for your workload — an O(n²) loop where a hash set would be O(n), or a graph traversal that blows the stack. Recognizing the right shape of the problem is still a human skill.