May 23, 2026 · 14 min read
How a 4-Day-Old Account Reaches 10 Million Strangers: Inside Instagram's Recommendation System
This week a 4-day-old account crossed 10 million followers and overtook a major political party. The follower count is the headline, but the real story is the architecture underneath. We break down how Instagram went from a chronological photo stream to a four-stage recommendation funnel, covering fan-out, two-tower retrieval, multi-stage ranking, and cold start along the way, and what every backend engineer can steal from it.

TL;DR
- Instagram quietly turned from a social network into a recommendation system. The follow edge used to be a hard requirement to reach someone — now it isn't.
- The old feed used fan-out on write (a push model): copy each post into every follower's precomputed list so reads stay cheap. It breaks for huge accounts — the celebrity problem — so platforms go hybrid.
- The new feed is a pull model built on an interest graph. Every app open triggers a four-stage funnel: retrieval → first-stage ranking → second-stage ranking → re-ranking.
- Retrieval uses a two-tower network + approximate nearest neighbor (ANN) search to go from billions of posts to a few thousand candidates, fast.
- Ranking trims thousands of candidates down with a lightweight model (one distilled from the heavy model), then scores the survivors with a heavy multi-task model that predicts likes, saves, shares, and negative feedback, all combined into one value model score.
- Cold start is solved by understanding content directly and running a small audition with non-followers, watching the rate (sends/likes per view), not the total. A good rate earns a bigger audience, round after round.
Why this matters
This week an Instagram account went from zero to over 10 million followers — not in a year, in about 4–5 days, with roughly 50 posts. It crossed the follower count of one of the biggest political parties in India, an account that had been posting for years.
The account behind it, the satirical Cockroach Janta Party, amassed over 10 million Instagram followers in just five days and surpassed the BJP's Instagram following of about 8.7 million. Source: O Heraldo
But the follower count isn't the interesting part. Here's the question that should bother you as an engineer: a 4-day-old account starts with zero followers. So how does its content reach 10 million feeds?
The answer is the system underneath. Over the last few years Instagram slowly stopped being a social network and became a recommendation engine — and the same machinery that pushed an unknown account to 10 million strangers is deciding what's in your feed right now. Let's reverse-engineer it, stage by stage.
Phase 1: The follow graph (2010)
When Instagram launched in 2010, the feed was almost embarrassingly simple. You follow some people, you open the app, you see their posts — newest first. That's it.
Under the hood, this is just a graph. Every user is a node. When you follow someone, that creates an edge between you and them. To build your feed, the system walks the edges connected to you, grabs the latest posts, sorts by time, and renders them.
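The graph walk described above can be sketched in a few lines of Python. This is a toy in-memory version, not Instagram's implementation: the `Post` type, the `follows` map, and the `posts_by_author` map are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    created_at: int  # Unix timestamp in seconds
    text: str


def chronological_feed(user, follows, posts_by_author, limit=20):
    """Walk the user's follow edges, gather posts, and sort newest first."""
    candidates = []
    for followee in follows.get(user, set()):
        candidates.extend(posts_by_author.get(followee, []))
    candidates.sort(key=lambda post: post.created_at, reverse=True)
    return candidates[:limit]


# Hypothetical data: "you" follows alice and bob, but not carol.
follows = {"you": {"alice", "bob"}}
posts_by_author = {
    "alice": [Post("alice", 100, "sunset"), Post("alice", 300, "coffee")],
    "bob": [Post("bob", 200, "dog")],
    "carol": [Post("carol", 400, "never shown: no follow edge")],
}
feed = chronological_feed("you", follows, posts_by_author)
print([post.text for post in feed])  # ['coffee', 'dog', 'sunset']
```

Note that the read does one lookup per followee, so the cost of every app open grows with the number of accounts you follow.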
How do you make that fast?
You don't want to traverse the graph and query every followee's posts on every app open. So Instagram used a trick: when I post a photo, the system doesn't wait for my followers to open the app. It immediately copies that post into a ready-made feed list for each of my followers.
This is fan-out on write. You do the heavy work at write time, once, so that read time stays cheap.
Write time (rare): I post once → system writes into N follower feeds
Read time (frequent): you open app → just read your prebuilt list
Breaking it down:
- A feed is read far more often than it is written — you post once a day, you open the app a dozen times a day.
- So you move the cost to the rare event (the write) and keep the frequent event (the read) trivial.
- This is a push model: content is pushed toward consumers ahead of time.
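Here is a minimal sketch of fan-out on write, assuming an in-memory store. The class name `PushFeedStore` and the `FEED_LENGTH` cap are hypothetical; a real system would keep these lists in a cache or database rather than in a Python dict.

```python
from collections import defaultdict, deque

FEED_LENGTH = 500  # hypothetical cap on how many post ids each feed keeps


class PushFeedStore:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> follower ids
        self.feeds = defaultdict(lambda: deque(maxlen=FEED_LENGTH))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def publish(self, author, post_id):
        """Write path: one write per follower. Returns the number of writes."""
        for follower in self.followers[author]:
            self.feeds[follower].appendleft(post_id)
        return len(self.followers[author])

    def read_feed(self, user, limit=20):
        """Read path: no graph walk, just slice the prebuilt list."""
        return list(self.feeds[user])[:limit]


store = PushFeedStore()
store.follow("you", "alice")
store.follow("sam", "alice")
writes = store.publish("alice", "p1")  # 2 writes, one per follower
store.publish("alice", "p2")
print(store.read_feed("you"))  # ['p2', 'p1']
```

The value returned by `publish` is the number to watch: it equals the follower count, which is fine at 2 and painful at tens of millions.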
The catch: the celebrity problem
Now picture an account with 50 million followers. One post means 50 million copies — 50 million writes for a single post. This is the famous celebrity problem, and it's one of the most common system design interview questions out there.
The standard fix is a hybrid fan-out:
| Account type | Strategy | Why |
|---|---|---|
| Normal accounts | Fan-out on write (push) | Few followers → cheap to precompute feeds |
| Very large accounts | Fan-out on read (pull) | Millions of followers → fetch the post only when a follower opens the app |
Normal accounts get pushed. Celebrity posts get pulled at read time and merged in. Hold onto that push vs pull idea — it comes back with a vengeance.
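The hybrid in the table can be sketched as follows. Everything here is an assumption for illustration: the class name, the tiny `CELEBRITY_THRESHOLD`, and the in-memory dicts. The point is only the shape: small accounts write into follower lists, large accounts write once, and the read path merges both.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical; a real cutoff would be far larger


class HybridFeedStore:
    def __init__(self):
        self.followers = defaultdict(set)         # author -> follower ids
        self.following = defaultdict(set)         # user -> followed authors
        self.pushed = defaultdict(list)           # user -> [(timestamp, post_id)]
        self.celebrity_posts = defaultdict(list)  # author -> [(timestamp, post_id)]

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def publish(self, author, post_id, timestamp):
        """Returns how many writes this post cost."""
        if self.is_celebrity(author):
            # Fan-out on read: store the post once, under the author.
            self.celebrity_posts[author].append((timestamp, post_id))
            return 1
        # Fan-out on write: copy the post into every follower's list.
        for follower in self.followers[author]:
            self.pushed[follower].append((timestamp, post_id))
        return len(self.followers[author])

    def read_feed(self, user, limit=20):
        """Merge the prebuilt list with posts pulled from followed celebrities."""
        merged = list(self.pushed[user])
        for author in self.following[user]:
            merged.extend(self.celebrity_posts.get(author, []))
        merged.sort(reverse=True)  # newest timestamp first
        return [post_id for _, post_id in merged[:limit]]


store = HybridFeedStore()
for fan in ["you", "fan_1", "fan_2"]:
    store.follow(fan, "star")
store.follow("you", "friend")
store.publish("friend", "friend-post", timestamp=1)  # pushed to followers
star_writes = store.publish("star", "star-post", timestamp=2)  # stored once
print(store.read_feed("you"))  # ['star-post', 'friend-post']
```

The cost moves rather than disappears: readers who follow many large accounts now pay a few extra lookups on every app open.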
Phase 2: Ranking the same graph (2016)
The app kept growing. People followed hundreds of accounts, and Instagram noticed a real problem. By 2016, people were missing 70% of all their posts in Feed, including almost half of posts from their close connections. Source: Instagram / Adam Mosseri
The posts you'd actually care about were getting buried — purely because they were posted at the wrong time. So in July 2016, Instagram made its first big change: it officially switched from displaying content chronologically to using an algorithm to sort posts. Source: Power Digital
The feed stopped sorting by time and started sorting by predicted interest. The post you're most likely to engage with goes on top.
Important: This change reordered the feed. It did not change where the posts came from. You were still only seeing accounts you follow.
And that's the real ceiling of this design. If there's no edge between you and an account, that account cannot reach you at all — no matter how good the post is. For a company competing directly with TikTok for your time, that limit is unacceptable.
Phase 3: The interest graph
So Instagram went deeper than ranking. It changed the sourcing — the candidate set itself.
- Old question: "What did the people I follow post?"
- New question: "Out of all the content on the platform, what should this person see right now?"
That second question lives on a different graph: the interest graph. The connection here isn't "you follow someone." The connection is "your interests and a piece of content match."
Now look at the scale. There are billions of posts. You cannot run a heavy model over billions of items every time someone opens the app — too slow, too expensive. Given real-world requirements and constraints, most large-scale recommender systems employ a multi-stage funnel approach, starting with thousands of candidates. Source: Engineering at Meta
The four-stage funnel
This is the single most transferable idea in the whole system. The multi-stage approach involves retrieval, first-stage ranking, second-stage ranking and final re-ranking. Source: Medium — Instagram Recommender Systems
Each stage takes a large set of candidates and passes a smaller, better set to the next stage. When you can't afford to run your best, heaviest check on everything, you run a cheap, rough filter first, then a slightly better one, and save your most expensive model for the tiny set that survives.
The same pattern shows up in search engines, ad systems, and fraud detection. Once you see it, you see it everywhere.
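As a rough sketch of the cascade, assuming each stage is just "score everything that survived, keep the top N": the scorers below are toy stand-ins for real models, and the call counters exist only to show how rarely the expensive one runs.

```python
from typing import Callable, Sequence

Scorer = Callable[[int], float]


def run_funnel(candidates: Sequence[int], stages: Sequence[tuple[Scorer, int]]) -> list[int]:
    """Run each (scorer, keep) stage on the survivors of the previous stage."""
    survivors = list(candidates)
    for scorer, keep in stages:
        survivors = sorted(survivors, key=scorer, reverse=True)[:keep]
    return survivors


calls = {"cheap": 0, "heavy": 0}


def cheap_filter(post_id: int) -> float:
    """Toy stand-in for a fast, rough relevance check."""
    calls["cheap"] += 1
    return -abs(post_id - 500)


def heavy_model(post_id: int) -> float:
    """Toy stand-in for a slow, accurate model."""
    calls["heavy"] += 1
    return -abs(post_id - 510)


top = run_funnel(range(1000), [(cheap_filter, 100), (heavy_model, 3)])
print(sorted(top), calls)  # [509, 510, 511] {'cheap': 1000, 'heavy': 100}
```

The heavy scorer ran on 100 items instead of 1,000. The risk is also visible here: if the cheap stage had dropped post 510, no later stage could bring it back.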
Stage 1 — Retrieval: from billions to thousands
The job: go from billions of posts down to a few thousand that are roughly relevant. The main tool is the two-tower network.
- Tower 1 (user tower) looks only at you — your interests, recent activity — and turns you into a vector, a user embedding.
- Tower 2 (content tower) looks only at a piece of content and turns it into a vector.
The model is trained so that when a user is likely to engage with a post, their two vectors come out close together.
Here's the clever bit. The content tower never looks at the user, so Instagram can run it ahead of time. Given that user and item networks are independent after training, the item tower can generate embeddings on a daily basis using an offline pipeline, and those embeddings can be put into a service that supports online approximate nearest neighbors search. Source: Modern Recommender Systems — Instagram Explore
So a design choice that looks like a limitation — keeping the two towers apart — is exactly what makes the system fast enough to exist. If you let the model mix user and content in one network, every score would have to be computed live for every post, every time. At this scale, that's simply not possible.
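To make the offline/online split concrete, here is a toy sketch in plain Python. The "towers" are single linear layers with hand-picked weights standing in for trained networks, and the feature names are invented; only the structure (items embedded ahead of time, one user pass at request time) reflects the description above.

```python
import math


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def linear_tower(features, weights):
    """Stand-in for a trained tower: one linear layer, then L2 normalization."""
    return normalize([sum(w * f for w, f in zip(row, features)) for row in weights])


# Hypothetical "trained" weights. Each tower sees only its own side's features.
USER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]  # user features: [likes_comedy, likes_cooking]
ITEM_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]  # item features: [is_comedy, is_cooking]

# Offline: embed every item once and store the vectors.
ITEM_FEATURES = {
    "satire_reel": [1.0, 0.1],
    "pasta_recipe": [0.0, 1.0],
    "standup_clip": [0.9, 0.0],
}
ITEM_INDEX = {
    item_id: linear_tower(features, ITEM_WEIGHTS)
    for item_id, features in ITEM_FEATURES.items()
}


def retrieve(user_features, k=2):
    """Online: one user-tower pass, then dot products against stored item vectors."""
    user_vec = linear_tower(user_features, USER_WEIGHTS)
    scored = [
        (sum(u * i for u, i in zip(user_vec, item_vec)), item_id)
        for item_id, item_vec in ITEM_INDEX.items()
    ]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


print(retrieve([1.0, 0.0]))  # ['standup_clip', 'satire_reel']
```

This sketch still scans every stored vector, which is exact but linear in the catalog size. At real scale that scan is the part an approximate nearest neighbor index replaces.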
💡 Tip: Notice the word approximate in ANN. It enables fast candidate generation via approximate nearest neighbor search — critical for real-time personalization. The system doesn't promise the perfect closest matches — it promises very close matches very fast. Trading a sliver of accuracy for orders of magnitude of speed is a recurring move in big systems. Source: Shaped — Two-Tower Deep Dive
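One simple way to see the accuracy-for-speed trade is random-hyperplane hashing, sketched below in plain Python. This is an illustrative technique chosen for brevity, not a claim about what Instagram runs; production systems typically use dedicated ANN libraries. All names here (`LshIndex`, `bucket_key`, the plane count) are hypothetical.

```python
import math
import random


def random_unit_vector(rng, dim):
    vec = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def bucket_key(vec, planes):
    """Which side of each random hyperplane the vector falls on."""
    return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes)


class LshIndex:
    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = {}  # sign pattern -> item ids
        self.vectors = {}  # item id -> vector

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        self.buckets.setdefault(bucket_key(vec, self.planes), []).append(item_id)

    def query(self, vec, k=5):
        """Score only the query's own bucket. Close items in other buckets are missed."""
        candidates = self.buckets.get(bucket_key(vec, self.planes), [])
        ranked = sorted(
            candidates,
            key=lambda item_id: sum(a * b for a, b in zip(vec, self.vectors[item_id])),
            reverse=True,
        )
        return ranked[:k], len(candidates)


rng = random.Random(1)
index = LshIndex(dim=8, n_planes=4)
for n in range(1000):
    index.add(f"item_{n}", random_unit_vector(rng, 8))

results, scanned = index.query(index.vectors["item_42"], k=3)
print(results[0], scanned, "of", len(index.vectors), "vectors scored")
```

Vectors pointing in similar directions tend to share a sign pattern, so the query's bucket holds most of its close neighbors while only a fraction of the catalog is scored. Adding planes shrinks the buckets (faster, more misses); fewer planes does the opposite.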
Stages 2 & 3 — Ranking: from thousands to a screen
Retrieval handed us a few thousand candidates. Ranking decides which ~10–15 actually make it onto your screen — and it also happens in stages.
First, a lightweight model trims thousands of candidates down to a few hundred. The first-stage ranker predicts the output of the second stage, allowing knowledge distillation from a large model to a more lightweight one. The cheap model learns to imitate the expensive one: that's knowledge distillation. Source: Medium — Instagram Recommender Systems
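A toy version of that idea, assuming the heavy ranker is a fixed formula and the light ranker is a three-parameter linear model: the student is trained on the teacher's scores rather than on real engagement labels. The formula, feature names, and hyperparameters are all hypothetical.

```python
import random


def teacher_score(features):
    """Stand-in for the heavy second-stage ranker (made-up formula)."""
    recency, affinity = features
    return 0.7 * affinity + 0.3 * recency + 0.2 * affinity * recency


def student_score(weights, features):
    """Lightweight first-stage ranker: a single linear layer with a bias."""
    return weights[0] * features[0] + weights[1] * features[1] + weights[2]


def mse(weights, samples):
    errors = [(student_score(weights, x) - teacher_score(x)) ** 2 for x in samples]
    return sum(errors) / len(errors)


def distill(samples, epochs=200, learning_rate=0.1):
    """Fit the student to the teacher's outputs with plain SGD."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x in samples:
            error = student_score(weights, x) - teacher_score(x)
            weights[0] -= learning_rate * error * x[0]
            weights[1] -= learning_rate * error * x[1]
            weights[2] -= learning_rate * error
    return weights


rng = random.Random(0)
samples = [(rng.random(), rng.random()) for _ in range(200)]
error_before = mse([0.0, 0.0, 0.0], samples)
student = distill(samples)
error_after = mse(student, samples)
print(f"MSE vs teacher: {error_before:.4f} -> {error_after:.4f}")
```

The student cannot represent the teacher's interaction term, so a small error remains. That gap is the price of a model cheap enough to run on thousands of candidates.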
Then comes the heavy ranking model. The second-stage ranker employs a multi-task multi-label (MTML) neural network capable of handling powerful user-item interaction features. It doesn't just predict whether you'll like a post — it predicts whether you'll save it, share it, or even tap "show fewer posts like this."
All those predictions are combined into a single score — the value model:
value = w_like * P(like)
      + w_save * P(save)
      + w_share * P(share)
      - w_neg * P(negative_feedback)
      + ...
Breaking it down:
- A like adds value. A save adds more. A share/send adds even more.
- Negative feedback subtracts. The model is optimizing a weighted blend, not a single metric.
- Sends per reach are weighted several times higher than likes, because a DM share represents the highest user intent — someone valuing content enough to personally recommend it. Source: funnl.ai
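The formula translates directly into code. The weights below are hypothetical, chosen only to respect the ordering described above (share above save above like, with negative feedback subtracting); the real weights are not public in the sources cited here.

```python
# Hypothetical weights: only their relative order follows the text above.
POSITIVE_WEIGHTS = {"like": 1.0, "save": 2.0, "share": 4.0}
NEGATIVE_WEIGHT = 6.0


def value_score(predictions):
    """Blend per-action probabilities into a single ranking score."""
    positive = sum(
        weight * predictions.get(action, 0.0)
        for action, weight in POSITIVE_WEIGHTS.items()
    )
    return positive - NEGATIVE_WEIGHT * predictions.get("negative_feedback", 0.0)


# Two hypothetical posts: one gets liked a lot, one gets sent a lot.
likeable = {"like": 0.30, "save": 0.02, "share": 0.01, "negative_feedback": 0.01}
shareable = {"like": 0.10, "save": 0.05, "share": 0.08, "negative_feedback": 0.01}

print(round(value_score(likeable), 2))   # 0.32
print(round(value_score(shareable), 2))  # 0.46
```

The post with a third of the like probability still ranks higher, because its predicted shares carry more weight. Changing the weights changes what the feed optimizes for without retraining the model that produces the probabilities.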
Stage 4 — Re-ranking: cleanup and diversity
Finally, re-ranking cleans everything up. It removes content you've already seen and injects diversity so your feed doesn't show ten near-identical posts in a row. This is also where freshness, integrity rules, and "don't show the same creator five times" constraints get applied.
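A minimal sketch of that final pass, assuming the ranker hands over `(post_id, creator)` pairs in best-first order. The function name and the per-creator cap are illustrative; real re-rankers apply many more rules.

```python
def rerank(ranked_posts, seen_ids, max_per_creator=2, limit=10):
    """Drop already-seen posts and cap how often one creator appears.

    ranked_posts is a list of (post_id, creator) pairs, best first.
    """
    per_creator = {}
    result = []
    for post_id, creator in ranked_posts:
        if post_id in seen_ids:
            continue  # already shown to this user
        if per_creator.get(creator, 0) >= max_per_creator:
            continue  # diversity rule: this creator has enough slots
        per_creator[creator] = per_creator.get(creator, 0) + 1
        result.append(post_id)
        if len(result) == limit:
            break
    return result


ranked = [("p1", "ana"), ("p2", "ana"), ("p3", "ana"), ("p4", "ben"), ("p5", "cy")]
print(rerank(ranked, seen_ids={"p2"}))                     # ['p1', 'p3', 'p4', 'p5']
print(rerank(ranked, seen_ids=set(), max_per_creator=1))   # ['p1', 'p4', 'p5']
```

Because the rules run after scoring, relevance order is preserved among the posts that survive.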
Push vs Pull, again — connected vs unconnected reach
Remember fan-out on write? You post, the system pushes your post into ready-made lists. That's a push model. But you can't push a feed of strangers — the system has no way to know in advance which stranger's post is right for you. It only knows that when you actually open the app.
So the new feed is a pull model. The moment you open Instagram, the whole funnel runs and your feed is assembled right then, for that moment. It's the same write-time vs read-time choice from Phase 1 — just flipped.
| Dimension | Push (fan-out on write) | Pull (recommendation funnel) |
|---|---|---|
| Work done at | Write time | Read time (every app open) |
| Read cost | Cheap (prebuilt) | Expensive (full funnel runs) |
| Handles strangers? | ❌ No — needs a follow edge | ✅ Yes — built on the interest graph |
| Scales to | Your followers | The entire platform |
| Best for | Posts from people you follow | Discovery / recommended content |
Instagram even has names for the two kinds of content. Reach breaks down into connected reach — followers and people who've interacted with your account before — and unconnected reach — everyone discovering you through Explore, Reels recommendations, or hashtags. Source: Socialinsider
And the signals differ by type. Connected reach is driven primarily by likes from your existing audience, while unconnected reach is driven more by shares and sends — content people forward to others is what unlocks new audiences. For many users today, the unconnected part is the larger slice of the feed.
The cold start problem
So what happens to a 1-hour-old post from an account nobody has heard of? It has no history at all. This is the cold start problem, and every recommendation system has to deal with it.
Instagram handles it from two sides.
1. Understand the content on its own. Even with zero engagement, the system reads what's inside the post — the audio, the images, on-screen text, the topic — and builds an embedding directly from it. So a brand-new post can be placed into the interest graph and matched to people without waiting for likes. This is why the content tower being user-independent matters so much for new posts.
2. Run an audition. Instagram uses something like an "audition system": a post gets an initial round of exposure to a small group of non-followers, and if it performs well, it's shown to a wider audience of non-followers — and that process can repeat multiple times, which is why small accounts can suddenly go viral. Source: MeetEdgar
The number it watches isn't total likes — it's the rate: sends and likes per view.
Each round feeds the next. A good rate earns a bigger audience; a bigger audience produces more data; if the rate holds, the audience grows again. This is why a post can jump from a few hundred people to a few million in a short time.
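The round-by-round loop can be simulated in a few lines. This is a sketch under loud assumptions: the audience sizes, growth factor, and `min_rate` threshold are invented, viewers are simulated with a random number generator, and only sends per view are tracked.

```python
import random


def run_audition(true_send_rate, rounds=4, first_audience=500,
                 growth=10, min_rate=0.02, seed=0):
    """Show a post to growing audiences while its send rate stays above min_rate.

    Returns a list of (audience_size, observed_rate), one entry per round run.
    """
    rng = random.Random(seed)
    audience = first_audience
    history = []
    for _ in range(rounds):
        # Each simulated viewer sends the post with probability true_send_rate.
        sends = sum(rng.random() < true_send_rate for _ in range(audience))
        rate = sends / audience
        history.append((audience, rate))
        if rate < min_rate:
            break  # the rate did not hold, so distribution stops growing
        audience *= growth
    return history


strong_post = run_audition(true_send_rate=0.10)
weak_post = run_audition(true_send_rate=0.0)
print([audience for audience, _ in strong_post])  # [500, 5000, 50000, 500000]
print([audience for audience, _ in weak_post])    # [500]
```

The decision at each round uses the rate, so a post from an account with no followers is judged on the same footing as one from an established account: 50 sends out of 500 views passes exactly as 50,000 out of 500,000 would.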
Putting it together: how a 4-day-old account hit 10 million
Now we can answer the question we started with directly. An account with no followers reached ~10 million people because of three things working together:
- The interest graph rebuild. An account no longer needs followers to be eligible to reach you. The follow edge isn't required anymore — interest match is enough.
- Content understanding + the non-follower audition. A brand-new account with no history could still be classified from its content and handed a first small audience to test.
- The rate-based feedback loop. Content that people were sending to each other at a high rate kept earning larger and larger audiences, round after round.
The post grabbing attention is one part of the story. But the reason that attention could reach 10 million strangers in days is the system — and the same system is choosing what's in your feed right now.
What backend engineers can actually steal from this
You're probably not building Instagram. But these patterns transfer directly:
🚀 The funnel pattern. When you can't run your best check on everything, cascade: cheap filter → medium filter → expensive model on the survivors. Search, ads, fraud detection, spam filtering, even RAG retrieval pipelines all use it.
⚡ Precompute what doesn't depend on the request. The content tower runs offline because it never looks at the user. Find the parts of your hot path that don't depend on live input and push them to a background job. The decoupling is the optimization.
🧠 Approximate beats exact when exact is too slow. ANN gives up perfect results for massive speed. Most systems have a place where "very close, very fast" is strictly better than "perfect, eventually."
🔒 Push vs pull is a deliberate trade. Push gives cheap reads but can't handle unbounded fan-out or strangers. Pull handles everything but pays at read time. Hybrid approaches (normal vs celebrity accounts) are usually the real-world answer.
🧠 Optimize a value function, not a single metric. The ranking model blends like, save, share, and negative feedback into one score with tunable weights. If you're ranking anything — search results, notifications, recommendations — think in terms of a weighted value function, not one proxy metric.
Production checklist
If you're designing a feed or recommendation surface, these are the load-bearing decisions:
- Decide push vs pull per content class. Hybrid fan-out (write for normal, read for high-fan-out) avoids the celebrity problem.
- Stage your ranking. Cheap recall stage → progressively heavier precision stages. Never run the heavy model on the full candidate pool.
- Decouple anything request-independent. Precompute item embeddings offline; compute only the user-side vector live.
- Use ANN for retrieval at scale. A library such as FAISS, a graph index such as HNSW, or a managed vector index all work; accept approximate recall in exchange for latency.
- Distill heavy models into light ones for the early ranking stage so it stays cheap but mimics the final ranker.
- Rank on a value model, not one metric — and weight high-intent signals (shares/saves) above low-intent ones (likes).
- Plan for cold start explicitly. Derive features from content itself, then run a small audition and watch the rate, not the raw count.
- Re-rank for diversity and integrity as a final pass, separate from relevance scoring.
Conclusion
I keep coming back to this system because it's a near-perfect tour of large-scale design trade-offs in one product. Every "limitation" turns out to be a deliberate choice: keeping the two towers apart is what makes retrieval fast; accepting approximate nearest neighbors is what makes it real-time; doing work at read time is the price of being able to surface strangers at all.
If you're building anything that ranks content for users, start where Instagram started — with the cheap, decoupled, precomputed version — and only add the heavy machinery where the data tells you it pays off. The funnel isn't just Instagram's architecture. It's the default shape of any system that has to find a needle in a billion-item haystack, every time someone opens the app.
FAQ
How can a brand new account with zero followers reach millions of people on Instagram?
Because the modern feed is built on an interest graph, not a follow graph. A post no longer needs a follow-edge to reach you — Instagram understands the content directly, shows it to a small test audience of non-followers, and grows the audience round by round if the share rate stays high.
What is fan-out on write?
It's a push strategy where a post is copied into each follower's precomputed feed list at the moment it's created, so reading the feed later is cheap. Reads happen far more often than writes, so you pay the cost once at write time.
What is the celebrity problem in system design?
When an account has tens of millions of followers, fan-out on write means a single post triggers tens of millions of writes. Large platforms solve it with a hybrid: normal accounts fan out on write, very large accounts are fanned out on read (their posts are fetched only when a follower opens the app).
What is a two-tower neural network?
A retrieval model with two independent encoders: one turns the user into a vector, the other turns each piece of content into a vector. They're trained so likely-to-engage user/content pairs land close together. Because the towers never mix, content vectors can be precomputed offline and looked up fast.
What is the difference between connected reach and unconnected reach?
Connected reach is distribution to your followers and to people who have interacted with your account before (driven by likes from your existing audience). Unconnected reach is distribution to everyone else through Explore, Reels recommendations, or hashtags (driven mostly by shares and sends). For many users today, unconnected reach is the larger part of the feed.
What is the cold start problem in recommendation systems?
It's the challenge of ranking content or accounts that have no engagement history yet. Instagram handles it by understanding the content itself (audio, image, text, topic) to place it in the interest graph, then running a small audition with non-followers and watching the engagement rate, not the raw count.