April 27, 2026
· 10 min read

Why SpaceX Is Paying $60B for Cursor — And How Cursor Actually Works Under the Hood
On April 21, 2026, SpaceX announced an option to acquire Cursor for $60 billion later this year. This post unpacks the actual architecture that makes Cursor feel like a senior engineer — local indexing, Merkle-tree change detection, Tree-sitter AST chunking, Turbopuffer vector search, and the Composer agent model — then breaks down the three strategic reasons this deal is really about who owns the AI stack.

TL;DR
- On April 21, 2026, SpaceX announced a deal giving it the right to acquire Cursor for $60 billion later this year — or pay $10 billion for the partnership instead. Source: Bloomberg
- Cursor's edge isn't the model — it's the retrieval pipeline: local Merkle-tree indexing, Tree-sitter AST chunking, Turbopuffer vector search, and a tight tool-use loop powered by their proprietary Composer model.
- The deal isn't really about the editor. It's about owning the interface layer of the AI stack — the place developers actually live every day — ahead of what could be the largest IPO in history.
Why this deal matters
The numbers are absurd on their own. Cursor was valued at $2.5B in January 2025, climbed to $9B by May, hit a $29.3B post-money valuation on a $2.3B Series D in November 2025, and was reportedly closing a fresh $2B round at over $50B before SpaceX preempted it with this offer. Source: TechCrunch
Revenue tells the same story: $100M ARR in January 2025 → $500M by June → $1B by November → $2B by February 2026, with Anysphere projecting more than $6B ARR by year-end. Source: Bloomberg. As of April 2026, roughly 70% of Fortune 1000 companies — Nvidia, Uber, Adobe, Salesforce, PwC — use Cursor across engineering teams. Source: Sacra
But here's the puzzle that makes the price tag interesting. Cursor uses the same frontier models you can use directly — Claude, GPT, Gemini. So why does Cursor feel completely different from pasting your code into a chat window?
That one question is the entire game. And the answer is what SpaceX is paying $60 billion for.
How Cursor actually works
Think about how you debug a real codebase at work. 50,000 files, maybe more. You don't sit and read the whole repo — you jump straight to the three files that matter. Knowing where to look is the whole job.
LLMs can't do that. They've never seen your code, and you can't dump a real production repo into a context window — even the biggest models choke. So the question becomes: how do you pick the right files to show the model, every single time?
Cursor's pipeline answers that in five stages.
Let's walk through it.
Stage 1 — Local scan and Merkle tree
The moment you open a project, Cursor scans the folder locally on your machine. Files matching .gitignore and .cursorignore are filtered out. Then comes the clever part: Cursor computes a Merkle tree of cryptographic hashes for every file in the repo. Source: Cursor Security
A Merkle tree is a hierarchy of fingerprints. Each file gets its own hash. Every folder's hash is derived from the hashes of its children, all the way up to a single root hash that represents the entire workspace.
```
            root_hash
           /         \
  dir_hash_A         dir_hash_B
   /      \           /      \
file1    file2     file3    file4
```

Why bother? Because the next time you change one file, only the hashes on the path from that file to the root change. Cursor sends the root hash to the server in a startup handshake, the server compares fingerprints, and only the chunks that actually moved get re-processed. Source: Pragmatic Engineer
That's why Cursor stays fast on huge codebases. The sync engine runs every ~3–5 minutes, but it almost never has real work to do.
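The incremental-sync idea is easy to see in miniature. Here's a toy sketch of hashing a nested workspace bottom-up (illustrative only, not Cursor's code; the file names and contents are made up):

```python
import hashlib

def tree_hash(node) -> str:
    """Hash a file (bytes) or a directory (dict of name -> child node)."""
    if isinstance(node, bytes):
        return hashlib.sha256(node).hexdigest()
    # a directory's fingerprint is derived from its children's names and hashes
    combined = "".join(name + tree_hash(child) for name, child in sorted(node.items()))
    return hashlib.sha256(combined.encode()).hexdigest()

# toy workspace mirroring the diagram: two directories, four files
repo = {
    "dir_a": {"file1": b"def login(): ...", "file2": b"def logout(): ..."},
    "dir_b": {"file3": b"class User: ...", "file4": b"VERSION = '1.0'"},
}
root_before = tree_hash(repo)
b_before = tree_hash(repo["dir_b"])

# change one file: only the hashes on the path file1 -> dir_a -> root change
repo["dir_a"]["file1"] = b"def login(user): ..."
root_after = tree_hash(repo)
b_after = tree_hash(repo["dir_b"])
```

Comparing `root_before` to `root_after` is enough to know that something changed; comparing child hashes recursively pinpoints which subtree. An untouched directory like `dir_b` keeps the same fingerprint, so nothing under it ever needs re-indexing.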
Stage 2 — Tree-sitter chunking
Now Cursor needs to break files into searchable units. Naive text splitting tears apart a function definition mid-body — useless for retrieval.
So Cursor uses Tree-sitter, a parser that builds an AST and understands code structure. Files get split into real units: a function, a class, a logical block. Chunks stay whole. Sibling nodes get merged into larger chunks as long as they fit under the token limit. Source: Engineer's Codex
💡 Why this matters: A function call without its definition is half a snippet. AST-aware chunking preserves the unit of meaning that humans actually reason about.
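Tree-sitter itself is a C library with bindings for many languages. To keep a sketch self-contained, here's the same idea using Python's built-in `ast` module as a stand-in parser; the word-count "token" budget and the sibling-merge policy are illustrative assumptions, not Cursor's actual values:

```python
import ast

def chunk_source(source: str, max_tokens: int = 200) -> list[str]:
    """Split source into whole top-level definitions, merging small siblings."""
    tree = ast.parse(source)
    lines = source.splitlines()
    # the AST gives an exact line span per definition, so a function or
    # class is never torn apart mid-body
    units = ["\n".join(lines[node.lineno - 1:node.end_lineno]) for node in tree.body]
    # merge adjacent siblings while they fit under the token budget
    chunks, current = [], ""
    for unit in units:
        candidate = (current + "\n\n" + unit).strip()
        if len(candidate.split()) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = unit
    if current:
        chunks.append(current)
    return chunks

code = "def login(user):\n    return check(user)\n\nclass Session:\n    ttl = 3600\n"
chunks = chunk_source(code)
```

Both definitions fit the budget here, so they merge into a single chunk; a naive line-based splitter with a small window would happily cut `login` off from its body.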
Stage 3 — Embeddings into Turbopuffer
Now we have clean chunks. The next step is to make them searchable by meaning, not by text.
Text search is too literal. If you grep for login, you'll miss the file called authenticate.ts — even though it's literally the login code. So Cursor converts each chunk into a vector: a list of numbers that captures semantic meaning. Authentication code lands in one neighborhood of that number space; payment code lands somewhere completely different, even if the words never overlap.
Embeddings are computed using OpenAI's embedding API or Cursor's own models, then stored in Turbopuffer — a serverless vector + full-text search engine on Google Cloud, optimized for fast nearest-neighbor search across millions of code chunks. Source: Towards Data Science
One critical detail: your raw code never leaves your machine for storage. Only embeddings and obfuscated metadata live in Turbopuffer — file paths are split on `/` and `.` and each segment is encrypted with a client-side key. Source code is held only transiently to compute embeddings and then deleted. Source: Cursor Security
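A toy version of semantic retrieval makes the "search by meaning" point concrete. The bag-of-words "embedding" below is a deliberate simplification, since real systems use learned dense vectors from a neural model, but the nearest-neighbor lookup works the same way (the file names and vocabulary are hypothetical):

```python
import math
from collections import Counter

VOCAB = ["login", "auth", "user", "token", "payment", "charge", "invoice", "card"]

def embed(text: str) -> list[float]:
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# index: one vector per chunk, keyed by (obfuscated) path
chunks = {
    "authenticate.ts": "auth user token login",
    "billing.ts": "payment charge invoice card",
}
index = {path: embed(text) for path, text in chunks.items()}

# query lands in the same vector space; nearest neighbor wins
query = embed("login user")
best = max(index, key=lambda path: cosine(query, index[path]))
```

The query shares no exact string with `billing.ts` and two terms with `authenticate.ts`, so the auth file wins; with learned embeddings the match survives even when no words overlap at all, which is exactly the grep-misses-`authenticate.ts` problem from above.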
Stage 4 — Retrieval at query time
You type "refactor the login flow to support Google OAuth."
Cursor embeds your question into the same vector space as the code chunks. It sends that vector to Turbopuffer, which returns the top semantic matches — as obfuscated paths and line ranges. The client decrypts the paths, reads the actual code from your local machine, and now has a candidate set.
But it doesn't stop there. This is the part most retrieval explainers miss. Cursor then follows the code graph. If your AuthController is a top match, Cursor pulls in what it imports, what calls it, what it calls. It spreads outward through the dependency web until it has the full slice an engineer would actually look at.
Then it builds a structured prompt — your question on top, the relevant chunks below, project rules and conventions interleaved — and sends a clean, focused brief to the model. The model isn't reading your repo. It's reading a three-page document instead of your entire Confluence wiki.
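Dependency expansion is essentially a bounded breadth-first walk over the code graph. A minimal sketch, with a hypothetical hand-written import graph standing in for edges that would really be parsed out of import statements (real expansion also walks callers, not just callees):

```python
from collections import deque

# hypothetical import graph: file -> files it imports
IMPORTS = {
    "AuthController.ts": ["authService.ts", "userModel.ts"],
    "authService.ts": ["tokenUtils.ts"],
    "userModel.ts": [],
    "tokenUtils.ts": [],
    "billing.ts": ["userModel.ts"],
}

def expand(seeds: list[str], depth: int = 2) -> set[str]:
    """Spread outward from the top vector matches through the import graph."""
    seen = set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue  # stop at the hop limit so the context stays bounded
        for dep in IMPORTS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, d + 1))
    return seen

# AuthController was the top semantic match; pull in its two-hop slice
context = expand(["AuthController.ts"])
```

The depth cap is the important design choice: without it the walk would pull in half the repo, and the whole point of the pipeline is to hand the model a bounded, relevant slice.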
Stage 5 — The execution loop and Composer
Most AI tools stop here: "Here's the code, copy-paste it, fix it yourself."
Cursor doesn't. It generates a diff, shows you exactly what changes, you click apply, and the edit goes in across however many files. If something breaks, Cursor reads the error and tries again.
With Cursor 2.0 (October 2025), Anysphere went further and shipped their own model: Composer. It's a Mixture-of-Experts LLM trained with reinforcement learning inside the real Cursor environment — not just to generate code, but to use tools: search the codebase, read files, edit, run terminal commands, recover from errors. Source: Codecademy
A few months later, Cursor published the Composer 2 technical report, detailing a two-phase training process: continued pretraining on Kimi K2.5 to deepen coding knowledge, followed by large-scale RL to improve end-to-end agent performance — finding that "reducing pretraining loss improves downstream RL performance, with better base knowledge reliably translating into a better agent." Source: Cursor Research
The reported numbers: roughly 4× faster than comparably intelligent models, with most agentic coding turns completing in under 30 seconds. Source: CometAPI
⚠️ The key insight: Composer wasn't trained in isolation. It was trained inside the exact same tool harness developers use in production — same search, same edit primitives, same sandboxes. Co-design beats raw model size for this specific job.
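The generate-apply-verify-retry shape of the loop can be written in a few lines. This is the pattern, not Cursor's implementation; the toy harness below fakes a model that fixes its own error on the second attempt:

```python
def agent_loop(propose, apply, check, max_turns: int = 3) -> int:
    """Minimal generate -> apply -> verify -> retry loop.

    propose(error) returns a patch (the model sees the last error, if any),
    apply(patch) lands the edit, check() returns (ok, error).
    """
    error = None
    for turn in range(max_turns):
        patch = propose(error)
        apply(patch)
        ok, error = check()  # run tests / compiler / linter
        if ok:
            return turn + 1  # number of turns it took
    raise RuntimeError(f"gave up after {max_turns} turns: {error}")

# toy harness: the first proposal is buggy, the retry fixes it
state = {"code": ""}

def propose(error):
    return "fixed" if error else "buggy"

def apply(patch):
    state["code"] = patch

def check():
    return (True, None) if state["code"] == "fixed" else (False, "SyntaxError")

turns = agent_loop(propose, apply, check)
```

Feeding the error back into `propose` is what separates an agent from autocomplete: the model's second attempt is conditioned on what actually broke, which is the behavior Composer's RL training optimizes end to end.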
Why $60B? Three strategic reasons
Now the architecture makes the price tag legible. But the deal is about more than one product.
Reason 1: xAI is losing the coding war
Look at the AI coding leaderboard today: OpenAI ships Codex, Anthropic owns the agentic-coding crown with Claude Code (a $2.5B run rate and 300,000+ business customers), and xAI ships Grok — which honestly nobody is reaching for to write production code. Source: Fortune
That's a serious problem if your goal is to be a top-tier AI company. Buying Cursor solves it overnight. The product is already built, the Fortune 500 distribution is already there, and as TechCrunch noted, "SpaceX currently lacks a meaningful AI workforce and is widely seen as not having a significant AI business." Source: TechCrunch
Reason 2: Owning the interface layer
The AI stack has three layers.
| Layer | Examples | Defensibility |
|---|---|---|
| Infrastructure | GPUs, data centers, Colossus (~1M H100-equivalent chips) | Capital-heavy, durable |
| Models | Claude, GPT, Gemini, Grok | Commoditizing fast — 5 frontier labs becoming 10 |
| Interface | Cursor, Copilot, Windsurf, Claude Code | Workflow lock-in, distribution, training data flywheel |
Models are slowly becoming substitutable. Five frontier models today, ten next year, all competing on price-per-token. The interface is different — it's where the developer lives every day, and where the choice gets made about which model handles which task. The interface is the remote control for the entire AI stack.
Whoever owns that layer owns:
- Distribution to expert software engineers (Cursor's exact pitch in the SpaceX announcement). Source: Yahoo Finance
- The workflow data — what real engineers actually do all day, which is the training fuel for the next generation of agent models.
- An awkward dependency for rivals — Cursor still resells Claude and GPT today, "an awkward arrangement that this new SpaceX partnership may be designed to eventually escape." Source: TechCrunch
Reason 3: Reframing the IPO narrative
This is where the math gets interesting. SpaceX is targeting an IPO at $1.75T–$1.8T valuation in June 2026 — potentially the largest in history. Source: Yahoo Finance
A space company gets a space-company multiple. But a full-stack AI platform — running on the world's largest GPU cluster, with the fastest-growing developer tool ever built sitting on top — gets a completely different multiple.
The acquisition structure also reveals careful planning: SpaceX is delaying the actual purchase until after the IPO to avoid revising its confidential filings, and it'll be easier to finance a $60B deal in publicly traded stock anyway. Source: TechCrunch
The narrative shift: from a rocket company that also does AI, to an AI platform that also launches rockets.
What this means for the rest of the stack
A few second-order effects worth tracking:
| Player | Pre-deal position | Likely post-deal |
|---|---|---|
| Anthropic / Claude Code | Dominant agentic-coding model | Loses biggest external customer (Cursor) over time |
| OpenAI / Codex | Was an early Cursor investor | Awkward — competitor now controls the editor |
| GitHub Copilot | Microsoft-backed, model-agnostic | Most insulated — already vertically integrated |
| Windsurf, Cline, others | Distant second-tier interfaces | Suddenly the "neutral" option for non-SpaceX customers |
Note also the timing: the announcement landed less than a week before the Musk v. Altman trial — a Musk lawsuit against OpenAI CEO Sam Altman, whose company was an early investor in Cursor. Source: CNBC. Make of that what you will.
Conclusion
I've been watching the AI coding space closely, and what makes Cursor genuinely interesting isn't a single trick — it's the combination. Tree-sitter gives you syntactically meaningful chunks. Merkle trees give you incremental sync. Turbopuffer gives you fast semantic retrieval with privacy properties baked in. Composer closes the loop with a model trained in the same harness it ships in.
Each piece is well-known on its own. The moat is the integration — and the data flywheel that comes from millions of developers running thousands of agentic turns a day.
For practitioners, the takeaway is simpler: the model is rarely the bottleneck anymore. Retrieval and tool-use are. Whether you're building an internal Claude wrapper, a custom code search, or just trying to ship faster with the AI tools you have, the shape of Cursor's pipeline — local index, AST chunking, semantic + structural retrieval, tight execution loop — is the pattern to copy.
And if SpaceX exercises the option in June, that pattern just became the default at the biggest IPO in history.
FAQ
Did SpaceX actually buy Cursor for $60 billion?
Not yet. SpaceX has the option to acquire Cursor for $60 billion later this year, or pay $10 billion for a working partnership instead. The acquisition is reportedly being delayed until after SpaceX's IPO this summer to avoid revising its confidential filings.
Why does Cursor feel so much better than using ChatGPT or Claude directly when it's the same model?
Because the model is only as good as the context it gets. Cursor's value is the retrieval pipeline — local Merkle-tree indexing, Tree-sitter AST chunking, and semantic search via Turbopuffer — which sends the model a focused slice of your codebase instead of the whole repo.
Is my source code stored on Cursor's servers?
No. Per Cursor's security docs, only embeddings and obfuscated metadata (encrypted file paths, line ranges) are stored in Turbopuffer. Raw code is processed transiently to compute embeddings and is not persisted. Plain-text code is fetched from your local machine only at inference time.
What is Composer and how is it different from Claude or GPT?
Composer is Cursor's proprietary agent model — a Mixture-of-Experts LLM trained with reinforcement learning inside the actual Cursor tool harness. Composer 2 was continued-pretrained on the Kimi K2.5 base, then RL-trained to use tools (search, edit, run) like a real engineer. Cursor reports it runs ~4× faster than comparably intelligent models, finishing most turns in under 30 seconds.
Why would a rocket company want a code editor?
Three reasons: (1) xAI is far behind Anthropic's Claude Code and OpenAI's Codex in agentic coding, and Cursor closes that gap instantly. (2) The interface layer — where developers actually live — is the part of the AI stack that becomes a real moat as frontier models commoditize. (3) Bundling Cursor into the SpaceX IPO story reframes a rocket company as a full AI platform.
What's a Merkle tree and why does Cursor use one?
A Merkle tree is a hierarchy of cryptographic hashes where every file gets a fingerprint, and every directory's fingerprint is derived from its children. When one file changes, only the hashes on its path to the root change — so Cursor can detect exactly what moved and re-index only that, instead of reprocessing your entire repo every sync.