June 16, 2025
Scaling AI Agents at LinkedIn: From Framework to Production
How LinkedIn built a comprehensive agent platform serving 20+ teams and 30+ production services
Introduction
As AI agents move from experimental proofs of concept to production-scale systems, organizations face a critical question: how do you scale not just the technical infrastructure, but the adoption and development of agents across an entire organization?
LinkedIn recently shared their journey of building agents at scale, culminating in their first production agent - the LinkedIn Hiring Assistant. This isn't just a story about building one successful agent, but about creating an entire ecosystem that has enabled over 20 teams to deploy 30+ production AI services.
The Two Dimensions of Scale
When we think about scaling AI agents, LinkedIn identified two critical dimensions:
Performance Scale: The traditional engineering concern of handling massive data volumes and maintaining system performance under load.
Adoption Scale: The more subtle but equally important challenge of scaling agent development across an organization, ensuring teams can innovate quickly and generate the best ideas.
While most discussions focus on the first dimension, LinkedIn's success came from solving the second.
LinkedIn Hiring Assistant: The Production Blueprint
LinkedIn's first production agent, the Hiring Assistant, demonstrates what thoughtful agent design looks like in practice. The agent follows an "ambient agent" pattern:
- Initial Input: Recruiters describe the role they want to fill and attach relevant documents
- Automatic Processing: The agent generates qualifications based on input and supplementary documents
- Background Execution: The system works asynchronously, notifying the recruiter of progress
- Results Delivery: Candidates are sourced and presented in a detailed review interface
Under the hood, this follows a supervisor multi-agent architecture where a main coordinator manages specialized sub-agents, each capable of interacting with LinkedIn's existing services through tool calling (or as LinkedIn calls them, "skills").
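The supervisor pattern above can be sketched in a few lines of plain Python. This is an illustrative toy, not LinkedIn's implementation: the sub-agent names, the fixed two-step plan, and the dictionary-based state are all hypothetical stand-ins for what would be LLM-driven routing in a real system.

```python
def qualifications_agent(task: dict) -> dict:
    """Sub-agent: derive qualifications from the role description."""
    return {"qualifications": [f"experience with {task['role']}"]}

def sourcing_agent(task: dict) -> dict:
    """Sub-agent: source candidates matching the qualifications."""
    return {"candidates": [f"candidate matching {q}" for q in task["qualifications"]]}

SUB_AGENTS = {"qualify": qualifications_agent, "source": sourcing_agent}

def supervisor(role: str) -> dict:
    """Coordinator: run sub-agents in order, threading shared state between them."""
    state = {"role": role}
    for step in ("qualify", "source"):          # fixed plan here; a real supervisor
        state.update(SUB_AGENTS[step](state))   # would choose the next step dynamically
    return state

result = supervisor("staff ML engineer")
```

The key property is that the coordinator owns the state and each specialist only sees and extends it, which is also how supervisor graphs are typically modeled in LangGraph.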
The Great Python Migration
Perhaps the most surprising aspect of LinkedIn's agent strategy was their bold decision to abandon Java for AI applications and standardize on Python - despite Java being their primary language for business logic.
Why the Switch Was Necessary
Initially, LinkedIn tried to build GenAI applications in Java, their traditional stack. This worked for simple prompt-based applications but quickly revealed fundamental problems:
- Experimentation Friction: Teams wanted to prototype in Python but were forced to productionize in Java
- Innovation Barriers: The language gap made it nearly impossible to leverage the rapidly evolving Python-based AI ecosystem
- Development Speed: Every new technique, library, or model required custom Java implementations
The Decision Point
LinkedIn made several key observations that led to their Python adoption:
- Undeniable Demand: Teams across all verticals wanted to use generative AI
- Java-Python Gap: The existing stack wasn't working for AI-specific needs
- Ecosystem Reality: Staying current with AI developments required Python integration
Their solution was characteristically bold: use Python for everything - business logic, engineering, evaluations, and production services.
Building the Agent Framework
LinkedIn's agent framework centers around several key architectural decisions:
Core Technology Stack
- Python gRPC: For service communication with built-in streaming, binary serialization, and cross-language support
- LangChain & LangGraph: For modeling core business logic and agent workflows
- Standard Utilities: Centralized tools for LLM inference, conversational memory, and checkpointing
Why LangChain and LangGraph?
The choice wasn't arbitrary. LinkedIn evaluated the ecosystem and selected these tools for specific reasons:
Ease of Use: Even Java engineers could quickly understand and implement agent logic. The syntax is intuitive enough that teams could build non-trivial applications in days rather than weeks.
Sensible Interfaces: The abstractions provided clean ways to model LinkedIn's internal infrastructure. For example, teams could switch between Azure OpenAI and on-premise models with just a few lines of code.
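The "swap providers in a few lines" property comes from coding against a shared interface rather than a concrete model client. A minimal sketch of that idea, with hypothetical class names standing in for the real Azure OpenAI and on-premise clients:

```python
from typing import Protocol

class ChatModel(Protocol):
    """The minimal interface business logic depends on."""
    def invoke(self, prompt: str) -> str: ...

class AzureOpenAIModel:
    def invoke(self, prompt: str) -> str:
        return f"[azure] response to: {prompt}"

class OnPremModel:
    def invoke(self, prompt: str) -> str:
        return f"[on-prem] response to: {prompt}"

def build_agent(model: ChatModel):
    # Agent logic depends only on the ChatModel interface, so switching
    # providers is a one-line change at construction time.
    return lambda question: model.invoke(question)

agent = build_agent(OnPremModel())   # swap to AzureOpenAIModel() with one edit
```

LangChain's chat-model abstractions follow the same shape: the calling code stays unchanged while the concrete model class is swapped underneath.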
Community Integration: Pre-built implementations like ReAct agents and community integrations dramatically accelerated development.
The Agent Platform: Solving Distribution Challenges
Building individual agents is one challenge; orchestrating them at scale is another entirely. LinkedIn identified two critical problems with agentic systems:
- Long-Running Processes: Agents can take significant time to process data, requiring asynchronous execution patterns
- Complex Dependencies: Agents need to execute in parallel while respecting dependencies between their outputs
Messaging as the Foundation
LinkedIn solved the first problem by modeling agent communication as a messaging challenge, extending their existing robust messaging infrastructure to support:
- Agent-to-agent communication
- User-to-agent messaging
- Automatic retry mechanisms through queuing systems
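The retry behavior in the list above can be sketched with a plain in-process queue. This is a toy model of the pattern, not LinkedIn's messaging stack, which runs on their production infrastructure:

```python
import queue

def deliver_with_retry(inbox: "queue.Queue", handler, max_attempts: int = 3) -> list:
    """Drain the queue, re-enqueueing failed messages up to max_attempts."""
    results = []
    while not inbox.empty():
        msg = inbox.get()
        try:
            results.append(handler(msg["body"]))
        except Exception:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] < max_attempts:
                inbox.put(msg)   # retry later instead of failing the whole agent run
    return results

inbox = queue.Queue()
inbox.put({"body": "source candidates for req 42"})
processed = deliver_with_retry(inbox, handler=str.upper)
```

Because delivery and processing are decoupled through the queue, a slow or failing sub-agent stalls only its own messages rather than the caller, which is what makes long-running, ambient agents practical.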
Layered Memory Architecture
For the dependency problem, LinkedIn built a sophisticated memory system with different scopes:
- Working Memory: For immediate interaction context
- Long-term Memory: For persistent agent-user relationships
- Collective Memory: For shared knowledge across agent instances
This layered approach allows agents to maintain context appropriately while scaling across interactions.
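The three scopes can be sketched as a toy in-process class, where the scope determines where data lives and who can see it. Names here are hypothetical, and a production system would back each scope with a durable store rather than Python dictionaries:

```python
class AgentMemory:
    collective: dict = {}   # class-level: shared across every agent instance

    def __init__(self, user_id: str, long_term_store: dict):
        self.working = {}   # per-interaction context, discarded after the session
        # per-user memory that persists across sessions
        self.long_term = long_term_store.setdefault(user_id, {})

    def remember(self, scope: str, key: str, value) -> None:
        """Write a value into the named scope: working, long_term, or collective."""
        getattr(self, scope)[key] = value

store = {}
mem = AgentMemory("recruiter-1", store)
mem.remember("working", "current_role", "staff engineer")
mem.remember("long_term", "preferred_locations", ["NYC"])
mem.remember("collective", "skill_taxonomy_version", "v2")
```

A second instance for the same user would see the long-term entries, and every instance, regardless of user, would see the collective ones, which is exactly the dependency-sharing behavior the layering is meant to provide.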
Skills: Beyond Simple Tool Calling
LinkedIn's "skills" concept extends traditional function calling in several important ways:
Scope: Skills aren't limited to local functions - they can be RPC calls, database queries, prompts, or even other agents
Execution Model: Skills support both synchronous and asynchronous invocation, critical for the ambient agent pattern
Centralization: A skill registry allows teams to expose capabilities that other agents can discover and use
Notably, this design anticipated the core ideas of the Model Context Protocol (MCP) before MCP was formally introduced.
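A central registry with mixed sync and async skills can be sketched as follows. The decorator, registry, and skill names are hypothetical illustrations of the concept, not LinkedIn's API:

```python
import asyncio

REGISTRY: dict = {}

def skill(name: str):
    """Register a callable so other agents can discover it by name."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@skill("fetch_profile")
def fetch_profile(member_id: int) -> dict:
    # In practice this could be an RPC call or database query, not a local function.
    return {"id": member_id, "headline": "placeholder"}

@skill("rank_candidates")
async def rank_candidates(ids: list) -> list:
    await asyncio.sleep(0)   # stands in for a long-running asynchronous call
    return sorted(ids)

def invoke(name: str, *args):
    """Invoke a registered skill, transparently handling sync and async skills."""
    result = REGISTRY[name](*args)
    return asyncio.run(result) if asyncio.iscoroutine(result) else result
```

Callers discover skills by name and never need to know whether the implementation is a local function, a remote call, or another agent, which is the same decoupling MCP later standardized.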
Production Lessons and Key Insights
LinkedIn's journey offers several critical insights for organizations scaling AI agents:
Invest in Developer Productivity
The AI space moves incredibly fast. Success requires making it as easy as possible for developers to:
- Adopt best practices automatically
- Adapt quickly to new techniques and models
- Contribute without extensive specialized knowledge
Standardizing patterns and providing opinionated frameworks dramatically reduces the barrier to entry.
Don't Forget Production Fundamentals
Despite the novelty of AI agents, they're still production software systems. This means:
- Reliability: Traditional availability and performance concerns still apply
- Observability: Custom observability solutions are essential for agentic execution patterns
- Evaluation: Robust testing frameworks must account for non-deterministic behavior
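One common way to account for non-determinism in evaluation is to gate on a pass rate over repeated runs rather than a single exact-match assertion. A minimal sketch of that idea, with a deliberately flaky stand-in for a real agent:

```python
import random

def pass_rate(agent, prompt: str, check, runs: int = 20) -> float:
    """Fraction of runs whose output satisfies the check predicate."""
    return sum(check(agent(prompt)) for _ in range(runs)) / runs

def flaky_agent(prompt: str) -> str:
    # Stand-in for a real, stochastic agent: occasionally returns a bad answer.
    return prompt.upper() if random.random() < 0.9 else ""

rate = pass_rate(flaky_agent, "hello", check=lambda out: out == "HELLO")
# Gate CI on a tolerance rather than exact equality, e.g. require rate >= 0.8.
```

The same harness shape extends to semantic checks (an LLM-as-judge predicate, for instance) in place of the string comparison used here.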
Results That Speak
LinkedIn's approach has proven successful at scale:
- 20+ teams actively using the framework
- 30+ production services supporting GenAI product experiences
- Days instead of weeks for building non-trivial agent applications
The Path Forward
LinkedIn's experience demonstrates that scaling AI agents requires more than just technical infrastructure - it demands a comprehensive approach to developer experience, organizational adoption, and production operations.
Their success came from recognizing that the bottleneck isn't usually computational resources, but rather the ability of teams across an organization to innovate quickly and reliably with AI agent technology.
As AI agents become increasingly central to product experiences, LinkedIn's blueprint offers a proven path for organizations looking to scale beyond single-agent experiments to comprehensive agent-powered ecosystems.
The key insight? Scale isn't just about handling more data or users - it's about enabling more teams to build better agents, faster. LinkedIn's platform approach shows how to achieve both dimensions of scale simultaneously.
References
📺 Watch the Full Presentation: Agents at Scale using LangGraph - LinkedIn
This blog post is based on LinkedIn's presentation about their agent scaling journey. Their approach demonstrates how thoughtful platform engineering can accelerate AI adoption across large organizations while maintaining production quality and reliability.