AI agent using Claude Cookbooks patterns + Browser MCP for LinkedIn research
An intelligent agent that researches LinkedIn profiles using Claude Sonnet 4.5, implementing the orchestrator-workers pattern from Claude Cookbooks, with Browser MCP's 33 browser automation tools.
┌─────────────────────────────────────────────────────────────────┐
│ LinkedIn Research Agent │
│ (Orchestrator-Workers Pattern) │
├─────────────────────────────────────────────────────────────────┤
│ ┌──────────────────┐ │
│ │ Orchestrator │ Analyzes task, breaks into subtasks │
│ │ (Claude 4.5) │ Coordinates workers │
│ └────────┬─────────┘ │
│ │ │
│ ├─────────┬──────────┬──────────┬──────────┐ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌─────────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Navigator│ │Searcher│ │Extractor│ │Analyzer│ │Reporter│ │
│ │ Worker │ │ Worker │ │ Worker │ │ Worker │ │ Worker │ │
│ └────┬────┘ └───┬────┘ └───┬─────┘ └───┬────┘ └───┬────┘ │
└──────────┼──────────┼──────────┼───────────┼──────────┼────────┘
│ │ │ │ │
└──────────┴──────────┴───────────┴──────────┘
│
┌────────────┴────────────┐
│ │
┌────────▼────────┐ ┌────────▼────────┐
│ Browser MCP │ │ Memory System │
│ (33 Tools) │ │ (File Storage) │
│ - navigate │ │ - profiles/ │
│ - snapshot │ │ - sessions/ │
│ - click │ │ - cache/ │
│ - evaluate │ └─────────────────┘
│ - wait_for │
└──────────────────┘
- Dynamic Task Decomposition: Orchestrator analyzes each task and creates optimal subtask plan
- Specialized Workers: Navigator, Searcher, Extractor, Analyzer, Reporter
- Adaptive Planning: Worker selection based on specific task requirements
- Coordinated Execution: Structured communication via XML
- File-based Storage: Profiles, sessions, and cache in
/memories - Persistent State: Research data survives between sessions
- Structured Storage: JSON format for profiles
- Security: Path validation prevents directory traversal
- 33 Tools Available: All Browser MCP tools accessible to workers
- Session Persistence: Login once, stay logged in
- CDP Integration: Stable element targeting
- Accessibility Tree: Reliable element identification
- Agent Lightning Ready: Collects orchestrator + worker data for APO training
- Enhanced Metrics: Worker performance, task decomposition quality
- Structured Logging: JSONL format for ML training
- Performance Tracking: Duration, success rate, profiles found
- Profile search with filters (title, location, company)
- Company employee research
- Competitive analysis across companies
- Structured data extraction
- Professional Excel/PDF reports (via Skills API)
- Python 3.8+
- Node.js 18+ (for Browser MCP server)
- Browser MCP Server: Must be built at
/Users/rammaree/projects/social-browser-mcp
cd linkedin-researcher
pip install -r requirements.txtcp .env.example .envEdit .env and set:
ANTHROPIC_API_KEY: Your Anthropic API keyBROWSER_MCP_PATH: Path to Browser MCP server (default:../social-browser-mcp/dist/index.js)
# Make sure Browser MCP is built
cd /Users/rammaree/projects/social-browser-mcp
npm run build
# Verify it works
node dist/index.js# Research 10 Product Managers in Mumbai
python main.py \
--query "Product Manager" \
--count 10 \
--location "Mumbai"
# Research 5 Software Engineers at Google
python main.py \
--query "Software Engineer" \
--count 5 \
--company "Google"
# Save results to custom file
python main.py \
--query "Data Scientist" \
--count 20 \
--location "Bangalore" \
--output data_scientists.jsonimport asyncio
from src.agent import LinkedInResearchAgent
from src.mcp_client import BrowserMCPClient
async def main():
# Connect to Browser MCP
mcp_client = BrowserMCPClient("/path/to/social-browser-mcp/dist/index.js")
await mcp_client.connect()
# Create agent
agent = LinkedInResearchAgent(
api_key="your_anthropic_api_key",
mcp_client=mcp_client
)
# Run research
result = await agent.research_profiles(
query="Product Manager",
count=10,
location="Mumbai"
)
print(result)
# Cleanup
await mcp_client.disconnect()
asyncio.run(main())The agent uses Claude Sonnet 4.5 with tool use capabilities:
1. Receive research task
2. Agent analyzes task and decides which tools to use
3. Agent calls Browser MCP tools (navigate, click, extract, etc.)
4. Agent processes results
5. Agent decides next action (continue or finish)
6. Repeat until task completeAgent workflow for "Research 5 Product Managers in Mumbai":
Step 1: browser_navigate → Navigate to www.linkedin.com
Step 2: browser_snapshot → Get page structure
Step 3: browser_click → Click search box
Step 4: browser_type → Type "Product Manager Mumbai"
Step 5: browser_press → Press Enter
Step 6: browser_wait_for → Wait for results
Step 7: browser_snapshot → Get search results
Step 8: browser_click → Click first profile
Step 9: browser_snapshot → Extract profile data
Step 10: browser_navigate → Navigate back to search
... repeat for 5 profiles
Step N: Return structured JSON with all profile data
When ENABLE_TRAINING_MODE=true, the agent logs:
{
"task_id": "task_1234567890",
"timestamp": "2025-11-04T10:30:00Z",
"query": "Product Manager in Mumbai",
"task_type": "profile_search",
"parameters": {"count": 10, "location": "Mumbai"},
"status": "completed",
"result": {...},
"duration_seconds": 45,
"tools_used": ["browser_navigate", "browser_click", ...]
}This data is used for Agent Lightning APO training.
After collecting 50+ training examples, optimize the agent with Agent Lightning APO:
from agent_lightning import APO
# Load training data
training_data = load_training_examples()
# Initialize APO
apo = APO(
initial_prompt=agent.system_prompt,
evaluation_dataset=training_data,
optimization_metric="success_rate"
)
# Optimize (costs ~$5-10)
optimized_prompt = apo.optimize()
# Deploy
agent.system_prompt = optimized_promptBefore Training:
- Success rate: ~80%
- Avg profiles per task: 8/10
- Avg duration: 60 seconds
After Agent Lightning:
- Success rate: ~90-95%
- Avg profiles per task: 9.5/10
- Avg duration: 45 seconds
Improvement: 10-20% across all metrics
linkedin-researcher/
├── src/
│ ├── __init__.py # Package initialization
│ ├── agent.py # Claude SDK agent (autonomous loop)
│ ├── mcp_client.py # Browser MCP connection
│ ├── memory.py # Memory system (future)
│ └── workflows/
│ └── profile_search.py # Profile search workflow (future)
│
├── config/
│ ├── agent_config.yaml # Agent configuration
│ └── mcp_config.json # MCP connection config
│
├── tests/
│ └── test_agent.py # Unit tests (future)
│
├── training/
│ └── data.jsonl # Training data for Agent Lightning
│
├── logs/
│ └── agent.log # Agent logs
│
├── main.py # CLI entry point
├── requirements.txt # Python dependencies
├── .env.example # Environment template
└── README.md # This file
agent:
model: "claude-sonnet-4-5"
max_tokens: 4000
temperature: 0.7
system_prompt: |
You are a LinkedIn research specialist...
workflows:
profile_search:
max_profiles: 50
timeout_seconds: 300{
"mcp_servers": {
"browser-mcp": {
"command": "node",
"args": ["/path/to/social-browser-mcp/dist/index.js"],
"transport": "stdio"
}
}
}- Created directory structure
- Configuration files
- Environment setup
- stdio transport to Browser MCP
- Tool listing and invocation
- Error handling
- Autonomous agent loop
- Tool use integration
- Training data collection
- Profile search workflow
- Company research workflow
- Competitive analysis workflow
- Redis for session state
- PostgreSQL for research results
- pgvector for semantic search
- Unit tests
- Integration tests
- E2E tests
- Run 50+ research tasks
- Collect performance metrics
- Analyze failure cases
- Load training data
- Run APO optimization
- A/B test improvements
- Deploy optimized agent
- Containerization
- API server
- Monitoring
- Scaling
Claude API:
- Input tokens: ~5,000 tokens (research query + tool results)
- Output tokens: ~2,000 tokens (agent thinking + structured response)
- Cost per task: ~$0.15 (with Claude Sonnet 4.5)
For 50 profiles (5 tasks × 10 profiles):
- Total cost: ~$0.75
Comparison:
- Manual research: 2-3 hours @ $50/hour = $100-150
- Agent cost: $0.75
- Savings: 99%+
Agent Lightning APO:
- Training: ~$5-10 for 50 tasks
- One-time cost
Expected ROI:
- 10-20% efficiency improvement
- Pays for itself after ~50-100 tasks
# Check if Browser MCP is built
cd /Users/rammaree/projects/social-browser-mcp
npm run build
# Test Browser MCP directly
node dist/index.js# Verify API key is set
echo $ANTHROPIC_API_KEY
# Or check .env file
cat .env | grep ANTHROPIC_API_KEY# Reinstall dependencies
pip install -r requirements.txt --force-reinstallpython main.py \
--query "Product Manager" \
--count 10 \
--location "Mumbai" \
--output pm_mumbai.jsonOutput (pm_mumbai.json):
{
"result": "Found 10 Product Manager profiles in Mumbai",
"profiles": [
{
"name": "John Doe",
"title": "Senior Product Manager",
"company": "Google",
"location": "Mumbai, India",
"experience_years": 8,
"education": "MBA, IIM Bangalore",
"profile_url": "https://www.linkedin.com/in/johndoe"
},
...
],
"iterations": 15,
"duration_seconds": 45
}python main.py \
--query "Software Engineer" \
--count 5 \
--company "Google" \
--output google_engineers.json- 33 browser automation tools
- Deterministic behavior (given input → fixed output)
- Session persistence
- Used by agents, not trainable itself
- Uses Browser MCP tools
- Autonomous decision-making (which tools to use, when, how)
- Trainable with Agent Lightning/DSPy
- Optimizable prompts and workflows
Key Insight: RL/DSPy applies to the AGENT (this project), not the tools (Browser MCP)
MIT
Built using:
- Claude Sonnet 4.5: Anthropic's frontier model
- Browser MCP: 33-tool browser automation server
- Agent Lightning: Microsoft's APO training framework
- MCP Protocol: Model Context Protocol for tool integration
Inspired by: Production-ready autonomous agent patterns from Claude Cookbooks and Azure AI Foundry research.