A foundational scaffold for building robust, modular, and scalable multi-agent systems using LangGraph. This project provides a production-ready architecture that moves beyond simple scripts to a full, API-driven application. It is intended as a starting point for any LangGraph-based agentic system.
The mission is to provide a clear, maintainable, and testable template for constructing multi-agent systems. The core philosophy is a separation of concerns, where the system is composed of distinct agent types:
- Specialists (BaseSpecialist): Modular agents that perform a single, well-defined task. The system supports both LLM-driven specialists for complex reasoning and deterministic "procedural" specialists for reliable, code-based actions.
- Runtime Orchestrator (RouterSpecialist): A specialized agent that makes the turn-by-turn routing decisions within the running graph.
- Structural Orchestrator (GraphBuilder): A high-level system component responsible for reading the configuration, instantiating all specialists, and compiling the final LangGraph instance before execution.
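The division of labor between specialists and the runtime orchestrator can be sketched in plain Python. This is an illustrative stand-in only: the class names BaseSpecialist and RouterSpecialist come from this README, but the execute method, the dictionary-based state, the WordCountSpecialist example, and the run loop are assumptions made for the sketch, and no LangGraph APIs are used.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

State = Dict[str, Any]  # assumed state shape: a plain dict


class BaseSpecialist(ABC):
    """Hypothetical interface: one specialist, one well-defined task."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def execute(self, state: State) -> State:
        """Return the state updates produced by this specialist."""


class WordCountSpecialist(BaseSpecialist):
    """A deterministic 'procedural' specialist: plain code, no LLM call."""

    def execute(self, state: State) -> State:
        return {"word_count": len(state.get("text", "").split())}


class RouterSpecialist(BaseSpecialist):
    """Stand-in router: a fixed rule where a real router would ask an LLM."""

    def execute(self, state: State) -> State:
        next_node = "END" if "word_count" in state else "word_counter"
        return {"next": next_node}


def run(specialists: Dict[str, BaseSpecialist], state: State) -> State:
    """Alternate between the router and the specialist it selects."""
    router = specialists["router"]
    for _ in range(10):  # hard cap so a routing bug cannot loop forever
        state = {**state, **router.execute(state)}
        if state["next"] == "END":
            break
        state = {**state, **specialists[state["next"]].execute(state)}
    return state


result = run(
    {
        "router": RouterSpecialist("router"),
        "word_counter": WordCountSpecialist("word_counter"),
    },
    {"text": "hello agentic world"},
)
print(result["word_count"], result["next"])
```

In the scaffold itself, the GraphBuilder assembles this wiring from configuration; the hand-written run loop here only stands in for the compiled graph.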
This scaffold provides a well-defined architecture designed for reliability, scalability, and resilience.
- API-First Design: The system is exposed via a FastAPI web server with sample Gradio UIs, providing clean, modern interfaces for interaction and integration.
- Configuration-Driven: The entire agentic system (specialists, models, and prompts) is defined in configuration files. Changing the system's structure does not require changing Python code.
- Three-Tiered Configuration System:
  - Tier 1 (.env): Secrets and environment-specific settings (API keys, connection URLs)
  - Tier 2 (config.yaml): Architectural blueprint defining all possible components (committed to git)
  - Tier 3 (user_settings.yaml): Model bindings and runtime configuration (git-ignored)
  - Environment Variable Interpolation: Supports ${VAR_NAME} and ${VAR_NAME:-default} syntax for single-source-of-truth configuration
- MCP (Message-Centric Protocol): Synchronous, direct service invocation between specialists with timeout controls and LangSmith tracing. Enables specialists to call each other's functions without routing through the graph, reducing latency and LLM costs.
- Virtual Coordinator Pattern: Transparent upgrade from single-node capabilities to multi-node subgraphs. The Router chooses WHAT capability is needed, while the Orchestrator decides HOW to implement it. Exemplified by the Tiered Chat Subgraph.
- Tiered Chat Subgraph (CORE-CHAT-002): Production-ready multi-perspective chat with:
  - Parallel execution of analytical and contextual specialists (ProgenitorAlpha/Bravo)
  - Fan-out/fan-in graph pattern with proper state management
  - Graceful degradation when components fail
  - 39 comprehensive tests ensuring reliability
- Fail-Fast Validation:
  - Connectivity Check: Startup script (verify_connectivity.py) validates internet access and LLM provider reachability before the application starts. Prevents "zombie" containers that look healthy but can't work.
  - Route Validation: Eliminates silent infinite-loop bugs by validating graph edges at build time.
  - System Invariants: Runtime monitor enforces structural integrity and prevents invalid state transitions.
- First-Class Observability: Integrated with LangSmith out of the box. The FastAPI lifespan manager ensures buffered traces are sent before exit. Essential for debugging complex multi-agent interactions.
- Schema-Enforced Reliability: Pydantic models define "hard contracts" for LLM outputs, dramatically reducing runtime errors and validation boilerplate.
- Robust Termination Sequence: Mandatory three-stage finalization (specialist signals completion → EndSpecialist synthesizes → Router archives) ensures predictable shutdown.
- Decoupled Adapter Pattern: Specialists request pre-configured "adapters" by name, allowing you to swap LLM providers (Gemini, OpenAI, LM Studio, etc.) without touching business logic.
- Model-Agnostic Architecture: All model bindings are runtime configuration. Develop with local models and deploy with cloud APIs; no code changes are required.
- Comprehensive Documentation:
  - Developer's Guide (architecture, patterns, best practices)
  - Specialist Creation Guide (step-by-step tutorial)
  - Integration Test Guide (testing patterns and examples)
  - Graph Construction Guide (subgraph patterns)
- Modern Python Tooling: Uses pyproject.toml and pip-tools to ensure reproducible and reliable builds for both production and development.
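The ${VAR_NAME} and ${VAR_NAME:-default} interpolation listed under the configuration tiers can be illustrated with a short sketch. This is not the scaffold's own loader: the interpolate function, the REQUEST_TIMEOUT variable, and the choice to raise KeyError for an unset variable with no default are assumptions. The ":-" form follows the usual shell convention, where an unset or empty variable falls back to the default.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def interpolate(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${VAR} and ${VAR:-default} references in a config string."""
    env = os.environ if env is None else env

    def _sub(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        # Use the variable when it is set, unless it is empty and a
        # default is available (shell ':-' semantics).
        if name in env and (env[name] or default is None):
            return env[name]
        if default is not None:
            return default
        # Assumed policy: fail loudly instead of leaving the reference as-is.
        raise KeyError(f"environment variable {name!r} is not set")

    return _PATTERN.sub(_sub, text)


# Hypothetical config fragment; REQUEST_TIMEOUT is an invented name.
env = {"LMSTUDIO_BASE_URL": "http://host.docker.internal:1234/v1/"}
resolved = interpolate(
    "base_url: ${LMSTUDIO_BASE_URL}\ntimeout: ${REQUEST_TIMEOUT:-30}", env
)
print(resolved)
```

Passing the environment as an explicit mapping keeps the function easy to test; omitting it falls back to os.environ.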
This scaffold grants significant power to one or more LLMs that you define as specialists. The tools you create can execute real code, access your file system, and make external API calls with your keys.
Warning: You are granting the configured LLM direct control over these powerful tools.
An agentic system can create feedback loops that amplify a simple misunderstanding over many iterations. This emergent behavior can lead to complex, unintended, and irreversible actions like file deletion or data exposure.
Always run this project in a secure, sandboxed environment (like a Docker container or a dedicated VM).
Using Docker is the recommended way to run this project. It provides a secure, sandboxed environment and guarantees a consistent setup.
- Prerequisites: Docker and Docker Compose
- Clone the Repository
      git clone https://github.com/shanevcantwell/langgraph-agentic-scaffold.git
      cd langgraph-agentic-scaffold
- Configure Your Environment
  - Copy the example environment file:
        cp .env.example .env
  - Edit the new .env file to add your API keys (e.g., GOOGLE_API_KEY, LANGSMITH_API_KEY).
  - Crucially, to connect to local model servers (like LM Studio or Ollama) running on your host machine:
    - Use the special host.docker.internal hostname in your URLs.
    - Ensure host.docker.internal is added to your NO_PROXY environment variable if you are behind a corporate proxy or using the included Squid proxy.
  - Copy the proxy configuration:
        cp proxy/squid.conf.example proxy/squid.conf

        # .env
        # Use host.docker.internal to connect from the container to services on the host.
        LMSTUDIO_BASE_URL="http://host.docker.internal:1234/v1/"
        OLLAMA_BASE_URL="http://host.docker.internal:11434"
        # Ensure local traffic bypasses the proxy
        NO_PROXY=localhost,127.0.0.1,host.docker.internal
  - Copy the user settings template:
        cp user_settings.yaml.example user_settings.yaml
  - Edit user_settings.yaml to bind your desired models to core specialists like the router_specialist.
- Build and Run the Containers
  From the project root, run the following command. This will build the Docker image, start the application and proxy containers, and run them in the background.
      docker compose up --build -d
- Web UI (Gradio): Access the web interface in your browser at http://localhost:5003.
- V.E.G.A.S. Terminal: A real-time, streaming terminal interface for monitoring agent execution at http://localhost:3000.
- API Docs (FastAPI): Access the interactive API documentation at http://localhost:8000/docs.
- Command Line (CLI): To interact via the CLI, execute the cli.py script inside the running app container:
      docker compose exec app python -m app.src.cli
  For multi-line input, pipe your prompt into the command:
      cat your_prompt.txt | docker compose exec -T app python -m app.src.cli
If you make changes to configuration files while the containers are running, you may need to restart the services for them to take effect.
- Changes to .env, config.yaml, or Python code: Restart the app container.
      docker compose restart app
- Changes to proxy/squid.conf: Restart the proxy container.
      docker compose restart proxy
If you prefer not to use Docker, you can set up a local Python virtual environment.
- Prerequisite: Python 3.12+
- Run the installation script for your OS from the project root (e.g., ./scripts/install.sh). This creates a virtual environment and copies example configuration files.
- Configure your environment. Edit the newly created .env file to add your API keys and local model server URLs (e.g., http://localhost:1234).
- Bind your models. Open user_settings.yaml and specify which LLM providers you want to use.
- Start the API Server:
      # On Linux/macOS:
      ./scripts/server.sh start
      # On Windows:
      .\scripts\server.bat start
- Start the Web UI (in a separate terminal):
      # First, activate the virtual environment
      source ./.venv_agents/bin/activate
      # Then, run the UI
      python -m app.src.ui --port 5003
This scaffold is designed for serious agentic system development with comprehensive documentation:
- Developer's Guide: The central hub for all documentation.
- System Architecture: Core concepts, patterns, and best practices.
- Configuration Guide: The 3-Tiered Configuration System.
- MCP Guide: Synchronous service calls and the Message-Centric Protocol.
- Observability Guide: LangSmith integration and debugging.
- Creating a New Specialist: Step-by-step tutorial for adding custom specialists.
- Integration Test Guide: Patterns and examples for writing integration tests.
- Graph Construction Guide: Subgraph patterns and workflow composition.
Debugging complex multi-agent interactions with print statements is insufficient. This scaffold integrates with LangSmith out of the box for:
- Visual tracing of every run (hierarchical specialist execution)
- State inspection at each step
- Error isolation with red highlighting
- Token count and latency tracking
Setup: Add LangSmith credentials to .env and ensure the FastAPI lifespan manager is present (see Observability Guide).
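The role of the lifespan manager can be sketched with the standard library alone. The pattern is an async context manager whose cleanup branch flushes whatever is still buffered. TraceBuffer and its methods are hypothetical stand-ins for the tracing client, and the project's actual lifespan code may differ; see the Observability Guide for the real setup.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import List


class TraceBuffer:
    """Hypothetical stand-in for a tracing client that batches in memory."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.sent: List[str] = []

    def record(self, trace: str) -> None:
        self.pending.append(trace)

    def flush(self) -> None:
        # A real client would upload here; the sketch just moves the items.
        self.sent.extend(self.pending)
        self.pending.clear()


buffer = TraceBuffer()


@asynccontextmanager
async def lifespan(app):
    # Startup work would go before the yield.
    try:
        yield
    finally:
        # Shutdown: flush buffered traces even if the app exits on an error.
        buffer.flush()


async def main() -> None:
    # Simulate the server's lifetime: traces accumulate while it runs.
    async with lifespan(app=None):
        buffer.record("run-1")
        buffer.record("run-2")


asyncio.run(main())
print(buffer.sent)
```

FastAPI accepts a context manager of this shape through its lifespan parameter, as in FastAPI(lifespan=lifespan), so the same flush-on-exit logic runs when the server shuts down.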
Maturity: Alpha / Active Development
Roadmap Progress: Project Bedrock 100% complete (37/37 tasks)
Test Coverage: 1,000+ tests across unit, integration, and contract testing
Production-Ready Features:
- Tiered Chat Subgraph (CORE-CHAT-002) with parallel progenitors
- MCP Infrastructure (internal + external containerized services)
- Fail-Fast Validation (startup + route validation)
- Invariant Monitor & Circuit Breaker system
- Context Engineering pipeline (Triage → Facilitate → Execute)
- Hybrid Routing Engine (declarative + procedural + probabilistic)
- V.E.G.A.S. Terminal UI for real-time streaming
- surf-mcp integration for browser automation
Post-Bedrock Development:
- ReActMixin for iterative tool-use patterns
- Deep Research pipeline
- Tiered synthesis with graceful degradation
See docs/ADRs/ for architectural decisions and design documentation.
This project is licensed under the MIT License. See the LICENSE file for details.
© 2025 Reflective Attention