Boundary
Stress-test engineering decisions before they harden.
An open source, cloud-first, MCP-native multi-agent debate system that surfaces hidden assumptions, irreversible commitments, and failure modes through structured debates between opinionated AI agents.
Multiple agent perspectives
14 opinionated AI agents with distinct biases and viewpoints
Risk surface mapping
Identifies failure modes, assumptions, and irreversible commitments
Structured adversarial reasoning
Agents critique each other to illuminate trade-offs and tensions
Live debate
NATS vs Kafka

“If you shard the database now, you are committing to operational complexity that will be extremely hard to undo later. What concrete load do you expect in the next 12 months?”
Deploy the Agents
Get Boundary running and start your first debate in 6 simple steps
Setup (Steps 1-4)
Pull Docker Image
Pull the official Boundary MCP server image
docker pull ghcr.io/boundary-mcp/boundary-mcp:latest
Configure Cursor MCP
Add to ~/.cursor/mcp.json
{
  "mcpServers": {
    "boundary": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "--env-file",
        "/path/to/your/.env",
        "ghcr.io/boundary-mcp/boundary-mcp:latest"
      ]
    }
  }
}
Set Environment Variables
Create a .env file with your API keys
OPENAI_API_KEY=your-key-here
# Optional:
ANTHROPIC_API_KEY=your-key-here
GOOGLE_API_KEY=your-key-here
MCP_TRANSPORT=stdio
Create Cursor Commands
Copy command files to ~/.cursor/commands/
mkdir -p ~/.cursor/commands
Usage (Steps 5-6)
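Before starting a first debate, it can help to confirm that steps 1-4 took effect. The following is a minimal preflight sketch, not part of Boundary itself: the function names `parse_env` and `preflight` are hypothetical, and it only assumes the file layout shown in the setup steps (an `mcpServers.boundary` entry whose Docker args include `--env-file`, and a commands directory).

```python
import json
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def preflight(mcp_json: Path, commands_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means setup looks sane."""
    if not mcp_json.is_file():
        return [f"{mcp_json} not found"]
    try:
        config = json.loads(mcp_json.read_text())
    except json.JSONDecodeError as exc:
        return [f"{mcp_json} is not valid JSON: {exc}"]

    problems = []
    server = config.get("mcpServers", {}).get("boundary")
    if server is None:
        problems.append('no "boundary" entry under mcpServers')
    else:
        args = server.get("args", [])
        # The env file path is the argument that follows --env-file.
        if "--env-file" in args and args.index("--env-file") + 1 < len(args):
            env_path = Path(args[args.index("--env-file") + 1])
            if not env_path.is_file():
                problems.append(f"env file {env_path} not found")
            else:
                env = parse_env(env_path.read_text())
                if env.get("OPENAI_API_KEY", "") in ("", "your-key-here"):
                    problems.append("OPENAI_API_KEY is missing or still a placeholder")
        else:
            problems.append("--env-file argument missing from docker args")

    if not commands_dir.is_dir():
        problems.append(f"{commands_dir} does not exist")
    return problems
```

A typical call would be `preflight(Path.home() / ".cursor" / "mcp.json", Path.home() / ".cursor" / "commands")`, printing each returned problem before opening Cursor.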
Prepare the Debate
Start a new agent in Cursor to prepare the debate
/prepare-boundary-debate for this question: "Should we use Redis or PostgreSQL for caching?"
Review and Start the Debate
Review the generated document, make any necessary adjustments, then start the debate
/start-boundary-debate for this question and context @<file-generated-in-previous-step>
How Boundary Works
Engineering decisions are not "right answer" problems. They are context-sensitive trade-off problems. Boundary preserves the tension between competing values through structured adversarial reasoning.
Frame the Decision
Present your engineering decision to Boundary. The system ingests your codebase context, architectural constraints, and the specific boundary you're approaching. This is where irreversible commitments begin to form.
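To make the framing step concrete, here is an illustrative sketch of what a decision frame could look like as data. The `DecisionFrame` class, its field names, and the example file path and constraints are assumptions for illustration; Boundary's actual input format is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionFrame:
    """Hypothetical container for the inputs a debate starts from."""
    question: str
    context_files: list[str] = field(default_factory=list)  # codebase context
    constraints: list[str] = field(default_factory=list)    # architectural constraints

    def to_brief(self) -> str:
        """Render the frame as a plain-text brief that agents could be given."""
        lines = [f"Decision: {self.question}"]
        if self.context_files:
            lines.append("Context files:")
            lines += [f"  - {path}" for path in self.context_files]
        if self.constraints:
            lines.append("Constraints:")
            lines += [f"  - {item}" for item in self.constraints]
        return "\n".join(lines)


# Example values below are made up for illustration.
frame = DecisionFrame(
    question="Should we use Redis or PostgreSQL for caching?",
    context_files=["services/cache/README.md"],
    constraints=["Single on-call engineer", "No new managed services this quarter"],
)
print(frame.to_brief())
```

Writing constraints down explicitly is the point of the exercise: an unstated constraint cannot be challenged by any agent later in the debate.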
Agents Take Positions
Opinionated AI agents with consistent biases materialize. The Pragmatist, Security Analyst, Complexity Auditor, Scalability Maximalist, and Domain Purist each take positions. They are not neutral. Their disagreement is the feature, not a bug.
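One way to picture "consistent bias" is as a fixed instruction attached to each agent. The sketch below is an assumption about how such personas might be modeled, not Boundary's implementation: the `Agent` class, the `system_prompt` method, and the wording of each bias are all hypothetical, loosely paraphrased from the agent names above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    bias: str  # the value this agent consistently argues for

    def system_prompt(self, question: str) -> str:
        """Build an instruction that forbids neutrality and pins the bias."""
        return (
            f"You are the {self.name}. You consistently prioritize {self.bias}. "
            f"Do not be neutral. Take a position on: {question}"
        )


# Bias descriptions are illustrative guesses, not Boundary's own prompts.
AGENTS = [
    Agent("Pragmatist", "solving only the problems that exist today"),
    Agent("Security Analyst", "minimizing attack surface and data exposure"),
    Agent("Complexity Auditor", "keeping cognitive and testing overhead low"),
    Agent("Scalability Maximalist", "headroom for rapid traffic growth"),
    Agent("Domain Purist", "preserving domain invariants"),
]

prompts = {agent.name: agent.system_prompt("NATS vs Kafka") for agent in AGENTS}
```

Making the persona immutable (`frozen=True`) mirrors the idea that an agent's bias should not drift during a debate.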
Structured Adversarial Review
Agents engage in structured rounds of critique. Each perspective challenges the others' assumptions, exposing hidden trade-offs, alternative failure modes, and competing engineering values. The tension between positions illuminates the decision space.
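The round structure can be sketched as a loop in which every agent critiques every other agent once per round. This is a simplified assumption about the protocol: the names `run_rounds` and `stub_critique` are hypothetical, the critique function stands in for a model call, and positions are held fixed between rounds, which a real system may not do.

```python
from itertools import permutations
from typing import Callable

Positions = dict[str, str]                   # agent name -> current position
CritiqueFn = Callable[[str, str, str], str]  # (critic, target, target position) -> critique


def run_rounds(positions: Positions, critique: CritiqueFn, rounds: int = 2) -> list[dict]:
    """Every agent critiques every other agent once per round."""
    transcript = []
    for round_no in range(1, rounds + 1):
        # permutations yields ordered (critic, target) pairs with critic != target.
        for critic, target in permutations(positions, 2):
            transcript.append({
                "round": round_no,
                "critic": critic,
                "target": target,
                "critique": critique(critic, target, positions[target]),
            })
    return transcript


def stub_critique(critic: str, target: str, position: str) -> str:
    """Stand-in for a model call; returns a canned challenge."""
    return f"{critic} questions the assumption behind: {position!r}"


positions = {
    "Pragmatist": "Kafka is overkill for current load",
    "Scalability Maximalist": "Plan for partitioning now",
    "Complexity Auditor": "Every new broker adds testing burden",
}
transcript = run_rounds(positions, stub_critique, rounds=2)
```

With n agents and r rounds this produces n × (n − 1) × r critiques, so the cost of a debate grows quadratically with the number of agents involved.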
Map the Risk Surface
The system extracts failure modes, irreversible commitments, and assumption dependencies from the debate. You receive a clear map of decision risk: not probabilities, but concrete failure vectors that emerge from the agent conflict.
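As a rough illustration of the extraction step, findings raised during a debate could be grouped into the three categories named above. The `Finding` type, the category labels, and `map_risk_surface` are hypothetical names for this sketch; how Boundary actually tags and extracts findings is not specified here.

```python
from dataclasses import dataclass

CATEGORIES = ("failure_mode", "irreversible_commitment", "assumption")


@dataclass(frozen=True)
class Finding:
    category: str   # one of CATEGORIES
    statement: str
    raised_by: str  # agent that surfaced it


def map_risk_surface(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Group findings by category, keeping the first of any duplicate statement."""
    surface = {category: [] for category in CATEGORIES}
    seen = set()
    for finding in findings:
        if finding.category not in surface:
            raise ValueError(f"unknown category: {finding.category}")
        key = (finding.category, finding.statement.strip().lower())
        if key in seen:
            continue
        seen.add(key)
        surface[finding.category].append(finding)
    return surface


findings = [
    Finding("irreversible_commitment", "Sharding now locks in operational complexity", "Pragmatist"),
    Finding("assumption", "Traffic doubles within a month", "Scalability Maximalist"),
    Finding("assumption", "traffic doubles within a month ", "Complexity Auditor"),
]
surface = map_risk_surface(findings)
```

Note that the output carries no scores or probabilities, only statements attributed to the agent that raised them.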
Synthesize the Decision Map
Boundary synthesizes all perspectives into a structured decision map: not a verdict, but an auditable record of trade-offs, assumptions, and failure modes. This becomes your decision memory, queryable for future reference when similar boundaries emerge.
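A decision map that is both auditable and queryable could be as simple as a serializable record plus a keyword search. The sketch below is an assumption for illustration only: `DecisionMap`, its fields, and `search` are hypothetical, and Boundary's real storage and query mechanism may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DecisionMap:
    """Hypothetical record of a debate's outcome; deliberately has no verdict field."""
    question: str
    trade_offs: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize with stable key order so stored maps diff cleanly."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def mentions(self, keyword: str) -> bool:
        """Case-insensitive substring match across every entry."""
        needle = keyword.lower()
        entries = [self.question, *self.trade_offs, *self.assumptions, *self.failure_modes]
        return any(needle in entry.lower() for entry in entries)


def search(maps: list[DecisionMap], keyword: str) -> list[DecisionMap]:
    """Return past decision maps that mention the keyword."""
    return [m for m in maps if m.mentions(keyword)]


memory = [
    DecisionMap(
        question="NATS vs Kafka",
        trade_offs=["Operational simplicity vs. replayable log"],
        assumptions=["Current load fits a single broker"],
    ),
    DecisionMap(
        question="Should we use Redis or PostgreSQL for caching?",
        failure_modes=["Cache stampede after Redis restart"],
    ),
]
related = search(memory, "redis")
```

Omitting a verdict field keeps the record aligned with the stated goal: the map preserves the trade-offs rather than collapsing them into a single answer.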
Common Use Cases
Boundary helps with high-stakes engineering decisions:
Database & Storage Choices
"Should we use PostgreSQL or Redis for caching?" "When should we shard the database?" Get perspectives on scalability, cost, and operational complexity.
API Design Decisions
"REST vs GraphQL?" "Microservices vs monolith?" Understand trade-offs in maintainability, performance, and team velocity.
Infrastructure Choices
"Kubernetes or simpler orchestration?" "Cloud vs on-premise?" Evaluate cost, complexity, and lock-in risks.
Architecture Patterns
"Event-driven or request-response?" "CQRS or traditional CRUD?" Understand when patterns help vs. add unnecessary complexity.
Opinionation is Key: The Agent Mindsets
Neutral agents are useless. Each agent embodies a consistent bias that mirrors how senior engineers naturally reason from within their roles:
- Pragmatist: "Adding Kafka might be overkill for current load. Are we solving a problem we don't have?"
- Security / Threat Analyst: "This introduces an unverified auth path; any mistake could expose sensitive data."
- Scalability Maximalist: "If user traffic doubles in a month, this design will fail catastrophically unless we plan for partitioning."
- Complexity Auditor: "Each additional microservice adds mental overhead and testing burden; is it justified?"
- Domain Purist: "The proposed schema violates the invariants of our billing domain. This will create bugs downstream."
The debate is not about answers. It's about decision risk surfaces. The output is not a verdict. It is a decision map.