Consensus in Multi-Agent Systems: When Agents Disagree
By Diesel
Tags: multi-agent, consensus, coordination
## The Democracy Problem
Three code reviewers. Same pull request. One approves. One requests changes. One wants to redesign the entire approach.
With humans, a tech lead steps in and makes a call. With AI agents, you need to solve this programmatically. And the solution can't be "pick the first response" because that defeats the purpose of having multiple agents.
Consensus in multi-agent AI systems is fundamentally different from distributed database consensus. Paxos and Raft solve "which value did we agree on?" Agents need to solve "which judgment is correct when truth is subjective?"
That's a harder problem.
## Why Consensus Matters
Without consensus, multi-agent systems produce contradictory outputs. The coding agent writes a function. The security agent says it's vulnerable. The performance agent says the security fix will be too slow. The coding agent rewrites it. The security agent finds a new vulnerability in the rewrite. Loop forever.
This isn't hypothetical. I've watched agent systems burn tokens in infinite disagreement cycles. The agents weren't wrong individually. They just had different optimization targets with no mechanism to resolve the conflict.
## Strategy 1: Weighted Voting
Each agent votes on a decision. Votes are weighted by the agent's domain expertise relative to the decision being made. See also: [communication protocols](/blog/agent-communication-protocols).
```python
class WeightedConsensus:
    def __init__(self, agents, domain_weights):
        self.agents = agents
        self.weights = domain_weights

    async def decide(self, question, domain):
        votes = {}
        for agent in self.agents:
            opinion = await agent.evaluate(question)
            # Roles without an entry for this domain get a small default weight
            weight = self.weights[agent.role].get(domain, 0.1)
            votes[agent.id] = {
                "opinion": opinion,
                "weight": weight,
                "confidence": opinion.confidence,
            }
        # Weight * confidence = effective vote
        scored = {}
        for v in votes.values():
            position = v["opinion"].position
            score = v["weight"] * v["confidence"]
            scored[position] = scored.get(position, 0) + score
        winner = max(scored, key=scored.get)
        return ConsensusResult(
            decision=winner,
            margin=self._calculate_margin(scored),
            dissenting=self._get_dissenters(votes, winner),
        )
```
The security agent's vote on a security question carries more weight than the performance agent's vote. But the performance agent still gets a voice because security decisions have performance implications.
**The nuance:** Weight alone isn't enough. You also factor in confidence. A security agent that's 50% confident about a vulnerability shouldn't override a performance agent that's 95% confident about a critical bottleneck.
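The weight-times-confidence interaction can be seen in a standalone sketch. All roles, positions, and numbers here are illustrative, not calibrated values from a real system:

```python
# Minimal sketch of effective-vote scoring: each vote is
# (position, domain_weight, confidence), and positions accumulate
# weight * confidence.
def effective_votes(votes):
    """votes: list of (position, weight, confidence) tuples."""
    scored = {}
    for position, weight, confidence in votes:
        scored[position] = scored.get(position, 0.0) + weight * confidence
    return scored

votes = [
    ("block_merge", 0.8, 0.50),  # security agent: high weight, unsure
    ("allow_merge", 0.5, 0.95),  # performance agent: lower weight, confident
]
scored = effective_votes(votes)
winner = max(scored, key=scored.get)  # "allow_merge": 0.475 beats 0.40
```

Note how the confident performance agent outscores the uncertain security agent despite the lower domain weight, which is exactly the behavior the nuance above calls for.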
### When Weighted Voting Fails
When the domains overlap. Is "SQL injection via ORM query builder" a security issue or a database design issue? Both agents claim expertise. Both have high confidence. Weighted voting degenerates to a coin flip.
## Strategy 2: Structured Debate
Agents don't just vote. They argue. Each agent presents its position with evidence, other agents respond to the evidence, and a moderator evaluates the quality of arguments.
```python
class StructuredDebate:
    def __init__(self, moderator, max_rounds=3):
        self.moderator = moderator
        self.max_rounds = max_rounds

    async def resolve(self, agents, question):
        positions = []
        for agent in agents:
            pos = await agent.state_position(question)
            positions.append(pos)
        for round_num in range(self.max_rounds):
            rebuttals = []
            for agent in agents:
                others = [p for p in positions
                          if p.agent_id != agent.id]
                rebuttal = await agent.rebut(others)
                rebuttals.append(rebuttal)
            # Check for convergence
            if self._positions_converged(rebuttals):
                return self._extract_consensus(rebuttals)
            positions = rebuttals
        # No convergence, moderator decides
        return await self.moderator.adjudicate(positions)
```
The magic is in the rebuttals. Agents have to engage with opposing arguments, not just restate their position. A well-designed rebuttal prompt forces the agent to acknowledge counterpoints.
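One way to build such a rebuttal prompt is sketched below. The wording, field names, and sample data are all assumptions for illustration, not a fixed API:

```python
# Sketch of a rebuttal prompt that forces engagement with opposing
# arguments rather than restatement. Field names are hypothetical.
def build_rebuttal_prompt(own_position, opposing):
    listed = "\n".join(
        f"- {p['agent']}: {p['claim']} (evidence: {p['evidence']})"
        for p in opposing
    )
    return (
        f"Your current position: {own_position}\n\n"
        f"Opposing positions:\n{listed}\n\n"
        "For each opposing position: (1) state the strongest part of its "
        "argument, (2) say whether it changes your position and why, and "
        "(3) cite evidence for your response. Do not restate your original "
        "position without addressing the counterpoints."
    )

prompt = build_rebuttal_prompt(
    "The query builder is vulnerable to injection.",
    [{"agent": "performance",
      "claim": "The proposed fix doubles query latency",
      "evidence": "local benchmark"}],
)
```

The explicit "(1)/(2)/(3)" structure matters: without it, models tend to reassert their opening position with more words.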
**Why this works:** LLMs are actually good at evaluating arguments when presented with structured pro/con evidence. The moderator agent doesn't need domain expertise. It needs the ability to evaluate argument quality, consistency, and evidence.
**Why this is expensive:** Multiple rounds of multi-agent communication. Each round burns context and tokens. A three-agent, three-round debate is nine rebuttal calls, plus three opening positions and the moderator's adjudication. Use this for high-stakes decisions, not routine ones.
## Strategy 3: Confidence-Gated Escalation
Skip consensus when it's not needed. Most agent decisions aren't contentious. One agent has high confidence, the others are neutral or low-confidence. Consensus is only triggered when there's genuine disagreement.
```python
class ConfidenceGate:
    def __init__(self, agreement_threshold=0.8,
                 confidence_gap=0.3):
        self.threshold = agreement_threshold
        self.gap = confidence_gap

    async def process(self, agents, question):
        opinions = [await a.evaluate(question)
                    for a in agents]
        # Check for clear winner
        high_conf = [o for o in opinions
                     if o.confidence > self.threshold]
        if len(high_conf) == 1:
            return high_conf[0]  # Clear expert, no debate
        # Check for agreement
        positions = [o.position for o in opinions]
        if len(set(positions)) == 1:
            return opinions[0]  # Everyone agrees
        # Check for confidence gap
        sorted_ops = sorted(opinions,
                            key=lambda o: o.confidence,
                            reverse=True)
        if (sorted_ops[0].confidence -
                sorted_ops[1].confidence > self.gap):
            return sorted_ops[0]  # Clear confidence lead
        # Genuine disagreement, escalate
        return await self.full_consensus(agents, question)
```
In practice, 70-80% of decisions pass through the gate without triggering consensus. The remaining 20-30% get the full treatment.
This is the approach I use. Consensus is expensive. Don't pay for it when you don't need it.
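The gate's three checks can be isolated as a pure function, which makes the escalation behavior easy to unit-test. This sketch uses `(position, confidence)` tuples in place of opinion objects; thresholds match the class defaults above:

```python
# Pure-function version of the confidence gate: returns ("decide", opinion)
# when one of the three checks fires, ("escalate", None) otherwise.
def gate_decision(opinions, threshold=0.8, gap=0.3):
    high_conf = [o for o in opinions if o[1] > threshold]
    if len(high_conf) == 1:
        return ("decide", high_conf[0])   # one clear expert
    if len({o[0] for o in opinions}) == 1:
        return ("decide", opinions[0])    # unanimous
    ranked = sorted(opinions, key=lambda o: o[1], reverse=True)
    if ranked[0][1] - ranked[1][1] > gap:
        return ("decide", ranked[0])      # clear confidence lead
    return ("escalate", None)             # genuine disagreement
```

For example, one agent at 0.9 confidence against two neutral agents passes the gate, while two high-confidence agents on opposite positions escalate.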
## Strategy 4: Evidence-Based Resolution
Agents don't just argue. They provide testable evidence. The system runs the tests and the results settle the debate.
```python
class EvidenceBasedConsensus:
    async def resolve(self, dispute):
        evidence_requests = []
        for position in dispute.positions:
            tests = await position.agent.propose_evidence()
            evidence_requests.extend(tests)
        results = await self.run_tests(evidence_requests)
        # Re-evaluate positions with evidence
        updated = []
        for position in dispute.positions:
            revised = await position.agent.revise_with_evidence(
                position, results
            )
            updated.append(revised)
        return self._select_best_supported(updated)
```
Security agent says there's a vulnerability? Write a test that exploits it. Performance agent says it's too slow? Run a benchmark. The data decides. See also: [fault-tolerant multi-agent systems](/blog/fault-tolerance-multi-agent).
**Limitation:** Not everything is testable in the moment. "This architecture won't scale in six months" isn't something you can settle with a benchmark today. For those cases, fall back to structured debate.
## Handling Persistent Disagreement
Sometimes agents won't converge. Three rounds of debate, evidence gathered, still no agreement. You need a termination condition.
My approach: a four-tier escalation.
1. **Confidence gate** catches easy decisions (70-80% of cases)
2. **Weighted voting** handles moderate disagreements (15-20%)
3. **Structured debate** resolves genuine conflicts (5-10%)
4. **Human escalation** for the truly irreconcilable (less than 1%)
That last tier is important. Sometimes the right answer is "I don't know, ask a human." Systems that never escalate to humans are systems that confidently make wrong decisions on edge cases.
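The ladder above can be wired up as an ordered list of tiers, each of which either returns a decision or defers to the next. This is a toy sketch with plain callables standing in for the real gate, voting, and debate components:

```python
# Sketch of the four-tier escalation ladder. A tier returns a decision
# or None to pass the question down; the fallback is the human hook.
def escalate(question, tiers, human_fallback):
    for name, tier in tiers:
        decision = tier(question)
        if decision is not None:
            return name, decision
    return "human", human_fallback(question)

# Toy tiers: the gate handles easy questions, voting handles moderate
# ones, and the debate never converges in this example.
tiers = [
    ("gate",   lambda q: "approve" if q == "easy" else None),
    ("voting", lambda q: "approve" if q == "moderate" else None),
    ("debate", lambda q: None),
]
easy = escalate("easy", tiers, lambda q: "ask_human")  # ("gate", "approve")
hard = escalate("hard", tiers, lambda q: "ask_human")  # ("human", "ask_human")
```

The ordering encodes cost: cheap checks run first, and the expensive tiers only see questions the cheap ones could not settle.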
## Anti-Patterns
**Majority rules without weights.** The UI agent, the test agent, and the docs agent outvote the security agent on a security issue. Three against one. The majority is wrong.
**Letting the loudest agent win.** Some agents generate more verbose, more confident-sounding responses. Confidence in language doesn't equal confidence in accuracy. Measure confidence numerically, not linguistically. The related post on [topology design](/blog/multi-agent-topology-hierarchical-flat) goes further on this point.
**Consensus on everything.** Not every decision needs consensus. Formatting choices, variable names, comment style. Let individual agents own their domain decisions. Reserve consensus for cross-cutting concerns.
**No time limit.** Debates can run forever if agents are stubborn (and LLMs can absolutely be stubborn). Cap the rounds. Set a token budget. Have a tiebreaker.
## The Human Lesson
The best consensus systems I've built mirror what good engineering teams do. Not endless democracy. Not dictatorship. Expertise-weighted input with clear escalation paths and someone (or something) empowered to make the final call when the clock runs out.
The agents don't need to agree. They need to reach a decision that accounts for all perspectives and moves forward. Perfect consensus is a myth. Good enough consensus that ships is the goal.