The Router Pattern: Intelligent Task Distribution for AI Agents
By Diesel
Tags: architecture, routing, patterns
## The One-Model Trap
There's a pattern I see constantly. Someone builds an agent, wires it to Claude Opus or GPT-4, and ships it. Every task, no matter how trivial, goes through the most powerful (and most expensive) model available.
Asking Opus to format a date is like hiring a structural engineer to hang a picture frame. It'll work. It's also absurdly wasteful.
The router pattern fixes this. Instead of sending everything to one model, you classify the task first and route it to the appropriate handler. Simple tasks get simple models. Complex reasoning gets the big guns. Everything in between gets something proportional.
## Anatomy of a Router
A router sits between the input and the agents. It looks at the task, decides what kind of work it is, and sends it somewhere. The [specialist agents it routes to](/blog/agent-specialization-vs-generalist) are covered in a separate post.
```typescript
interface RouteDecision {
  agent: string;
  model: string;
  priority: "low" | "normal" | "high" | "critical";
  reasoning: string;
  confidence: number;
}

interface Route {
  evaluate(task: Task): Promise<RouteDecision>;
}

class TaskRouter {
  private routes: Route[] = [];

  addRoute(route: Route) {
    this.routes.push(route);
  }

  async route(task: Task): Promise<RouteDecision> {
    // Ask every registered route to classify the task in parallel.
    const classifications = await Promise.all(
      this.routes.map(r => r.evaluate(task))
    );

    // Keep only confident classifications, then take the strongest.
    const best = classifications
      .filter(c => c.confidence > 0.5)
      .sort((a, b) => b.confidence - a.confidence)[0];

    if (!best) {
      return this.defaultRoute(task);
    }
    return best;
  }

  private defaultRoute(task: Task): RouteDecision {
    return {
      agent: "general",
      model: "sonnet",
      priority: "normal",
      reasoning: "No confident classification",
      confidence: 0,
    };
  }
}
```
The key insight: routing doesn't need to be perfect. It needs to be good enough to save you money on the easy stuff and not screw up the hard stuff. A router that correctly classifies 80% of tasks still saves you significant compute on that 80%.
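To make that concrete, here's the arithmetic. The per-task prices below are placeholders, not real provider pricing:

```typescript
// Hypothetical per-task costs: substitute your provider's actual pricing.
const COST = { cheap: 0.001, expensive: 0.05 };

// Blended cost per task when a fraction of traffic goes to the cheap model.
function blendedCost(cheapFraction: number): number {
  return cheapFraction * COST.cheap + (1 - cheapFraction) * COST.expensive;
}

const allExpensive = COST.expensive;       // send everything to the big model
const routed = blendedCost(0.8);           // route 80% of tasks to the cheap model
const savings = 1 - routed / allExpensive; // roughly 78% cheaper overall
```

Even with the 20% of hard tasks still paying full price, the blended cost drops by nearly four fifths under these assumed prices.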
## Classification Strategies
There are three ways to classify tasks, and the best routers use all three.
### Rule-Based Classification
Fast, cheap, deterministic. Use it for the obvious cases.
```typescript
class RuleBasedClassifier implements Route {
  private rules = [
    {
      pattern: /\b(format|convert|parse|extract)\b/i,
      agent: "transformer",
      model: "haiku",
      priority: "low" as const,
    },
    {
      pattern: /\b(analyze|compare|evaluate|reason)\b/i,
      agent: "analyst",
      model: "sonnet",
      priority: "normal" as const,
    },
    {
      pattern: /\b(design|architect|plan|strategy)\b/i,
      agent: "architect",
      model: "opus",
      priority: "high" as const,
    },
  ];

  async evaluate(task: Task): Promise<RouteDecision> {
    for (const rule of this.rules) {
      if (rule.pattern.test(task.description)) {
        return {
          agent: rule.agent,
          model: rule.model,
          priority: rule.priority,
          reasoning: `Matched rule: ${rule.pattern}`,
          confidence: 0.7,
        };
      }
    }
    return {
      agent: "general",
      model: "sonnet",
      priority: "normal",
      reasoning: "No rule matched",
      confidence: 0.3,
    };
  }
}
```
### Embedding-Based Classification
More nuanced. You embed the task description and compare it against known task categories.
```typescript
class EmbeddingClassifier implements Route {
  private categories: Map<string, number[]> = new Map();
  private categoryConfig: Map<string, CategoryConfig> = new Map();

  async evaluate(task: Task): Promise<RouteDecision> {
    const taskEmbedding = await embed(task.description);

    // Find the category whose embedding is closest to the task's.
    let bestMatch = { category: "", similarity: 0 };
    for (const [category, embedding] of this.categories) {
      const sim = cosineSimilarity(taskEmbedding, embedding);
      if (sim > bestMatch.similarity) {
        bestMatch = { category, similarity: sim };
      }
    }

    const config = this.categoryConfig.get(bestMatch.category);
    if (!config) {
      return { agent: "general", model: "sonnet", priority: "normal",
        reasoning: "No category matched", confidence: 0 };
    }

    return {
      agent: config.agent,
      model: config.model,
      priority: config.priority,
      reasoning: `Embedding match: ${bestMatch.category} (${bestMatch.similarity.toFixed(3)})`,
      confidence: bestMatch.similarity,
    };
  }
}
```
This catches the cases where the user doesn't use your expected keywords. "Can you look at this code and tell me what's wrong?" doesn't match "analyze" directly, but the embedding will be close to analysis-type tasks.
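The `cosineSimilarity` helper the classifier relies on is just a dot product divided by the vector magnitudes. A minimal sketch:

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions must match");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice you would precompute the category embeddings once at startup, since only the task embedding changes per request.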
### LLM-Based Classification
The nuclear option. Use a cheap, fast model to classify before routing to the expensive one.
```typescript
class LLMClassifier implements Route {
  async evaluate(task: Task): Promise<RouteDecision> {
    const classification = await llm.generate({
      model: "haiku",
      messages: [{
        role: "system",
        content: `Classify this task into one category:
- simple_transform: data formatting, type conversion, extraction
- analysis: comparison, evaluation, reasoning over data
- creative: writing, design, strategy, open-ended generation
- code: writing, reviewing, debugging code
- research: multi-source information gathering
Respond with JSON: { "category": "...", "confidence": 0.0-1.0 }`,
      }, {
        role: "user",
        content: task.description,
      }],
    });
    return this.mapToRoute(JSON.parse(classification));
  }
}
```
The cost of a Haiku classification call is roughly 1/50th of an Opus generation call. Even if you classify every single task with an LLM, the savings from routing simple tasks to cheaper models more than pay for the classification.
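One practical note: the `JSON.parse` above will throw if the classifier ever wraps its JSON in prose, which small models occasionally do. A defensive parse with a conservative fallback is cheap insurance. A sketch, where the fallback category is an assumption:

```typescript
interface Classification {
  category: string;
  confidence: number;
}

// Parse the classifier's reply defensively: extract the first JSON object,
// validate the fields, and fall back to a conservative default on failure.
function parseClassification(raw: string): Classification {
  const fallback: Classification = { category: "analysis", confidence: 0.3 };
  const match = raw.match(/\{[\s\S]*\}/);
  if (!match) return fallback;
  try {
    const parsed = JSON.parse(match[0]);
    if (typeof parsed.category !== "string") return fallback;
    const confidence = typeof parsed.confidence === "number"
      ? Math.min(1, Math.max(0, parsed.confidence)) // clamp into [0, 1]
      : 0.5;
    return { category: parsed.category, confidence };
  } catch {
    return fallback;
  }
}
```

A low-confidence fallback means an unparseable reply routes to a mid-tier default rather than crashing the router.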
## The Cascade Pattern
Sometimes you don't know how hard a task is until you try it. The cascade pattern starts with the cheapest option and escalates.
```typescript
class CascadeRouter {
  private tiers: Tier[] = [
    { model: "haiku", maxAttempts: 1, timeout: 5000 },
    { model: "sonnet", maxAttempts: 1, timeout: 15000 },
    { model: "opus", maxAttempts: 1, timeout: 30000 },
  ];

  async execute(task: Task): Promise<TierResult> {
    for (const tier of this.tiers) {
      try {
        const result = await this.tryTier(task, tier);
        if (result.quality >= task.minimumQuality) {
          return result;
        }
        // Quality too low: fall through and escalate to the next tier.
      } catch (error) {
        if (error instanceof TimeoutError) continue;
        throw error;
      }
    }
    throw new EscalationExhaustedError("All tiers failed");
  }

  private async tryTier(task: Task, tier: Tier): Promise<TierResult> {
    const result = await withTimeout(
      agent.run(task, { model: tier.model }),
      tier.timeout
    );
    return {
      ...result,
      quality: await evaluateQuality(result, task),
      model: tier.model,
      tier,
    };
  }
}
```
The quality evaluation is the tricky part. For some tasks, you can check quality programmatically (did the code compile? did the test pass?). For others, you need a lightweight LLM call to assess. The cost of evaluation needs to be less than the savings from using a cheaper model.
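A sketch of what the mechanical, non-LLM side of `evaluateQuality` might look like. The specific checks here are illustrative, not prescriptive:

```typescript
interface TaskResult {
  output: string;
}

// Cheap mechanical checks run before any LLM-based evaluation.
// Each returns a score in [0, 1]; the overall score is their minimum,
// so any single hard failure sinks the result.
type QualityCheck = (result: TaskResult) => number;

const checks: QualityCheck[] = [
  // Empty or trivially short output is almost always a failure.
  r => (r.output.trim().length > 20 ? 1 : 0),
  // A refusal or apology usually means the model punted.
  r => (/\b(I can't|I cannot|as an AI)\b/i.test(r.output) ? 0 : 1),
];

function mechanicalQuality(result: TaskResult): number {
  return Math.min(...checks.map(check => check(result)));
}
```

Only results that clear these free checks need to pay for an LLM-based quality assessment, which keeps the evaluation cost below the routing savings.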
## Learning Routers
The best routers get better over time. Every routing decision is a learning opportunity.
```typescript
class LearningRouter extends TaskRouter {
  private outcomes: Map<string, Outcome[]> = new Map();

  async recordOutcome(
    task: Task,
    decision: RouteDecision,
    result: AgentResult
  ) {
    const key = `${decision.agent}:${decision.model}`;
    const outcomes = this.outcomes.get(key) || [];
    outcomes.push({
      taskType: await classify(task),
      success: result.success,
      quality: result.quality,
      cost: result.tokenUsage.totalCost,
      latency: result.durationMs,
    });
    this.outcomes.set(key, outcomes);
  }

  async route(task: Task): Promise<RouteDecision> {
    const baseDecision = await super.route(task);

    // Check whether historical data suggests a different route.
    const historical = this.analyzeOutcomes(task);
    if (historical && historical.confidence > baseDecision.confidence) {
      return historical;
    }
    return baseDecision;
  }
}
```

The [orchestration layer](/blog/multi-agent-orchestration-patterns) that consumes these decisions is covered in a separate post.
Over time, the router learns that certain task patterns succeed with cheaper models, or that certain agents consistently perform better on specific task types. This is the feedback loop that makes the system genuinely intelligent, not just rule-based.
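The snippet above leaves `analyzeOutcomes` undefined. One simple approach is to score each route by its historical success rate, shrunk toward a neutral prior so a route with two lucky wins doesn't beat one with two hundred solid ones. A sketch:

```typescript
interface Outcome {
  taskType: string;
  success: boolean;
  quality: number;
  cost: number;
  latency: number;
}

// Success rate with additive smoothing: with no data the score is the
// neutral prior (0.5), and it converges to the true rate as samples grow.
function routeScore(
  outcomes: Outcome[],
  priorSuccesses = 1,
  priorTotal = 2
): number {
  const successes = outcomes.filter(o => o.success).length;
  return (successes + priorSuccesses) / (outcomes.length + priorTotal);
}
```

With this scoring, `analyzeOutcomes` would compute `routeScore` per `agent:model` key for outcomes matching the task's type, and propose the highest-scoring route as its historical candidate.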
## Multi-Agent Routing
When you have specialized agents, routing becomes even more important.
```typescript
const agentRegistry = {
  coder: {
    specialties: ["code-generation", "debugging", "refactoring"],
    models: ["sonnet", "opus"],
    concurrency: 3,
  },
  researcher: {
    specialties: ["information-gathering", "summarization", "comparison"],
    models: ["sonnet"],
    concurrency: 5,
  },
  writer: {
    specialties: ["content-creation", "editing", "translation"],
    models: ["sonnet", "haiku"],
    concurrency: 5,
  },
};
```
The router considers not just what agent is best for the task, but which agents are available, what their current load is, and whether the task can be decomposed into sub-tasks for different agents.
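Capability matching against a registry like the one above can be as simple as counting specialty overlap. A sketch, where the `requiredSkills` input is assumed to come from the classifier:

```typescript
interface AgentInfo {
  specialties: string[];
  models: string[];
  concurrency: number;
}

const registry: Record<string, AgentInfo> = {
  coder: { specialties: ["code-generation", "debugging"], models: ["sonnet", "opus"], concurrency: 3 },
  writer: { specialties: ["content-creation", "editing"], models: ["sonnet", "haiku"], concurrency: 5 },
};

// Pick the registered agent whose specialties best cover the task's skills.
function selectAgent(
  agents: Record<string, AgentInfo>,
  requiredSkills: string[]
): string | undefined {
  let best: { name: string; overlap: number } | undefined;
  for (const [name, info] of Object.entries(agents)) {
    const overlap = requiredSkills.filter(s => info.specialties.includes(s)).length;
    if (overlap > 0 && (!best || overlap > best.overlap)) {
      best = { name, overlap };
    }
  }
  return best?.name; // undefined means "fall back to the general agent"
}
```

A production version would also weigh each agent's current load and queue depth, which the next section turns to.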
## Load Balancing and Fairness
Routing isn't just about capability matching. It's about resource management.
```typescript
class LoadAwareRouter extends TaskRouter {
  async route(task: Task): Promise<RouteDecision> {
    const candidates = await this.getCandidates(task);

    // Discount each candidate's confidence by its agent's current load.
    const scored = candidates.map(c => ({
      ...c,
      score: c.confidence * (1 - c.agent.currentLoad),
    }));

    // Pick the best available option.
    return scored.sort((a, b) => b.score - a.score)[0];
  }
}
```
If your best agent for a task is overloaded, send it to the second best. A slightly worse match that executes immediately beats the perfect match stuck in a queue. For a deeper look, see [load-aware routing](/blog/load-balancing-ai-agents).
## What Good Routing Gets You
A well-designed router gives you three things.
Cost efficiency: simple tasks go to cheap models. You stop paying Opus prices for Haiku work. Depending on your task distribution, this can cut costs by 40-60%.
Latency reduction: cheap models are fast models. Tasks that don't need deep reasoning get answered in milliseconds instead of seconds.
Resilience: when one model or agent is down, the router redirects. Your system degrades gracefully instead of falling over.
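The simplest form of that resilience is a fallback chain: walk an ordered list of models and return the first success. A sketch, where `run` stands in for whatever executes a task against a given model:

```typescript
// Try each model in order; return the first success.
// Only throw if every model in the chain fails.
async function withFallback<T>(
  models: string[],
  run: (model: string) => Promise<T>
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await run(model);
    } catch (error) {
      lastError = error; // model down or rate-limited: try the next one
    }
  }
  throw lastError;
}
```

Ordering the chain from preferred to least preferred means an outage degrades quality gradually instead of taking the system down.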
The router is the brain of your multi-agent system. Everything else is just muscles.