Planning Agents: Goal Decomposition and Task Graphs
By Diesel
ai-agents · planning · task-decomposition
"Build me a customer dashboard" is not an executable instruction. It's a goal. Between the goal and the working code, there are dozens of decisions, hundreds of steps, and at least three moments where the entire approach needs to change because something didn't work as expected.
Planning agents handle this decomposition. They take a high-level objective, break it into a graph of tasks with dependencies, execute those tasks (often in parallel), adapt the plan when reality intervenes, and converge on a result.
Chain-of-thought reasoning gets you partway there. But for anything beyond toy problems, you need real planning. Here's what that looks like.
## Why Chain-of-Thought Isn't Enough
Chain-of-thought (CoT) prompting tells the model to "think step by step." And it works, sort of. The model generates a linear sequence of reasoning steps before arriving at an answer.
The problems start when the task isn't linear.
**CoT produces sequential plans.** Real work has parallelizable branches. While the frontend scaffolding is being built, the API schema can be designed simultaneously. A linear chain can't express this. The related post on [multi-agent orchestration](/blog/multi-agent-orchestration-patterns) goes further on this point.
**CoT doesn't handle dependencies.** Step 5 might depend on steps 2 and 3 but not step 4. CoT just lists everything in order and hopes for the best. If step 3 fails, there's no mechanism to figure out which downstream steps are affected.
**CoT doesn't adapt.** A CoT plan is generated once and executed. If step 3 produces unexpected results that invalidate steps 4 through 8, the model doesn't naturally go back and replan. It keeps going, or it starts over from scratch.
**CoT loses coherence at scale.** For a 5-step task, CoT is fine. For a 30-step task, the model's reasoning degrades. It forgets earlier steps. It contradicts itself. The plan becomes internally inconsistent.
Planning agents solve these by representing the work as a graph, not a chain.
## Task Graphs: The Core Abstraction
A task graph represents work as nodes (tasks) and edges (dependencies). Each task has a clear input, a clear output, and a defined relationship to other tasks.
```
Goal: Build customer dashboard
├── Design API schema (no deps)
├── Set up database tables (depends on: API schema)
├── Build API endpoints (depends on: database tables)
├── Design UI wireframes (no deps, parallel with API work)
├── Implement dashboard components (depends on: UI wireframes, API endpoints)
├── Write integration tests (depends on: dashboard components)
└── Deploy to staging (depends on: integration tests)
```
Tasks without dependencies can run in parallel. Tasks with unmet dependencies wait. When a task completes, its dependents become eligible for execution.
This is project management 101, applied to AI agents. And it works remarkably well because LLMs are actually quite good at decomposing goals into structured task lists. They've read enough project plans, technical specs, and how-to guides to understand how work breaks down.
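The dashboard graph above can be sketched as a plain dependency mapping. A minimal sketch in Python (task names and the `ready_tasks` helper are illustrative, not a specific framework):

```python
# A task graph as a mapping from each task to the set of tasks it depends on.
# Mirrors the dashboard example above.
graph = {
    "design_api_schema": set(),
    "set_up_database": {"design_api_schema"},
    "build_api_endpoints": {"set_up_database"},
    "design_ui_wireframes": set(),
    "implement_dashboard": {"design_ui_wireframes", "build_api_endpoints"},
    "write_integration_tests": {"implement_dashboard"},
    "deploy_to_staging": {"write_integration_tests"},
}

def ready_tasks(graph: dict[str, set[str]], done: set[str]) -> set[str]:
    """Tasks whose dependencies are all complete and that aren't done yet."""
    return {t for t, deps in graph.items() if t not in done and deps <= done}

# Initially, the two independent tasks are eligible and can run in parallel.
print(sorted(ready_tasks(graph, done=set())))
# → ['design_api_schema', 'design_ui_wireframes']

# Once the API-side chain and the wireframes finish, the dashboard unblocks.
print(sorted(ready_tasks(graph, done={"design_api_schema", "set_up_database",
                                      "build_api_endpoints",
                                      "design_ui_wireframes"})))
# → ['implement_dashboard']
```

Each time a task completes, recomputing `ready_tasks` yields the next wave of eligible work.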
## Decomposition Strategies
### Top-Down Recursive Decomposition
Start with the goal. Break it into 3-7 sub-goals. For each sub-goal, break it into 3-7 sub-tasks. Continue until each leaf node is directly executable (a single tool call or a simple action).
This is the most natural approach and the one LLMs handle best. It mirrors how humans plan: start broad, get specific.
The risk is over-decomposition. An agent that breaks "write a function" into 15 sub-tasks is wasting time planning work that should just be done. Set a depth limit and let leaf nodes be "chunky" enough to be meaningful.
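One way to sketch the recursion, with the LLM call stubbed out as a `decompose` callable and a depth limit guarding against over-decomposition (all names here are illustrative, not a real framework API):

```python
from typing import Callable

def plan(goal: str,
         decompose: Callable[[str], list[str]],
         is_executable: Callable[[str], bool],
         depth: int = 0, max_depth: int = 3) -> dict:
    """Recursively break a goal into sub-tasks until leaves are executable.

    `decompose` stands in for an LLM call returning 3-7 sub-goals;
    `is_executable` decides when a node is "chunky" enough to stop.
    """
    if is_executable(goal) or depth >= max_depth:
        return {"task": goal, "subtasks": []}  # leaf: do it, don't plan it
    return {"task": goal,
            "subtasks": [plan(g, decompose, is_executable, depth + 1, max_depth)
                         for g in decompose(goal)]}

# Toy stand-in: a canned decomposition table instead of a model call.
table = {"build dashboard": ["design schema", "build UI"],
         "build UI": ["wireframe", "implement components"]}
tree = plan("build dashboard",
            decompose=lambda g: table.get(g, []),
            is_executable=lambda g: g not in table)
# tree nests "build UI" under "build dashboard", with leaf sub-tasks below it
```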
### Template-Based Decomposition
For common task types, use predefined templates. "Build an API endpoint" always involves: define the route, implement the handler, add validation, write tests, update documentation. The agent fills in the specifics for the current case.
This is faster and more reliable than generating plans from scratch every time. The agent still adapts the template to the specific situation, but it starts from a proven structure rather than a blank page.
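A template can be as simple as a list of parameterized steps that the agent fills in per case. A sketch (the template text and helper are hypothetical):

```python
# A reusable template for "build an API endpoint": the structure is fixed,
# only the specifics change from case to case.
API_ENDPOINT_TEMPLATE = [
    "Define the {method} {route} route",
    "Implement the handler for {route}",
    "Add request validation for {route}",
    "Write tests for {method} {route}",
    "Update API documentation for {route}",
]

def instantiate(template: list[str], **specifics: str) -> list[str]:
    """Fill a task template with the details of the current case."""
    return [step.format(**specifics) for step in template]

tasks = instantiate(API_ENDPOINT_TEMPLATE, method="GET", route="/customers")
# tasks[0] == "Define the GET /customers route"
```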
### Iterative Decomposition
Don't plan everything upfront. Plan the next 2-3 steps in detail, execute them, then plan the next 2-3 based on what you learned.
This is the planning equivalent of the ReAct pattern. It handles uncertainty well because later steps are planned with the benefit of earlier results. The tradeoff is that you lose the ability to estimate total effort upfront, and parallel execution is limited to the current planning horizon.
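The loop can be sketched as: plan a short horizon, execute it, then replan with the results in hand. A minimal version with the LLM planner stubbed out (names and the round limit are illustrative):

```python
from typing import Callable

def iterative_plan_execute(goal: str,
                           plan_next: Callable[[str, list[str]], list[str]],
                           execute: Callable[[str], str],
                           horizon: int = 3, max_rounds: int = 10) -> list[str]:
    """Plan a few steps, execute them, replan from what was learned.

    `plan_next` stands in for an LLM call that sees the goal plus results
    so far and proposes the next steps (an empty list means done).
    """
    results: list[str] = []
    for _ in range(max_rounds):
        steps = plan_next(goal, results)[:horizon]  # only plan the near horizon
        if not steps:
            break  # planner decided the goal is satisfied
        results.extend(execute(step) for step in steps)
    return results

# Toy demo: the "planner" just returns whatever steps remain.
remaining = ["a", "b", "c", "d", "e"]
out = iterative_plan_execute("demo goal",
                             plan_next=lambda goal, results: remaining[len(results):],
                             execute=str.upper, horizon=2)
# out == ["A", "B", "C", "D", "E"], executed in rounds of at most two steps
```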
## The Replanning Problem
No plan survives contact with the code.
The database schema you planned doesn't support a requirement you discovered mid-implementation. The API you planned to call doesn't exist. The library you planned to use has a breaking change in the version that's installed.
Good planning agents detect when the plan is broken and replan. Bad planning agents barrel forward and produce garbage.
### Signals That Trigger Replanning
**Task failure.** A task fails and can't be retried. The plan needs to route around the failure. Maybe there's an alternative approach. Maybe a dependency needs to change.
**New information.** The agent discovers something during execution that changes the picture. The codebase uses a different framework than expected. The requirements document has a section the agent didn't read initially. The stakeholder clarified something that changes the scope.
**Constraint violation.** The plan would exceed the time budget. Or the token budget. Or the complexity budget. Replan with tighter constraints.
### How to Replan Without Starting Over
The worst approach is to throw away the entire plan and start from scratch. Everything already completed gets forgotten. Everything in progress gets abandoned.
Better: identify the affected subgraph. Which tasks are invalidated? Which are still valid? Keep the valid ones, replan only the broken branch. This is why task graphs matter. They make partial replanning possible.
```
Original plan:
A -> B -> C -> D -> E
B fails. C depends directly on B; D and E sit further downstream.
Replan: replace B with B' (alternative approach), rework C as C'
A -> B' -> C' -> D -> E
A is already done. D and E keep their structure. Only B and C needed replanning.
```
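Identifying the affected subgraph is a graph traversal: start at the failed task and walk downstream through its dependents. Everything the walk reaches is a candidate for review; everything else stays valid. A sketch (downstream tasks like D and E may keep their structure, but they belong in the review set because they can't start until the upstream fix lands):

```python
from collections import deque

def affected_subgraph(deps: dict[str, set[str]], failed: str) -> set[str]:
    """Return the failed task plus everything downstream of it.

    `deps` maps each task to the tasks it depends on. Tasks outside the
    returned set are untouched by the failure and need no replanning.
    """
    # Invert to a dependents map: who consumes each task's output?
    dependents: dict[str, set[str]] = {t: set() for t in deps}
    for task, requires in deps.items():
        for r in requires:
            dependents[r].add(task)
    # Walk downstream from the failure.
    affected, queue = {failed}, deque([failed])
    while queue:
        for d in dependents[queue.popleft()]:
            if d not in affected:
                affected.add(d)
                queue.append(d)
    return affected

chain = {"A": set(), "B": {"A"}, "C": {"B"}, "D": {"C"}, "E": {"D"}}
print(sorted(affected_subgraph(chain, "B")))  # → ['B', 'C', 'D', 'E']
```

A, already complete, stays out of the affected set; the replanner only reconsiders the downstream branch.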
## Parallel Execution and Dependencies
One of the biggest advantages of task graphs over linear plans is parallel execution. Tasks without mutual dependencies can run simultaneously.
In a multi-agent system, this means assigning independent tasks to different agents. One agent builds the API while another builds the UI. One agent writes tests while another writes documentation. The coordination overhead is real, but the speedup can be significant for large tasks.
### Dependency Types
**Hard dependencies.** Task B literally cannot start until Task A produces its output. The API can't be implemented until the schema is defined. For a deeper look, see [the ReAct pattern](/blog/react-pattern-agents).
**Soft dependencies.** Task B would benefit from Task A's output but can start without it. The UI can be built with mock data before the real API is ready.
**Resource dependencies.** Tasks A and B don't depend on each other logically but compete for the same resource (a single agent, a rate-limited API, a shared file). They need to be serialized even though they're logically independent.
Getting dependency types right determines how much parallelism you can extract.
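Hard and resource dependencies can both be expressed in a small scheduler sketch: hard dependencies gate when a task becomes ready, while a semaphore stands in for a resource dependency (a limited agent pool or a rate-limited API). This is a simplified wave-based scheduler, not a production implementation; soft dependencies are left out:

```python
import asyncio

async def run_graph(deps: dict[str, set[str]], run, max_workers: int = 2):
    """Run independent tasks concurrently, respecting both dependency types."""
    done: set[str] = set()
    pool = asyncio.Semaphore(max_workers)  # resource dependency: shared pool
    order: list[str] = []

    async def worker(task: str):
        async with pool:
            await run(task)
        done.add(task)
        order.append(task)

    pending = set(deps)
    while pending:
        # Hard dependencies: only tasks whose inputs exist become ready.
        ready = {t for t in pending if deps[t] <= done}
        if not ready:
            raise ValueError("cycle or unmet dependency in graph")
        pending -= ready
        await asyncio.gather(*(worker(t) for t in ready))
    return order

# Schema and UI are independent and run in the same wave;
# db and api are serialized behind their hard dependencies.
graph = {"schema": set(), "ui": set(), "db": {"schema"}, "api": {"db"}}
order = asyncio.run(run_graph(graph, lambda t: asyncio.sleep(0)))
```

With `max_workers=1` the same graph degenerates to fully serial execution, which is exactly what a resource dependency on a single agent looks like.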
## Practical Implementation Tips
**Keep tasks atomic.** Each task should produce a verifiable output. "Research authentication options" is not atomic because you can't tell when it's done. "Produce a comparison table of three authentication approaches with pros, cons, and recommendation" is atomic.
**Name tasks with verbs.** "Database schema" is a noun. What about it? "Design database schema for customer orders" is an action with a clear scope.
**Include acceptance criteria.** How does the agent know a task is done? "API endpoint returns correct data for test cases X, Y, Z." Without criteria, agents either declare victory too early or iterate forever.
**Set iteration limits per task.** A task that hasn't completed after 5 attempts probably needs human input, not a 6th attempt. For a deeper look, see [supervisor agents](/blog/supervisor-pattern-agent-managers).
**Log the plan evolution.** Store every version of the plan. When something goes wrong, the plan history tells you when and why the approach changed. This is the agent equivalent of git history, and it's just as valuable.
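The tips above suggest a shape for a task record: a verb-named task with explicit acceptance criteria and a per-task attempt limit. A hedged sketch (field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """A task as the tips describe it: verb-named, atomic, with criteria."""
    name: str                        # verb phrase with a clear scope
    acceptance_criteria: list[str]   # how the agent knows it's done
    max_attempts: int = 5            # escalate to a human after this
    attempts: int = 0
    done: bool = False

    def record_attempt(self, passed: bool) -> str:
        """Update state after an attempt; return the next action."""
        self.attempts += 1
        if passed:
            self.done = True
            return "done"
        if self.attempts >= self.max_attempts:
            return "escalate"        # needs human input, not a 6th attempt
        return "retry"

task = Task(name="Design database schema for customer orders",
            acceptance_criteria=["orders table supports line items",
                                 "migration runs cleanly on a fresh database"])
```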
## The Honest Tradeoff
Planning adds overhead. Decomposition takes time and tokens. Dependency tracking takes infrastructure. Replanning adds complexity. For simple tasks, the overhead isn't worth it.
The break-even point, in my experience, is around 5-7 steps. Below that, just let the agent work through it linearly with ReAct. Above that, invest in proper planning. The upfront cost pays for itself in reduced backtracking, better parallelism, and graceful failure handling.
Planning isn't about predicting the future perfectly. It's about having a structure that adapts when the future surprises you. And in agent systems, the future always surprises you.