Stateful vs Stateless Agents: When Persistence Matters

The Memory Question

Every agent conversation starts with a blank slate. The LLM has no memory of previous interactions. No knowledge of what worked last time. No awareness of the user's preferences, past mistakes, or ongoing projects.

That's stateless. And for a lot of use cases, it's perfectly fine.

But then you build something more sophisticated. An agent that manages a project over weeks. An agent that learns from user corrections. An agent that coordinates with other agents across multiple sessions. Suddenly, that blank slate isn't simplicity. It's amnesia.

The choice between stateful and stateless isn't binary. It's a spectrum. And understanding where your agent falls on that spectrum determines half your architecture.

What State Actually Means

State in an agent context breaks down into four categories:

Conversation state: the current interaction. Messages exchanged, tools called, decisions made in this session. Every agent has this, even "stateless" ones. It lives in the context window.

Session state: information that persists within a single session but dies when the session ends. Working memory. Intermediate results. Scratch space. This connects directly to memory patterns.

User state: preferences, history, and context specific to one user. Persists across sessions. "This user prefers concise answers." "This user's project uses TypeScript."

World state: facts about the external environment. Database schemas, API endpoints, document contents. Shared across all users. Changes over time.

interface AgentState {
  conversation: {
    messages: Message[];
    toolCalls: ToolCall[];
    currentPlan: Plan | null;
  };
  session: {
    workingMemory: Map<string, unknown>;
    scratchpad: string[];
    toolResults: Map<string, ToolResult>;
  };
  user: {
    preferences: UserPreferences;
    history: InteractionSummary[];
    corrections: Correction[];
  };
  world: {
    knowledge: KnowledgeBase;
    lastUpdated: Map<string, number>;
  };
}

A "stateless" agent uses only conversation state. A "stateful" agent also uses some or all of the other three.

The Case for Stateless

Stateless agents are simpler. Dramatically simpler. And simplicity is a feature, not a limitation.

No database to manage. No state synchronization issues. No "what if the state is corrupted" edge cases. No cold start problem. Every request is independent. Scale horizontally by adding more instances. Replace any instance without coordination.

class StatelessAgent {
  async handle(request: AgentRequest): Promise<AgentResponse> {
    // Everything needed is in the request
    const context = buildContext(request.messages, request.systemPrompt);
    const response = await llm.generate(context);
    return { response, messages: [...request.messages, response] };
  }
}
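One consequence worth spelling out: with a stateless handler, the caller owns the transcript and sends it back on every turn. Here is a minimal, synchronous sketch of that round trip. The stubbed model (`fakeGenerate`) and the simplified `ChatMessage` and `handleTurn` names are assumptions for illustration, not part of any real SDK.

```typescript
// Hypothetical, simplified message shape for illustration.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Stand-in for a real model call: it only reports how many messages it was given.
function fakeGenerate(messages: ChatMessage[]): ChatMessage {
  return { role: "assistant", content: `I can see ${messages.length} message(s).` };
}

// Stateless handler: the output depends only on the transcript passed in.
function handleTurn(messages: ChatMessage[]): ChatMessage[] {
  const reply = fakeGenerate(messages);
  return [...messages, reply];
}

// The caller, not the agent, carries the transcript between turns.
let transcript: ChatMessage[] = [];
transcript = handleTurn([...transcript, { role: "user", content: "Rename this variable." }]);
transcript = handleTurn([...transcript, { role: "user", content: "Now add a test." }]);

// A fresh call without the transcript knows nothing about the earlier turns.
const amnesiac = handleTurn([{ role: "user", content: "What did I just ask?" }]);
```

Any instance can serve any turn because nothing lives on the server. The price is that the request grows with every exchange.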

For task-based agents that complete their work in a single interaction, stateless is the right call. Answer a question, transform some data, generate some code. Done. No persistence needed.

The moment you need to say "but remember when you..." to your agent, stateless breaks down.

The Case for Stateful

Stateful agents remember. That memory enables behaviors that stateless agents simply can't support.

Learning from corrections: User says "I told you last time, always use snake_case." A stateful agent stores that. A stateless agent hears it for the first time, every time.

Long-running tasks: A project that spans days or weeks can't fit in a single context window. State lets you compress and persist what matters.

Personalization: The difference between a generic assistant and one that feels like it knows you. That difference is state.

Coordination: Multiple agents working on the same project need shared state. What Agent A discovered, Agent B needs to know about. This connects directly to agent loops and state machines.
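To make the first of these concrete, here is a rough sketch of a per-user correction store. `CorrectionStore`, `StoredCorrection`, and the prompt wording are all hypothetical, and the in-memory `Map` stands in for whatever database you would actually use.

```typescript
// Hypothetical shape for a stored correction; the field names are assumptions.
interface StoredCorrection {
  rule: string;       // e.g. "Always use snake_case"
  recordedAt: number; // epoch milliseconds
}

// In-memory stand-in for a per-user persistent store (a database in practice).
class CorrectionStore {
  private byUser = new Map<string, StoredCorrection[]>();

  // Record a correction, skipping case-insensitive repeats of a stored rule.
  record(userId: string, rule: string, now: number = Date.now()): void {
    const existing = this.byUser.get(userId) ?? [];
    const normalized = rule.trim();
    if (existing.some(c => c.rule.toLowerCase() === normalized.toLowerCase())) {
      return;
    }
    this.byUser.set(userId, [...existing, { rule: normalized, recordedAt: now }]);
  }

  // Render stored corrections as a prompt section for the next session.
  toPromptSection(userId: string): string {
    const stored = this.byUser.get(userId) ?? [];
    if (stored.length === 0) return "";
    const lines = stored.map(c => `- ${c.rule}`);
    return ["Standing corrections from this user:", ...lines].join("\n");
  }
}

const corrections = new CorrectionStore();
corrections.record("user-42", "Always use snake_case");
corrections.record("user-42", "always use snake_case "); // repeat, ignored
corrections.record("user-42", "Prefer concise answers");
const promptSection = corrections.toPromptSection("user-42");
```

The next session starts with `promptSection` already in the system prompt, so the user never has to say "I told you last time".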

State Storage Patterns

There are three main patterns for storing agent state, each with different tradeoffs.

In-Context State

The simplest approach. Stuff everything into the prompt.

class InContextStateAgent {
  async handle(request: AgentRequest): Promise<AgentResponse> {
    const userState = await loadUserState(request.userId);
    const systemPrompt = `
      ${basePrompt}

      User preferences:
      ${JSON.stringify(userState.preferences)}

      Recent interaction summary:
      ${userState.history.slice(-5).map(h => h.summary).join("\n")}
    `;

    return llm.generate({
      system: systemPrompt,
      messages: request.messages,
    });
  }
}

Advantages: the LLM sees everything. No retrieval latency. Simple implementation.

Disadvantages: context windows are finite. Token costs increase with more state. You're paying to send the same information with every request.
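One way to keep that cost bounded is to give in-context state an explicit token budget. The sketch below assumes a crude four-characters-per-token estimate, which is a budgeting heuristic and not a real tokenizer. `estimateTokens` and `fitToBudget` are hypothetical names.

```typescript
// Rough heuristic, not a real tokenizer: about four characters per token.
// Use the model's own tokenizer when accurate counts matter.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest summaries that fit inside the budget, in chronological order.
function fitToBudget(summaries: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  // Walk newest-first so recent entries win when space runs out.
  for (const summary of [...summaries].reverse()) {
    const cost = estimateTokens(summary);
    if (used + cost > maxTokens) break;
    kept.unshift(summary);
    used += cost;
  }
  return kept;
}

const summaries = [
  "Set up the repository and CI.",
  "Migrated the API layer to TypeScript.",
  "Fixed flaky integration tests.",
];

// With a 20-token budget, the oldest summary no longer fits and is dropped.
const fitted = fitToBudget(summaries, 20);
```

This replaces the fixed `slice(-5)` with a limit expressed in the unit you are actually billed in.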

Retrieval-Based State

Store state externally. Retrieve what's relevant for the current interaction.

class RetrievalStateAgent {
  async handle(request: AgentRequest): Promise<AgentResponse> {
    const query = extractIntent(request.messages);

    // Retrieve relevant state
    const relevantHistory = await vectorStore.search(query, {
      namespace: `user:${request.userId}`,
      limit: 5,
    });

    const relevantKnowledge = await vectorStore.search(query, {
      namespace: "world",
      limit: 3,
    });

    const context = buildContext(
      request.messages,
      relevantHistory,
      relevantKnowledge,
    );

    return llm.generate(context);
  }
}

This is RAG applied to agent state. It scales better because you only load what's relevant. But it introduces a new failure mode: retrieval quality. If the retrieval misses something important, the agent acts as if it doesn't know it. The user says "I told you about this" and the agent has no idea.
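To see how a relevant memory can go missing, here is a toy in-memory search with a similarity threshold. The two-dimensional embeddings are made up, and `MemoryRecord` and `searchMemories` are hypothetical names. A real system would use an embedding model and a vector database.

```typescript
// Hypothetical stored memory: text plus a precomputed embedding vector.
interface MemoryRecord {
  text: string;
  embedding: number[];
}

// Cosine similarity; assumes both vectors have the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Return up to `limit` records scoring at or above `threshold`, best first.
function searchMemories(
  query: number[],
  records: MemoryRecord[],
  limit: number,
  threshold: number,
): MemoryRecord[] {
  return records
    .map(record => ({ record, score: cosineSimilarity(query, record.embedding) }))
    .filter(hit => hit.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map(hit => hit.record);
}

// Toy 2-D embeddings, invented for illustration.
const memories: MemoryRecord[] = [
  { text: "User prefers snake_case", embedding: [1, 0] },
  { text: "Project uses TypeScript", embedding: [0.6, 0.8] },
  { text: "User is in the Berlin office", embedding: [0, 1] },
];

const queryVector = [1, 0];
const strict = searchMemories(queryVector, memories, 5, 0.7); // drops the TypeScript note
const loose = searchMemories(queryVector, memories, 5, 0.5);  // keeps it
```

The TypeScript note scores about 0.6, so a 0.7 threshold silently drops it. Nothing errors. The agent just behaves as if it was never told.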

Hybrid State

The practical answer for most production agents. Keep critical state in-context. Retrieve supplementary state on demand.

class HybridStateAgent {
  async handle(request: AgentRequest): Promise<AgentResponse> {
    const user = await userStore.get(request.userId);

    // Always in context: core preferences, active tasks, recent corrections
    const coreState = {
      preferences: user.preferences,
      activeTasks: user.tasks.filter(t => t.status === "active"),
      recentCorrections: user.corrections.slice(-3),
    };

    // Retrieved on demand: historical context, knowledge base
    const query = extractIntent(request.messages);
    const supplementary = await vectorStore.search(query, {
      namespace: `user:${request.userId}`,
      limit: 5,
      threshold: 0.7,
    });

    return llm.generate({
      system: buildSystemPrompt(coreState),
      messages: [
        ...buildSupplementaryContext(supplementary),
        ...request.messages,
      ],
    });
  }
}
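The two prompt-building helpers carry most of the design decisions here, so here is one way they might look. The article does not define them, so everything below is an assumption: the `CoreState` shape, the section headings, and delivering retrieved notes as a single `user` message.

```typescript
// Hypothetical shapes; these types are assumptions for illustration.
interface CoreState {
  preferences: Record<string, string>;
  activeTasks: { title: string }[];
  recentCorrections: { rule: string }[];
}

interface ContextMessage {
  role: "user" | "assistant";
  content: string;
}

// Render a heading plus bullet list, or null when there is nothing to list.
function bulletSection(heading: string, items: string[]): string | null {
  if (items.length === 0) return null;
  return [heading, ...items.map(item => `- ${item}`)].join("\n");
}

// Core state always goes into the system prompt.
function buildSystemPrompt(core: CoreState, basePrompt: string): string {
  const sections = [
    basePrompt,
    bulletSection(
      "User preferences:",
      Object.keys(core.preferences).map(key => `${key}: ${core.preferences[key]}`),
    ),
    bulletSection("Active tasks:", core.activeTasks.map(task => task.title)),
    bulletSection(
      "Recent corrections (always follow these):",
      core.recentCorrections.map(c => c.rule),
    ),
  ];
  // Drop empty sections so the prompt carries no dangling headings.
  return sections.filter((s): s is string => s !== null).join("\n\n");
}

// Retrieved snippets become one leading message, or nothing if retrieval came back empty.
function buildSupplementaryContext(snippets: string[]): ContextMessage[] {
  const section = bulletSection("Possibly relevant notes from earlier sessions:", snippets);
  return section === null ? [] : [{ role: "user", content: section }];
}

const systemPrompt = buildSystemPrompt(
  {
    preferences: { style: "concise" },
    activeTasks: [{ title: "Migrate billing service" }],
    recentCorrections: [],
  },
  "You are a project assistant.",
);
const supplementary = buildSupplementaryContext(["Chose Postgres for the billing service"]);
```

Labeling retrieved notes as "possibly relevant" is deliberate. Core state is an instruction. Retrieved state is a hint the model is free to ignore.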

State Compression: The Context Window Problem

Context windows are large but not infinite. When state accumulates, you need to compress it. This connects directly to [event-driven patterns](/blog/event-driven-agent-architecture).

class StateCompressor {
  async compress(state: AgentState): Promise<CompressedState> {
    // Summarize old conversation turns, keeping the 10 most recent verbatim
    const summarized = await summarizeOldMessages(
      state.conversation.messages,
      { keepRecent: 10 },
    );

    // Merge duplicate knowledge entries
    const deduplicated = deduplicateKnowledge(state.world.knowledge);

    // Prune stale session data
    const pruned = pruneStaleEntries(
      state.session.workingMemory,
      { maxAge: 3_600_000 }, // 1 hour, in milliseconds
    );

    return {
      conversation: { ...state.conversation, messages: summarized },
      session: { ...state.session, workingMemory: pruned },
      user: state.user, // keep all user state
      world: { ...state.world, knowledge: deduplicated },
    };
  }
}

The trick is knowing what to keep and what to compress. User corrections are sacred. Never summarize those away. Working memory from three hours ago? Probably safe to prune. Conversation messages from ten turns back? Summarize them into a paragraph.
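The summarize and prune steps are easy to get subtly wrong, so here is a simplified, synchronous sketch of both. The names `summarizeOlderTurns` and `pruneOlderThan` are hypothetical, the summarizer is injected so a real version could call an LLM, and the sketch assumes working-memory entries carry an `updatedAt` timestamp.

```typescript
interface Turn {
  role: "system" | "user" | "assistant";
  content: string;
}

// Injected so a real implementation can call an LLM; here it is a plain function.
type Summarizer = (turns: Turn[]) => string;

// Replace everything except the last `keepRecent` turns with one summary turn.
function summarizeOlderTurns(turns: Turn[], keepRecent: number, summarize: Summarizer): Turn[] {
  if (turns.length <= keepRecent) return turns;
  const cut = turns.length - keepRecent;
  const older = turns.slice(0, cut);
  const recent = turns.slice(cut);
  const summary: Turn = {
    role: "system",
    content: `Summary of earlier conversation: ${summarize(older)}`,
  };
  return [summary, ...recent];
}

// Assumed entry shape: a value plus the time it was last written.
interface TimedEntry<T> {
  value: T;
  updatedAt: number; // epoch milliseconds
}

// Return a new map containing only entries written within the last `maxAge` ms.
function pruneOlderThan<T>(
  memory: Map<string, TimedEntry<T>>,
  maxAge: number,
  now: number = Date.now(),
): Map<string, TimedEntry<T>> {
  const fresh = new Map<string, TimedEntry<T>>();
  memory.forEach((entry, key) => {
    if (now - entry.updatedAt <= maxAge) fresh.set(key, entry);
  });
  return fresh;
}

const twelveTurns = Array.from({ length: 12 }, (_, i): Turn => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `turn ${i + 1}`,
}));
const compacted = summarizeOlderTurns(twelveTurns, 10, older => `${older.length} earlier turns`);

const nowMs = 100_000_000;
const workingMemory = new Map<string, TimedEntry<string>>([
  ["draft", { value: "fresh note", updatedAt: nowMs - 1_000 }],
  ["old-plan", { value: "stale note", updatedAt: nowMs - 3 * 3_600_000 }],
]);
const freshMemory = pruneOlderThan(workingMemory, 3_600_000, nowMs);
```

Both functions return new values instead of mutating their inputs, which makes it easy to keep the uncompressed state around until the compressed version has been saved.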

State Synchronization in Multi-Agent Systems

When multiple agents share state, you enter the world of distributed systems problems. And distributed systems problems are where engineer confidence goes to die.

class SharedAgentState {
  private version = 0;

  async update(agentId: string, changes: StateChange[]): Promise<void> {
    const currentVersion = await this.getVersion();

    if (currentVersion !== this.version) {
      // Another agent modified state since we last read it
      const conflicts = detectConflicts(changes, await this.getChangesSince(this.version));
      if (conflicts.length > 0) {
        throw new StateConflictError(conflicts);
      }
    }

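    // The version check above and this write must be atomic in the backing store
    // (for example a transaction or compare-and-set); otherwise two agents can
    // both pass the check and one silently overwrites the other.
    await this.applyChanges(changes);
    this.version = currentVersion + 1;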
  }
}

Optimistic concurrency works well for agent state because conflicts are rare. Agents typically work on different parts of the state. When they do conflict, you want to know about it rather than silently overwriting.
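For a self-contained view of the pattern, here is a minimal in-memory sketch. It is stricter than the class above: any version mismatch counts as a conflict, and the loser re-reads and retries instead of diffing individual changes. `VersionedStore` and `updateWithRetry` are hypothetical names.

```typescript
type WriteResult =
  | { ok: true; version: number }
  | { ok: false; currentVersion: number };

// In-memory stand-in for a shared store. A real store would perform the
// version check and the write as a single atomic operation.
class VersionedStore<T> {
  private value: T;
  private version = 0;

  constructor(initial: T) {
    this.value = initial;
  }

  read(): { value: T; version: number } {
    return { value: this.value, version: this.version };
  }

  // Compare-and-set: succeeds only if nobody has written since `expectedVersion`.
  write(next: T, expectedVersion: number): WriteResult {
    if (expectedVersion !== this.version) {
      return { ok: false, currentVersion: this.version };
    }
    this.value = next;
    this.version += 1;
    return { ok: true, version: this.version };
  }
}

// Read, apply a change, and retry on conflict up to `maxAttempts` times.
function updateWithRetry<T>(
  store: VersionedStore<T>,
  change: (current: T) => T,
  maxAttempts = 3,
): WriteResult {
  let result: WriteResult = { ok: false, currentVersion: store.read().version };
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = store.read();
    result = store.write(change(snapshot.value), snapshot.version);
    if (result.ok) return result;
  }
  return result;
}

const shared = new VersionedStore<string[]>([]);
const seenByA = shared.read();
const seenByB = shared.read();

// Agent A writes first and wins.
const writeA = shared.write([...seenByA.value, "A: schema mapped"], seenByA.version);
// Agent B is still holding version 0, so its write is rejected instead of clobbering A.
const writeB = shared.write([...seenByB.value, "B: tests drafted"], seenByB.version);
// B re-reads, reapplies its change on top of A's, and succeeds.
const retryB = updateWithRetry(shared, notes => [...notes, "B: tests drafted"]);
```

The rejected write is the whole point. Agent B finds out it was working from stale state, and Agent A's note survives.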

The Decision Framework

Here's how I decide between stateless and stateful for any given agent:

Go stateless when: tasks complete in a single interaction, no personalization is needed, horizontal scaling is a priority, or the state is small enough to pass in every request anyway.

Go stateful when: interactions span multiple sessions, user personalization matters, the agent learns from feedback, long-running tasks exceed the context window, or multiple agents need to coordinate.

Go hybrid when: most interactions are stateless but some users need persistence, you want the simplicity of stateless with the capability of stateful on demand.
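The checklist above can be written down as a small function, which is handy for design docs and reviews. This is a sketch of my own reading of it. The `AgentRequirements` flags are hypothetical, and the 50% cutoff between hybrid and stateful is an arbitrary assumption you should tune.

```typescript
// Hypothetical requirement flags distilled from the checklist above.
interface AgentRequirements {
  multiSession: boolean;
  personalization: boolean;
  learnsFromFeedback: boolean;
  exceedsContextWindow: boolean;
  multiAgent: boolean;
  // Fraction of users (0 to 1) expected to need persistence.
  shareOfUsersNeedingState: number;
}

type StateStrategy = "stateless" | "hybrid" | "stateful";

function chooseStateStrategy(req: AgentRequirements): StateStrategy {
  const needsState =
    req.multiSession ||
    req.personalization ||
    req.learnsFromFeedback ||
    req.exceedsContextWindow ||
    req.multiAgent;

  // Nothing on the checklist applies: stay stateless.
  if (!needsState) return "stateless";

  // Assumed cutoff: if only a minority of users need persistence, go hybrid.
  return req.shareOfUsersNeedingState < 0.5 ? "hybrid" : "stateful";
}

const oneShotCodegen = chooseStateStrategy({
  multiSession: false,
  personalization: false,
  learnsFromFeedback: false,
  exceedsContextWindow: false,
  multiAgent: false,
  shareOfUsersNeedingState: 0,
});

const projectManager = chooseStateStrategy({
  multiSession: true,
  personalization: true,
  learnsFromFeedback: true,
  exceedsContextWindow: true,
  multiAgent: false,
  shareOfUsersNeedingState: 0.9,
});
```

Writing the rule down forces you to state which requirement pushed you off stateless, which is the same information the migration will need later.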

The wrong choice isn't catastrophic. You can migrate from stateless to stateful later. But it's much harder to migrate from stateful to stateless because you have to answer "what about all the state we accumulated?" That question has no easy answer.

Start stateless. Add state when you hit the wall. The wall will tell you exactly what kind of state you need.