[Image: An abstract representation of AI cognitive patterns. A central glowing polygonal sphere is surrounded by four icons representing ReAct, Plan-and-Execute, Reflection, and Multi-Agent Collaboration, all on a dark background in a high-tech style.]

How Agents Think

Welcome back to The Agentic Shift. In our last post, “The Anatomy of an AI Agent,” we established that an agent is a system built around a model, defined by its ability to perceive (its senses), reason (its brain), and act (its hands). We settled on the analogy of a GPS navigator: a partner that doesn’t just show you a map, but actively senses traffic, thinks about the best route, and acts by giving you turn-by-turn directions to your goal.

That’s the “what.” Now, we’re diving into the “how.”

If the agent’s brain is a large language model, how does it actually think? Here’s the interesting part: there isn’t just one way. An agent’s cognitive process is shaped by its underlying architecture—a kind of mental operating system that dictates its approach to solving a problem. Some agents are like meticulous planners, charting out every step of a journey before leaving the house. Others are more like improvisational travelers, figuring out their path as they go.

These cognitive frameworks are more than just academic curiosities; they are the fundamental patterns that enable an agent to tackle complex, multi-step goals. Understanding them is the key to building and working with this new generation of AI.

A Quick Note on Prompts

Before we dive in, it’s important to remember one thing: at the heart of every agentic pattern is a series of carefully crafted prompts. The logic we’re about to explore isn’t baked into the models themselves; it’s orchestrated by the application code. Each time you see a call to the llm in the pseudocode below, imagine a formatted prompt being sent to the model. The “magic” of an agent is really the art of conversation—asking the right questions, with the right context, at the right time.

It’s also important to note that the prompts included in our examples are intentionally simplified for clarity. In a real-world application, these prompts would be much more detailed, often including specific instructions on tone, format, and constraints, as well as examples to guide the model’s behavior. The art of creating these sophisticated instructions is a deep topic known as prompt engineering, which we’ll explore in a future post.
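To make that gap concrete, here is a sketch of what a more fully specified tool-selection prompt might look like. The tool list, rules, and worked example are all illustrative, invented for this sketch rather than taken from any particular system:

```python
# Illustrative only: a more production-style version of the simple
# "choose an action" prompts used below. Real systems add tool
# descriptions, strict output formats, and worked examples.
DETAILED_ACTION_PROMPT = """You are a research assistant that answers questions step by step.

Available tools:
- search(query): look up current information on the web.
- finish(answer): return the final answer to the user.

Rules:
- Respond with exactly one tool call per turn, e.g. search("query here").
- Never invent facts; if unsure, search first.
- Keep queries short and specific.

Example:
Context: Goal: Who won the 2022 World Cup?
Action: search("2022 World Cup winner")

Context: {context}
Action:"""

def build_action_prompt(context: str) -> str:
    """Fill the template with the agent's running context."""
    return DETAILED_ACTION_PROMPT.format(context=context)
```

Compare this with the two-line prompts in the examples that follow: the structure of the conversation is the same, only the level of specification changes.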

With that in mind, let’s explore four of the most foundational patterns being used today.

The ReAct Pattern: The Improviser

Imagine you’re a detective arriving at a crime scene. You don’t know the full story. You start with a goal—solve the case—but you can’t plan your entire investigation from the start. Instead, you look for a clue (observation), think about what it means (reason), and then take an action based on that thought (e.g., interview a witness). This iterative, adaptive loop is the essence of the ReAct (Reason + Act) pattern.

First formalized in a groundbreaking 2022 paper from collaborators at Princeton and Google Research, “ReAct: Synergizing Reasoning and Acting in Language Models,” this pattern is built on a simple, powerful cycle:

  1. Thought (Reason): The agent examines its goal and the information it has, then formulates an internal monologue. “The user wants the last Super Bowl score. First, I need to know which Super Bowl was the most recent.”
  2. Action (Act): Based on its thought, the agent chooses a tool and executes an action, like search(“most recent Super Bowl”).
  3. Observation: The agent gets a result from its action—“Super Bowl LVIII was played on February 11, 2024”—and adds this new information to its context.

This cycle repeats, with each observation informing the next thought. The ReAct pattern is incredibly effective for tasks where the path forward is unknown or the environment is constantly changing. Its main strength is its ability to course-correct. Of course, without careful prompting, ReAct agents can sometimes get stuck in repetitive loops—a challenge we’ll explore when we discuss debugging and productionizing agents later in the series.

Here’s what that loop looks like in pseudocode:

thought_prompt = """
Based on the following context, what is the next thought to move closer to the goal?
Context: {context}
Thought:
"""

action_prompt = """
Based on the following context, what is the next action to take? Choose from [search, finish].
Context: {context}
Action:
"""

def react_loop(goal, max_iterations=10):
    context = f"Goal: {goal}"

    for i in range(max_iterations):
        thought = llm.prompt(thought_prompt.format(context=context))
        context += f"\nThought: {thought}"

        action_text = llm.prompt(action_prompt.format(context=context))
        action = parse_action(action_text) # e.g., search("Super Bowl LVIII score")
        context += f"\nAction: {action_text}"

        if action.tool == "finish":
            return action.argument # Final answer

        observation = execute_tool(action.tool, action.argument)
        context += f"\nObservation: {observation}"

    return None # Iteration budget exhausted without reaching the goal

answer = react_loop("What was the score of the last Super Bowl?")

The Plan-and-Execute Pattern: The Meticulous Planner

While ReAct is the improviser, some tasks demand an architect. If you’re building a house, you don’t just start laying bricks. You begin with a detailed blueprint. This is the core idea behind the Plan-and-Execute pattern.

With this pattern, the agent operates in two distinct phases:

  1. Planning: First, the agent analyzes the high-level goal and generates a complete, step-by-step plan. It doesn’t take any action; it only thinks. This is often where a more powerful, sophisticated model is used to create a robust strategy.
  2. Execution: Once the plan is finalized, the agent (or a simpler, more cost-effective model) executes each step in sequence.

This approach offers predictability and control. It’s ideal for tasks in stable environments where the workflow is well-understood, like an automated software deployment. The primary drawback is its rigidity. If an unexpected error occurs, the entire plan might be invalidated, forcing a complete restart. This approach has been explored in academic research, such as in the paper “Plan-and-Solve Prompting,” which demonstrates how upfront planning can improve the reasoning of large language models.

A pseudocode implementation would separate these two phases clearly:

plan_prompt = """
Given the following goal, create a step-by-step plan to achieve it.
Goal: {goal}
Plan:
"""

# Phase 1: Planning
goal = "Deploy the new feature-x branch to staging."
plan_text = llm.prompt(plan_prompt.format(goal=goal))
plan = parse_plan(plan_text) # Turns numbered list into a list of strings

# Optional: Human-in-the-loop for approval
if not user.approve_plan(plan):
    exit("Deployment cancelled by user.")

# Phase 2: Execution
for step in plan:
    result = execute_step(step)
    if result.is_error():
        handle_error(result)
        break # Halt execution on failure

The Reflection Pattern: The Self-Critic

Even the best plans can have flaws. A great writer doesn’t just write a first draft; they revise it. They read their own work, critique it, and make it better. What if an agent could do the same? That’s the idea behind the Reflection pattern. It gives an agent a mechanism for self-critique and iterative refinement.

The process is straightforward but powerful:

  1. Generation: The agent produces an initial output—a block of code, a paragraph of text, or a plan.
  2. Critique (Reflection): The agent examines its own work, often guided by an external signal (like a failed unit test) or an internal set of principles. It generates feedback for itself.
  3. Refinement: The agent takes this feedback and generates a new, improved output.

This loop can be repeated until the output is satisfactory. The “Self-Refine” paper provides a formal framework for this, showing how iterative self-feedback can significantly improve performance. This ability to self-correct is powerful, but not foolproof: an agent can struggle to see its own blind spots, a failure mode we’ll learn to mitigate later in this series.

Here’s how a reflection loop for code generation might look in pseudocode:

generation_prompt = "Write a Python function to {goal}."
reflection_prompt = """
The following code was generated to '{goal}'.
It failed with this error: {error_message}.
Please analyze the code and explain the bug.
Code: {code}
Reflection:
"""
refinement_prompt = """
Goal: '{goal}'.
The previous attempt failed. Here is the code:
{code}
And a reflection on the bug:
{reflection}
Please generate a corrected version of the code.
Corrected Code:
"""

def reflect_and_refine(goal, max_reflections=5):
    code = llm.prompt(generation_prompt.format(goal=goal))

    for i in range(max_reflections):
        test_result, error_message = execute_unit_test(code)

        if test_result.is_pass():
            return code # Success

        reflection = llm.prompt(reflection_prompt.format(goal=goal, error_message=error_message, code=code))
        code = llm.prompt(refinement_prompt.format(goal=goal, code=code, reflection=reflection))

    return code # Best attempt after exhausting the reflection budget

code = reflect_and_refine("calculate the average of a list of numbers")

Multi-Agent Collaboration: The Team of Specialists

So far, we’ve talked about single agents. But what about problems that are too big for one mind to handle alone? You’d assemble a team. The Multi-Agent Collaboration pattern does just that, creating a crew of specialized agents that work together.

This pattern typically involves a structure much like a film crew’s:

  • The Orchestrator (The Director): This agent receives the main goal, breaks it down into smaller sub-tasks, and delegates them to the appropriate specialists.
  • Expert Agents (The Crew): These are agents designed for a specific function, like a Researcher or a Writer. Each has its own persona and a curated set of tools.

Frameworks like AutoGen from Microsoft and CrewAI are designed to facilitate this kind of collaborative workflow. As explored in surveys like “Demystifying and Advancing Collaborative AI,” this approach mirrors how human expert teams function. It’s powerful, but it introduces orchestration overhead. Miscommunication between agents can lead to cascading failures, a topic we’ll cover when we discuss building production-ready systems.

The pseudocode for this pattern looks like a director assigning tasks on a film set:

# Each agent is initialized with a system prompt that defines its expertise.
researcher_prompt = "You are an expert researcher. Use your search tools to find relevant information."
researcher = Agent(system_prompt=researcher_prompt, tools=[web_search])

writer_prompt = "You are an expert writer. Turn the provided data into a well-structured blog post."
writer = Agent(system_prompt=writer_prompt) # No tools needed for this agent

editor_prompt = "You are an expert editor. Review the text for clarity, grammar, and accuracy."
editor = Agent(system_prompt=editor_prompt)

# The Orchestrator manages the workflow
class Orchestrator:
    def run(self, goal):
        # Sub-tasks are hardcoded here for clarity; a real orchestrator
        # would derive them from the goal with an LLM planning call.
        research_task = "Gather performance data for Llama 3 vs. GPT-4 on coding benchmarks."
        research_output = researcher.run(research_task)

        writing_task = f"Draft a blog post using this data: {research_output}"
        draft_post = writer.run(writing_task)

        editing_task = f"Review and polish this draft: {draft_post}"
        final_post = editor.run(editing_task)
        
        return final_post

# Kick off the process
goal = "Write a blog post comparing Llama 3 and GPT-4 on coding benchmarks."
orchestrator = Orchestrator()
result = orchestrator.run(goal)

Choosing the Right Pattern: A Quick Guide

Each pattern offers a different cognitive strategy, and the right choice depends entirely on the task. There’s a fundamental trade-off between adaptability and predictability. ReAct excels at exploration in unknown environments, while Plan-and-Execute provides reliability for known procedures. Here’s a simple guide to help you choose:

ReAct
  • Core idea: Interleave reasoning, tool use, and observation in a tight, iterative loop.
  • Best for: Exploratory tasks in dynamic environments, such as web navigation, interactive Q&A, or debugging a novel issue.
  • Key limitation: Can be inefficient for predictable tasks; may get stuck in loops if not guided well.
  • Practical considerations: High cost and latency (many LLM calls).

Plan-and-Execute
  • Core idea: Create a complete plan upfront, then execute it step-by-step.
  • Best for: Predictable, multi-step procedures, such as software builds, data processing pipelines, or following a recipe.
  • Key limitation: Brittle and inflexible; an early failure can invalidate the entire plan.
  • Practical considerations: Lower cost and latency (often fewer LLM calls).

Reflection
  • Core idea: Critique and iteratively refine the agent’s own output to improve quality.
  • Best for: Tasks where the first draft isn’t enough, such as code generation, creative writing, or complex reasoning.
  • Key limitation: Can suffer from self-bias; an agent can’t easily spot its own blind spots without an external signal.
  • Practical considerations: Variable cost and latency (depends on the number of refinement loops).

Multi-Agent
  • Core idea: Decompose a complex goal into roles for specialized agents to collaborate on.
  • Best for: Complex, multifaceted projects, such as writing a research report, financial analysis, or large-scale software development.
  • Key limitation: Adds significant orchestration overhead; success depends on clear communication protocols.
  • Practical considerations: Very high cost and latency (multiple agents making calls).

Beyond the Choice: Composing Patterns

The comparison above presents the patterns as a choice, but the most sophisticated agentic systems don’t just pick one. They compose them, creating a hierarchy of intelligence. This is where the true power of these architectures begins to emerge.

Imagine an orchestrator agent tasked with a complex goal, like “Write a complete market analysis report for our new product.” It might use a Plan-and-Execute pattern to create a high-level blueprint:

  1. Gather competitor data.
  2. Analyze market sentiment.
  3. Draft the report.
  4. Create visualizations.
  5. Finalize and edit the report.

This plan is predictable and structured. But the first step, “Gather competitor data,” is messy and unpredictable. For this specific task, the orchestrator might delegate the work to a subordinate ReAct agent, an “improviser” that is perfectly suited for navigating the web, dealing with unexpected website layouts, and finding information through exploration. In this way, the system gets the best of both worlds: the reliability of a high-level plan and the adaptability of an exploratory sub-agent.
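The composition described above can be sketched in a few lines. Here the LLM calls are replaced by deterministic stubs so the control flow is visible; all of the names (`plan_with_llm`, `ReActSubAgent`, `run_orchestrator`) are invented for this sketch, not a real library API:

```python
# A minimal, runnable sketch of composing patterns: a Plan-and-Execute
# orchestrator delegates one messy step to a ReAct-style sub-agent.

def plan_with_llm(goal):
    # Stub for an LLM planning call; a real system would prompt the model.
    return [
        "gather competitor data",   # messy step -> delegate to ReAct
        "analyze market sentiment",
        "draft the report",
    ]

class ReActSubAgent:
    """Improviser: loops thought -> action -> observation until done."""
    def __init__(self, tools, max_iterations=5):
        self.tools = tools
        self.max_iterations = max_iterations

    def run(self, task):
        context = f"Goal: {task}"
        for _ in range(self.max_iterations):
            # Stubbed thought/action step: a real agent would prompt an LLM
            # to choose a tool based on the accumulated context.
            observation = self.tools["search"](task)
            context += f"\nObservation: {observation}"
            if observation:                 # found what we need -> finish
                return observation
        return "no data found"

def run_orchestrator(goal):
    plan = plan_with_llm(goal)              # Phase 1: plan upfront
    researcher = ReActSubAgent(tools={"search": lambda q: f"data for: {q}"})
    results = []
    for step in plan:                       # Phase 2: execute in order
        if "gather" in step:                # unpredictable step:
            results.append(researcher.run(step))  # delegate to the improviser
        else:                               # predictable step:
            results.append(f"done: {step}")
    return results

print(run_orchestrator("Write a market analysis report"))
```

The point is the shape, not the stubs: the outer loop stays rigid and auditable, while the inner loop is free to explore.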

The Human in the Loop: Our Role in the Age of Agents

While we’ve focused on how agents think, it’s crucial to remember that these patterns are not designed to operate in a vacuum. The goal is not to replace human oversight, but to elevate it. A key principle in building robust and responsible agents is ensuring there is always a human in the loop.

This partnership can take many forms, depending on the pattern:

  • In Plan-and-Execute, a human can review and approve the plan before any irreversible actions are taken, as shown in our pseudocode.
  • In a Reflection loop, a human can provide the external feedback, acting as a coach who points out subtle flaws the agent might miss on its own.
  • For a ReAct agent that gets stuck, a human can offer a hint or a new direction to get it back on track.
  • In a Multi-Agent system, a human can act as the ultimate orchestrator, resolving conflicts between agents or providing the strategic direction that guides the entire team.

Building these points of collaboration into an agent’s design transforms it from an autonomous black box into a transparent and steerable partner. This human-centric approach is not just a safety feature; it’s what will make these systems truly powerful.
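One such collaboration point, an approval gate for irreversible actions, can be sketched in a few lines. The `approve` callback here is a hypothetical stand-in for wherever the human actually enters the loop (a CLI prompt, a review queue in a web app):

```python
# A minimal sketch of a human-in-the-loop gate: any action flagged as
# irreversible must be confirmed by a human before it runs.

IRREVERSIBLE = {"deploy", "delete", "send_email"}

def guarded_execute(action, argument, approve):
    """Run the action, but ask the human first if it can't be undone."""
    if action in IRREVERSIBLE and not approve(action, argument):
        return f"skipped: {action} (human declined)"
    return f"executed: {action}({argument})"

# Usage: the approve callback is the point where a human can intervene.
auto_yes = lambda action, arg: True
auto_no = lambda action, arg: False
print(guarded_execute("search", "Super Bowl score", auto_no))  # safe: runs
print(guarded_execute("deploy", "feature-x", auto_no))         # gated: skipped
```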

Beyond the Foundations: A Glimpse of What’s Next

While these four patterns are the bedrock of modern agentic systems, the field is moving at a breathtaking pace. Researchers are already developing more sophisticated reasoning structures that build on these ideas.

One of the most exciting is Language Agent Tree Search (LATS). A standard ReAct agent follows a single, intuitive path. If it makes a wrong turn, it has to backtrack. LATS, inspired by classic search algorithms, allows an agent to explore multiple reasoning paths at once, like branches of a tree. It can evaluate different potential action sequences, discard unpromising ones, and pursue the path that seems most likely to lead to success. As detailed in the paper “Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models,” this makes agents more robust and capable of solving complex problems where a simple greedy approach might fail. This move from “single-path” to “multi-path” reasoning is a crucial step toward building more deliberative and strategic agents.
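To see the difference in shape, here is a toy illustration of multi-path reasoning in the spirit of LATS (not the actual algorithm, which uses Monte Carlo tree search with LLM-generated evaluations). The `propose_steps` and `score_path` functions are deterministic stubs standing in for model calls:

```python
# Toy multi-path reasoning: instead of committing to one next step,
# expand several candidates, score them, and keep the best branches.

def propose_steps(path):
    # Stub: a real agent would ask the model for candidate next actions.
    return [path + [c] for c in ("a", "b", "c")]

def score_path(path):
    # Stub value function: a real agent would have the model (or an
    # external signal) estimate how promising each branch is.
    return path.count("b")

def tree_search(depth=3, beam_width=2):
    frontier = [[]]                              # start from the empty path
    for _ in range(depth):
        candidates = []
        for path in frontier:
            candidates.extend(propose_steps(path))   # branch out
        candidates.sort(key=score_path, reverse=True)
        frontier = candidates[:beam_width]           # prune weak branches
    return frontier[0]                               # most promising path

print(tree_search())  # -> ['b', 'b', 'b']
```

A single-path ReAct agent would commit to whichever step it generated first; the search version discards unpromising branches before they waste any more of the budget.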

From Code to Conversation: The Next Abstraction

For those of us with a background in software engineering, these patterns might feel familiar in a surprising way. The history of programming is a story of ever-increasing abstraction. We moved from the raw bits of machine code to the symbolic representation of assembly. Then came procedural languages like C, which let us think in functions. Object-oriented languages like Java and C++ allowed us to model the world in classes. More recently, scripting languages like Python and JavaScript made development even more dynamic.

At each step, we’ve moved further away from telling the machine how to do something and closer to simply stating what we want to achieve.

Agentic patterns are the next logical step in this evolution.

When we use these patterns, we are engaging in a form of meta-programming. The “code” we write is no longer a precise sequence of commands but a set of goals, constraints, and tools expressed in natural language. The loops and logic in the pseudocode examples are the new “interpreters,” orchestrating the model’s reasoning to achieve a high-level objective. We are, in essence, programming with intent. It’s not a stretch to imagine a future where programming languages evolve to natively incorporate these concepts, allowing developers to define goals and delegate tasks using a grammar that blends traditional code with structured natural language.

Conclusion: A Pattern for Every Problem, and a Role for Everyone

We’ve journeyed through the cognitive architecture of AI agents, moving beyond the simple “what” to the complex “how.” We’ve seen that an agent’s “thinking” isn’t monolithic; it’s a choice between foundational patterns. From the adaptive improvisation of ReAct to the structured reliability of Plan-and-Execute, the self-correcting loop of Reflection, and the collaborative power of Multi-Agent systems, these patterns form a toolkit for building intelligence.

Choosing the right pattern is a critical design decision—a trade-off between adaptability and predictability, speed and cost. But the most sophisticated systems won’t just choose one; they will compose them, creating hierarchies of intelligence that leverage the strengths of each. And in the most effective systems, there will always be a role for the most intelligent component of all: the human in the loop. This isn’t a future where we are sidelined; it’s one where our role evolves from direct implementer to strategic collaborator—the coach, the reviewer, and the guide who provides the crucial oversight that turns a powerful tool into a trusted partner.

Perhaps the most profound realization is that in designing these systems, we are participating in the next great abstraction in software development. We are moving from writing explicit code to orchestrating intent, sculpting behavior through conversation and structured prompts. And this field is not standing still. The evolution from the single-path reasoning of ReAct to the multi-path exploration of emerging patterns like LATS shows a clear trajectory toward more robust, deliberative AI.

This brings our exploration of the agent’s brain to a close. We now have a blueprint for how an agent thinks. But a brain without memory is fleeting. To learn, adapt, and build upon its experiences, an agent needs to remember. In our next post, we’ll dive into the crucial component that makes this possible: Part 3: The Agent’s Memory. The foundation is set, and the truly exciting part is just beginning.
