Welcome back to The Agentic Shift, our tour through the new era of AI. We’ve covered a lot of ground. We’ve taken apart the Anatomy of an AI Agent, looked at How Agents Think, given them Memory, and finally, a Toolkit to interact with the world. Our agent is now a capable apprentice: it has a brain, memory, and hands.
But a capable apprentice with no direction is a liability. Now that our agent can do things, how do we make sure it does the right things?
The best mental model I’ve found is to treat the agent as an incredibly smart intern. They’ve read every book but have zero real-world experience. They know facts, but not how to start. Give an intern a vague goal, and you’ll get a vague result. But if you provide clear, structured instructions — the same way you would a junior employee — you get solid performance. I wrote about this recently in “The Manager’s Edge in the Age of AI.”
This is the point where we have to stop “prompting” and start “programming.” If agents are the new applications, our instructions are their source code. Guiding an agent isn’t just “prompt engineering.” We’re not asking for one static output; we’re giving a mission briefing and rules of engagement for a complex, multi-step task. In this post, we’ll cover the two main instruments we have for this: the system prompt, which serves as the agent’s constitution, and the tool descriptions, which form the user manual for its abilities.
The Division of Labor: System Prompts vs. Tool Descriptions
To build a reliable agent, we have to understand the jobs of its two main instructional components. A common mistake is to cram everything into one place, which leads to confused agents and unpredictable behavior. A better model is a set of concentric circles. At the core is the System Prompt, defining the agent’s identity and purpose. Wrapped around that is the Conversation History, providing session-specific context. The outermost layer is the set of Tool Descriptions, the agent’s interface for acting on the world.

The System Prompt As The Agent’s Constitution
The system prompt is the agent’s North Star. It’s the first and most persistent context it gets, establishing its identity, purpose, and principles. Think of it as the agent’s constitution. An effective system prompt defines:
- Persona/Role: Who the agent is. “You are a senior DevOps engineer.” This focuses its knowledge and style.
- High-Level Goal: Its mission. “Your goal is to help users safely deploy and monitor applications.”
- Constraints: The rules. “Never delete files without user confirmation.”
- Tone: How it communicates. “Your tone is professional, concise, and helpful.”
Together, these elements set the strategic foundation for everything that follows.
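Put together, a minimal system prompt might look like the sketch below. The persona, rules, and wording are illustrative, not a canonical template:

```python
# A minimal, illustrative system prompt covering persona, goal,
# constraints, and tone. The specifics are examples, not prescriptions.
SYSTEM_PROMPT = """\
You are a senior DevOps engineer assisting users of our internal platform.

Your goal is to help users safely deploy and monitor applications.

Rules:
- Never delete files or resources without explicit user confirmation.
- If an action is irreversible, explain the consequences first.

Your tone is professional, concise, and helpful.
"""
```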
Conversation History Is The Session’s Working Context
If the system prompt is the job description, the first few turns of the conversation are the project brief. This is the place for context that’s critical for the immediate task but isn’t a permanent part of the agent’s identity.
This is perfect for providing large blobs of data: a codebase, a long document to summarize, or logs to analyze. Stuffing this kind of temporary, session-specific data into the system prompt is an anti-pattern. It dilutes the core mission and mixes permanent rules with temporary data.
Put simply: the system prompt tells the agent how to be. The initial user turns tell it what to work with now. Keeping them separate is cleaner.
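Here’s a minimal sketch of that separation, assuming the common role/content message convention; the exact shape depends on your model client:

```python
# Sketch: the permanent identity lives in the system message; the large,
# session-specific blob arrives as the first user turn.
def build_messages(system_prompt: str, session_data: str, request: str) -> list[dict]:
    return [
        {"role": "system", "content": system_prompt},  # how to be
        {"role": "user",
         "content": f"<data>\n{session_data}\n</data>\n\n{request}"},  # what to work with now
    ]

messages = build_messages(
    system_prompt="You are a senior DevOps engineer. Your goal is to help "
                  "users safely deploy and monitor applications.",
    session_data="2024-06-01 12:03:11 ERROR deploy failed: image not found",  # illustrative log
    request="Analyze this log file for deployment errors.",
)
```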
Tool Descriptions Are The User Manual for the Agent’s Hands
If the system prompt is the constitution, tool descriptions are the legal code for specific actions. As we covered in Part 4, an agent suggests a tool to be called. The natural language description is how it decides which tool to use.
The quality of these descriptions is everything. A vague description is an invitation for failure. “Searches the database” is weak. A strong description gives clarity:
“Searches the customer support ticket database by ticket ID. Use this to get the status, priority, and description of a specific support ticket.”
This detail gives the model the semantic hooks it needs to map a request to the right action. The full set of these “manuals” defines everything the agent can do.
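For concreteness, here is what that strong description might look like as a tool definition in the JSON-schema style several model APIs use. The exact format varies by provider, and the tool name `get_support_ticket` is my own invention:

```python
# A tool definition whose description carries the semantic hooks the
# model needs to map a request to this tool. Schema style varies by API.
GET_TICKET_TOOL = {
    "name": "get_support_ticket",
    "description": (
        "Searches the customer support ticket database by ticket ID. "
        "Use this to get the status, priority, and description of a "
        "specific support ticket."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "ticket_id": {
                "type": "integer",
                "description": "Numeric ID of the ticket, e.g. 12345.",
            },
        },
        "required": ["ticket_id"],
    },
}
```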
Engineering Effective Instructions
The art of instruction is growing up. It’s moving from a collection of clever hacks into a formal engineering discipline. The major AI labs — Google, OpenAI, and Anthropic — have all published detailed guides on the topic. To build reliable systems, we have to treat our prompts like code, with the same rigor we apply to traditional software.
A word of caution, though. There’s a fine line between clear direction and over-constraining the agent. Under-instruction leads to vague results, but over-instruction can stifle the model’s reasoning. We need to find the balance: enough structure for reliability, but enough freedom to allow for creative solutions. Good instructions aren’t just written; they’re engineered.
1. Be Clear, Specific, and Direct
This is the bedrock, like writing clean code. You wouldn’t tell an intern to “handle the deployment.” You’d give them a checklist.
- Clarity: Use simple, unambiguous language. The model is a literal interpreter.
- Specificity: Instead of “Write a short summary,” use “Summarize this article in a three-sentence paragraph.”
- Directness: Use action verbs. “Analyze the following log file for errors” is better than “I would like you to look at this log file.”
2. Structure Is Your Friend: Use Delimiters
A common failure mode is the model confusing its instructions with the data it’s meant to process. A fix is to create clear boundaries with delimiters.
- Instructions First: Put core instructions at the top.
- Use Separators: Triple backticks (```) or XML-like tags (<instructions>, <data>) create a machine-readable structure that dramatically improves reliability.
Ineffective:
Summarize the following text in one sentence. The quick brown fox jumps over the lazy dog.
Effective:
<instructions>
Summarize the following text in one sentence.
</instructions>
<text>
The quick brown fox jumps over the lazy dog.
</text>
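If you assemble prompts in code, a small helper keeps those boundaries consistent. A minimal sketch in Python; the tag names are arbitrary, and consistency matters more than the specific delimiter:

```python
# Sketch: assemble a delimited prompt so instructions and data
# can't bleed into each other.
def delimited_prompt(instructions: str, data: str) -> str:
    return (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<text>\n{data}\n</text>"
    )

prompt = delimited_prompt(
    "Summarize the following text in one sentence.",
    "The quick brown fox jumps over the lazy dog.",
)
```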
3. Show, Don’t Just Tell: The Power of Few-Shot Examples
Sometimes, the best way to guide an agent is with a few good examples. This “few-shot” prompting is a powerful way to condition the model. Provide a small, diverse set of examples that show the pattern you want it to follow. This is often more effective than writing complex instructions.
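A sketch of what this can look like, using an invented ticket-triage task; the labels and example tickets are illustrative:

```python
# Sketch: few-shot conditioning via example input/output pairs.
# The task and labels are made up for illustration.
FEW_SHOT_PROMPT = """\
Classify each support ticket as BUG, FEATURE_REQUEST, or QUESTION.

Ticket: "The export button crashes the app on Safari."
Label: BUG

Ticket: "It would be great if reports supported CSV download."
Label: FEATURE_REQUEST

Ticket: "How do I reset my password?"
Label: QUESTION

Ticket: "{ticket_text}"
Label:"""

prompt = FEW_SHOT_PROMPT.format(ticket_text="Dark mode would be nice.")
```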
4. Frame Instructions Positively
Models respond better to positive commands than negative ones. Tell the agent what to do, not what not to do.
Ineffective:
Don’t ask for the user’s password.
Effective:
If a password reset is needed, direct the user to example.com/reset.
This positive framing gives the agent a clear, constructive action.
5. Test and Version Your Instructions
You wouldn’t ship code without tests. Don’t deploy an agent with untested instructions. Create a small evaluation suite of examples to see how prompt changes affect performance. Store your prompts in Git to track changes and roll back if a new instruction breaks things.
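Here is a minimal sketch of such an eval suite. `call_model(system_prompt, user_input)` is a placeholder for whatever client you actually use, and the checks are deliberately simple:

```python
# Minimal eval-suite sketch: each case pairs an input with a cheap check.
EVAL_CASES = [
    ("Summarize in one sentence: The quick brown fox jumps over the lazy dog.",
     lambda out: out.count(".") <= 1),
    ("Classify this ticket: 'The export button crashes the app.'",
     lambda out: "BUG" in out),
]

def run_evals(system_prompt: str, call_model) -> float:
    passed = sum(1 for user_input, check in EVAL_CASES
                 if check(call_model(system_prompt, user_input)))
    return passed / len(EVAL_CASES)  # compare this pass rate across prompt versions
```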
6. Refactor and Prune Your Prompts
Prompts, like code, suffer from cruft. We add a line here, a paragraph there, and soon our clean instruction set is a bloated mess. This “prompt cruft” isn’t just inefficient; it’s harmful. As models improve, yesterday’s necessary instructions can become today’s confusing constraints.
Prompt maintenance is as critical as code maintenance. Every so often, refactor. Run an experiment. Remove instructions one by one, or try your eval with a minimal prompt. You’ll often find that the model’s baseline has improved and many of your instructions are no longer needed. Aggressively prune what you don’t need. A lean prompt is easier to maintain and often works better.
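One way to run that experiment, sketched under the assumption that you already have an eval harness like the one above; `evaluate(prompt)` returns a pass rate:

```python
# Sketch: ablate instructions one at a time and re-run the evals.
# If the score holds without a line, that line is a pruning candidate.
def ablation_report(prompt_lines: list[str], evaluate) -> None:
    baseline = evaluate("\n".join(prompt_lines))
    for i, line in enumerate(prompt_lines):
        pruned = "\n".join(prompt_lines[:i] + prompt_lines[i + 1:])
        if evaluate(pruned) >= baseline:
            print(f"Pruning candidate: {line!r}")  # model no longer needs this line
```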
7. Listen to Your Tools: Use Errors as Instructions
In programming, a compiler error is a message telling you how to fix your code. Treat tool errors the same way. When an agent calls a tool with bad parameters, the error message is critical feedback.
Don’t return a generic Error: Invalid input. Make your error messages instructive.
Ineffective Error:
Error: Failed to retrieve user.
Instructive Error:
Error: Invalid 'user_id'. The ID must be a numeric integer (e.g., 12345). You provided 'john-doe'. Use the 'search_user_by_name' tool to find the correct ID first.
This turns the tool into part of the guidance system. The agent learns from the feedback loop of its own actions.
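As a sketch, a tool handler might validate its input and hand back exactly that kind of guidance. Everything here, including the `search_user_by_name` sibling tool, is hypothetical:

```python
# Sketch: a tool handler that turns a validation failure into an
# instruction the agent can act on in its next step.
def get_user(user_id: str) -> dict:
    if not user_id.isdigit():
        return {
            "error": (
                f"Invalid 'user_id'. The ID must be a numeric integer "
                f"(e.g., 12345). You provided '{user_id}'. Use the "
                f"'search_user_by_name' tool to find the correct ID first."
            )
        }
    return {"user_id": int(user_id), "status": "found"}  # lookup stub
```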
8. Know When to Ask for Help: The Human in the Loop
No set of instructions is perfect. Eventually, an agent will face a situation that’s ambiguous or novel. In those moments, the smartest move is to ask for help.
A “human in the loop” (HITL) workflow isn’t a bug; it’s a feature of a robust system. It’s the agent’s escape hatch. Instruct the agent to ask for confirmation before taking a risky action or to ask for clarification when it’s not confident. This keeps a human expert in control.
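A minimal sketch of such an escape hatch: a confirmation gate in front of tools you’ve designated as risky. The tool names and the policy are illustrative, and `run_tool` stands in for your real dispatcher:

```python
# Sketch: a confirmation gate for risky tool calls. Which tools count
# as "risky" is a policy decision; this set is illustrative.
RISKY_TOOLS = {"delete_file", "drop_table", "terminate_instance"}

def execute_tool(name: str, args: dict, run_tool) -> str:
    if name in RISKY_TOOLS:
        answer = input(f"Agent wants to run {name}({args}). Proceed? [y/N] ")
        if answer.strip().lower() != "y":
            return "Action cancelled by user. Ask for clarification before retrying."
    return run_tool(name, args)
```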
From Prompting to Behavioral Architecture
Instructing an AI agent is a huge step up from traditional prompt engineering. We’re moving from the art of getting a single output to the discipline of orchestrating complex behavior. By treating prompts as code — with versioning, testing, and a high standard for clarity — we become programmers, not just prompters.
By using a clear division of labor, with the system prompt for strategic identity and the tool descriptions for tactical capabilities, we can build a coherent instructional hierarchy. This is the core of what some call “context engineering”: the design of the entire information environment an agent operates in. Our job is shifting from prompter to behavioral architect, carefully sculpting the context that guides our agents to act intelligently and safely.
Now that we have a framework for guiding our agent, we have to ask: how do we protect it? An agent with powerful tools is a double-edged sword. In our next post, we’ll tackle that head-on as we explore Part 6: Putting Up the Guardrails.