The Observability Gap
How do you debug a system that thinks in natural language? The Observability Gap is the distance between traditional logging and what agents actually need: full visibility into their reasoning, tool use, and decision-making. In this installment, we explore how to build the flight recorder that turns black boxes into transparent systems.
The Fingerprint of Sound
Text embeddings map meaning, but speech embeddings map identity. Discover how treating speaker recognition as a geometry problem turns the messy task of diarization into a clean clustering algorithm. It’s a powerful reminder that understanding the primitives is key to becoming an architect of AI systems, not just a consumer.
Great Video on Gemini Scribe and Obsidian
When a user’s repository feedback leads to a YouTube deep dive, you know you’ve found something special. I recently discovered how Paul O’Malley is using Gemini Scribe as the autonomous engine for his ‘Self-Organizing Second Brain’—a powerful reminder of the satisfying magic that happens when small tools meet big ideas.
Everything Becomes an Agent
Every AI project I built last year ended up becoming an agent. What starts as a simple script inevitably grows into a loop with tools, memory, and autonomy. I’ve learned to recognize the signs—and when to skip the intermediate steps and embrace the agent from the start.
When Agents Talk to Each Other
Our agents are brilliant but isolated. In Part 9 of The Agentic Shift, we explore the three protocols transforming how AI systems connect: MCP for tools, ACP for interfaces, and A2A for collaboration. The Internet of Agents is booting up, and our digital Robinson Crusoes are finally getting a radio.
Bringing Deep Research to the Terminal
I lost a research report switching between the Gemini app and my terminal. Frustrated, I built what I needed: a Gemini CLI extension that brings deep research directly into my workflow. No more browser tabs, no lost formatting—just markdown files appearing where I actually work. Personal software for an audience of one.
The Era of Personal Software
Building bespoke tools is now often faster than searching for them. We are entering the era of Personal Software: applications built for an audience of one. Explore the shift from discovery to creation, and why AI makes it easier than ever to fit the handle to your own grip.
The Guardrails of Autonomy
Giving AI agents terminal access creates a tension between autonomy and safety. To cure “Confirmation Fatigue,” I built the Gemini CLI Policy Engine. It acts as a firewall for tool calls, allowing you to define granular guardrails—from safe exploration to hard stops—enabling true autonomy without the anxiety.
Bringing the Office to the Terminal
Leaving the terminal to check a doc or calendar breaks your flow and invites a flood of notifications. The Google Workspace extension for Gemini CLI solves this by bringing your office tools directly into the command line. Search docs, check schedules, and send messages—all without ever hitting Alt-Tab.
The Joy of Deleting Code: Rebuilding My Podcast Memory
I replaced my complex, self-managed podcast AI pipeline with Gemini File Search. The result? I deleted thousands of lines of code, migrated 18,000 transcripts, and turned a fragile system into a robust service. It wasn’t just an update; it was permission to stop doing the busy work and focus on building.
Choosing Your Agent Framework
Building an AI agent from scratch teaches you the fundamentals, but the real work begins when you choose a framework. This post explores the landscape of agentic frameworks—from enterprise-grade systems like Google’s ADK to collaborative tools like CrewAI—helping you select the right scaffolding for your next intelligent application.
On Context, Agents, and a Path to a Standard
AI agents need context to be true partners. This post explores the GEMINI.md system built for the Gemini CLI and contrasts it with the promising Agents.md standard. I outline three key proposals—file includes, model-specific pragmas, and context hierarchy—needed for a universal standard that all tools can adopt.
Managing the Agent’s Attention
We’ve given our agents senses, memory, and hands to act. But there’s a hidden bottleneck: attention. An agent’s context window is its workbench, and a cluttered bench leads to confusion. We’ll explore why the true art of building agents isn’t about infinite space, but mastering the focus within it.
Putting Up the Guardrails
As AI agents transition from suggesting to acting, our responsibility as builders shifts. This post explores the new security landscape, from battling prompt injections—where language itself is a vulnerability—to implementing vital guardrails like human-in-the-loop confirmations and the Principle of Least Privilege. Crafting secure agents means building a resilient, multi-layered defense to ensure powerful AI remains trustworthy.
Guiding the Agent’s Behavior
An agent with tools is like a smart intern: brilliant but needs direction. Guiding it requires more than prompting; it’s an engineering discipline. We’ll explore how to architect an agent’s behavior using a clear division of labor between system prompts, conversation history, and tool descriptions for reliable results.
An Agent’s Toolkit
An agent that can think but not act is still trapped in its own mind. In this post, we explore the most critical step in our journey: giving our agent hands. We dive into the world of tools, the secure loop that governs their use, and how they transform an AI from a passive knower to an active doer, turning queries into real-world workflows.
The Manager’s Edge in the Age of AI
There’s a subtle art to getting the best out of people, and the same is true for AI. This post explores why the “soft skills” of management—clarity, context-setting, and iterative feedback—are becoming the essential “hard skills” for guiding our AI partners in this new era of collaboration.
The Agent’s Memory
How does an AI agent remember? We dive into the architecture of agentic memory, moving beyond the limitation of stateless APIs. This post explores the layers—working memory for the now, episodic for the past, and semantic for deep knowledge—that are essential to transforming a forgetful tool into a capable partner.
What I Did On My Summer Vacation
My summer trip to Australia and Fiji had a memorable detour: a two-week creative journey into code. From a lounge chair by the beach, I built Agent Mode for Gemini Scribe—a new, persistent AI partner for writing. It turns out the best souvenirs are the ones you build yourself.
How Agents Think
How does an AI agent actually “think”? It’s not a single process, but a choice between cognitive patterns. This post explores four foundational architectures: the improvisational ReAct loop, the meticulous Plan-and-Execute strategy, and more. Discover the “mental operating systems” driving the next evolution of software development.