
MCP Isn’t Dead. You Just Aren’t the Target Audience

I was debugging a connection issue between Gemini Scribe and the Google Calendar integration in my Workspace MCP server last month when a friend sent me a link. “Have you seen this? MCP is dead apparently.” It was Eric Holmes’ post, MCP is dead. Long live the CLI, which had just hit the top of Hacker News. I read it while waiting for a server restart, which felt appropriate.

His argument is clean and persuasive: CLI tools are simpler, more reliable, and battle-tested. LLMs are trained on millions of man pages and Stack Overflow answers, so they already know how to use gh and kubectl and aws. MCP introduces flaky server processes, opinionated authentication, and an all-or-nothing permissions model. His conclusion is that companies should ship a good API, then a good CLI, and skip MCP entirely.

I agree with about half of that. And the half I agree with is the part that doesn’t matter.

The Shell is a Privilege

Holmes is writing from the perspective of a developer sitting in a terminal. From that vantage point, everything he says is correct. If your agent is Claude Code or Gemini CLI, running in a shell session on your laptop with your credentials loaded, then yes, gh pr view is faster and more capable than any MCP wrapper around the GitHub API. I made exactly this observation in my own post on the Internet of Agents. Simon Willison said as much in his year-end review, noting that for coding agents, “the best possible tool for any situation is Bash.”

But here’s the thing: not every agent has a shell. And not every agent is an interactive coding assistant.

I wrote in Everything Becomes an Agent that the agentic pattern is showing up everywhere: classifiers that need to call tools, data pipelines that need to make decisions, background processes that orchestrate workflows without a human watching. The “MCP is dead” argument treats agents as though they are all developer tools running in a terminal session. That’s one pattern, and it’s the pattern that gets the most attention because developers are writing the blog posts. But the agentic shift is much broader than that.

I’ve been building Gemini Scribe for nearly a year and a half now. It’s an AI agent that lives inside Obsidian, a note-taking application built on Electron. On desktop, Gemini Scribe runs in the renderer process of a sandboxed app. It has no terminal. It has no $PATH. It cannot reliably shell out to gh or kubectl or anything else. Its entire world is the Obsidian plugin API, the vault on disk, and whatever external capabilities I wire up for it. And on mobile, the constraints are even tighter. Obsidian runs on iOS and Android, where there is no shell at all, no subprocess spawning, no local binary execution. The app sandbox on mobile is absolute. If your answer to “how does an agent use tools?” begins with “just call the CLI,” you’ve already lost half your user base.

When I wanted Gemini Scribe to be able to read my Google Calendar, search my email, or pull context from Google Drive, I didn’t have the option of “just use the CLI.” There is no gcal CLI that runs inside a browser runtime. There is no gmail binary I can spawn from an Electron sandbox, let alone from an iPhone. MCP gave me a way to expose those capabilities through a protocol that works over stdio or HTTP, regardless of where my agent happens to be running.

The same is true of my Podcast RAG system. The query agent runs on the server, orchestrating retrieval, re-ranking, and synthesis in a Python process that has no interactive shell session. I could wire up every capability as a bespoke function call, and in some cases I do. But when I want that same retrieval pipeline to be accessible from Gemini CLI on my laptop, from Gemini Scribe in Obsidian, and from the web frontend, MCP gives me one implementation that serves all three. The alternative is writing and maintaining three separate integration layers.

Or consider a less obvious case: a background agent that monitors a codebase for security vulnerabilities and files tickets when it finds them. This agent runs on a schedule, not in response to a human typing a command. It needs to read files from a repository, query a vulnerability database, and create issues in a project tracker. You could give it a shell, but you shouldn’t. An autonomous agent running unattended with shell access is a privilege escalation vector. A crafted comment in a pull request, a malicious string in a dependency manifest, any of these could become a prompt injection that turns bash into an attack surface. Structured tool protocols are the natural interface for this kind of autonomous workflow precisely because they constrain what the agent can do. The agent gets read_file and create_issue, not bash -c. The narrower the interface, the smaller the blast radius.
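To make the blast-radius point concrete, here is a minimal sketch of a constrained tool surface for an unattended agent. All names (read_file, create_issue, the repo path) are illustrative, not from any real system: the dispatcher only knows two tools, so a prompt-injected "run this shell command" has nothing to call.

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/repo")  # assumption: the repo checkout this agent may read

def read_file(relative_path: str) -> str:
    """Read a file, refusing paths that escape the sandboxed root."""
    target = (ALLOWED_ROOT / relative_path).resolve()
    if not target.is_relative_to(ALLOWED_ROOT):
        raise PermissionError(f"{relative_path} escapes the allowed root")
    return target.read_text()

def create_issue(title: str, body: str) -> dict:
    """Stub for the project-tracker call; a real version would hit an API."""
    return {"title": title, "body": body, "status": "created"}

# The complete capability surface. Notably absent: anything resembling bash.
TOOLS = {"read_file": read_file, "create_issue": create_issue}

def dispatch(tool_name: str, **kwargs):
    """The agent's only entry point: unknown tools are rejected outright."""
    if tool_name not in TOOLS:
        raise PermissionError(f"tool {tool_name!r} is not exposed to this agent")
    return TOOLS[tool_name](**kwargs)
```

The allowlist is the security model: adding a capability is an explicit, reviewable change to TOOLS, not whatever happens to be on the $PATH.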

The N-by-M Problem Doesn’t Go Away

Holmes frames MCP as solving a problem that doesn’t exist. CLIs already work, so why add a protocol?

But CLIs work for a very specific topology: one human (or one human-like agent) driving one tool at a time through a shell. The moment you step outside that topology, CLIs stop being the answer.

Even if every service had a CLI (and Holmes is right that more should), you still have the consumer problem. A CLI is consumable by exactly one kind of agent: one with shell access. The moment you need that same capability accessible from an Electron plugin, a mobile app, a server-side orchestrator, and a terminal agent, you’re back to writing integration code for each consumer. MCP lets you write the server once and expose it to all of them through a common protocol.
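To see why "write the server once" works, it helps to look at the wire shape of an MCP tool invocation: MCP frames messages as JSON-RPC 2.0, so every consumer produces the same request regardless of its runtime. A sketch (the tool name and arguments are hypothetical):

```python
import json

def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as MCP defines it."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# An Electron plugin, a mobile app, a server orchestrator, and a terminal
# agent all emit this same payload — over stdio or HTTP, the framing is identical.
req = tools_call_request(1, "search_knowledge_base", {"query": "refactor plan"})
```

The transport varies; the message does not. That is the whole trick that collapses N-by-M integration layers into one.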

This is the same insight behind LSP, which I wrote about in the context of ACP. Before LSP, every editor had to implement its own Python linter, its own Go formatter, its own TypeScript type-checker. The N-by-M integration problem was a nightmare. LSP didn’t replace the underlying tools. It standardized the interface between the tools and the editors. MCP does the same thing for the interface between capabilities and agents.

Holmes might respond that the N-by-M problem is overstated, that most developers just need one agent talking to a handful of tools. Fair enough for a personal workflow. But the industry isn’t building personal workflows. It’s building platforms where agents need to discover and compose capabilities dynamically, where the set of available tools changes based on the user’s permissions, their organization’s policies, and the context of the current task. That’s the world MCP is designed for.

Authentication is the Feature, Not the Bug

One of Holmes’ sharpest critiques is that MCP is “unnecessarily opinionated about auth.” CLI tools, he notes, use battle-tested flows like gh auth login and AWS SSO that work the same whether a human or an agent is driving.

This is true when the agent is acting as you. But the moment the agent stops acting as you and starts acting on behalf of other people, everything changes.

Imagine you’re building a product where an AI assistant helps your customers manage their calendars. Each customer has their own Google account. You cannot ask each of them to run gcloud auth login in a terminal. You need per-user OAuth tokens, tenant isolation, and an auditable record of every action the agent takes on each user’s behalf. This is not a niche enterprise concern. This is the basic architecture of any multi-tenant agent system.

Or think about something simpler: a shared documentation service protected by OAuth. Your team’s internal knowledge base, your company’s Confluence, your organization’s Google Drive. An agent that needs to search those resources on behalf of a user has to present that user’s credentials, not the developer’s, not a shared service account. This is a solved problem in the web world (every SaaS app does it), but it requires a protocol that understands identity delegation. curl with a hardcoded token doesn’t cut it.
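The identity-delegation requirement can be sketched in a few lines. Everything here is hypothetical scaffolding, not a real API: the point is structural — the agent resolves credentials per user, and there is deliberately no shared fallback token that could leak one tenant's access into another's request.

```python
class TokenStore:
    """Maps user IDs to their own delegated OAuth access tokens."""

    def __init__(self):
        self._tokens: dict[str, str] = {}

    def save(self, user_id: str, access_token: str) -> None:
        self._tokens[user_id] = access_token

    def token_for(self, user_id: str) -> str:
        # No default: an unknown user is an error, never the developer's
        # credentials or a shared service account.
        if user_id not in self._tokens:
            raise PermissionError(f"no delegated token for user {user_id!r}")
        return self._tokens[user_id]

def search_drive_as(store: TokenStore, user_id: str, query: str) -> dict:
    """Every outbound call carries the acting user's token plus an audit record."""
    token = store.token_for(user_id)
    return {
        "authorization": f"Bearer {token}",
        "query": query,
        "audit": {"acting_user": user_id},
    }
```

A CLI authenticated as the developer has no slot for any of this; a protocol that carries per-user identity does.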

MCP’s authentication specification isn’t trying to replace gh auth login for developers who already have credentials loaded. It’s trying to solve the problem of how an agent running in a hosted environment acquires and manages credentials for users who will never see a terminal. Dismissing this as unnecessary complexity is like dismissing HTTPS because curl works fine over HTTP on your local network.

Where I Actually Agree

I want to be clear that Holmes isn’t wrong about the pain points. MCP server initialization is genuinely flaky. I’ve lost hours to servers that didn’t start, connections that dropped, and state that got corrupted between restarts. The tooling is immature. The debugging experience is terrible. As I wrote in my post on the observability gap, the moment you rely on an agent for something that matters, you realize you’re flying blind. MCP’s opacity makes that worse.

And the context window overhead is real. Benchmarks from ScaleKit show that an MCP agent injecting 43 tool definitions consumed 44,026 tokens before doing any work, while a CLI agent doing the same task needed 1,365 — a roughly 32x difference. When you’re paying per token, that’s not an abstraction tax you can ignore.

But these are maturity problems, not architecture problems. The early days of LSP were rough too. Language servers crashed, features were spotty, and half the community said “just use the built-in tooling.” The protocol won anyway, because the abstraction was right even when the implementation wasn’t.

The Bridge Pattern

Here’s what I think the mature answer looks like, and it’s neither “use MCP for everything” nor “use CLIs for everything.” It’s building your core capability as a shared library, then exposing it through multiple transports.

Think about how you’d design a tool that queries your internal knowledge base. The business logic (authentication, retrieval, re-ranking) lives in a Python module or a Go package. From that shared core, you generate three thin wrappers. A streaming HTTP MCP server for agents running in web runtimes and hosted environments. A local stdio MCP server for desktop agents like Gemini Scribe or Claude Desktop that communicate over standard input/output. And a CLI binary for developers who want to pipe results through jq or use it from Gemini CLI’s bash tool.

All three share the same code paths. A bug fix in the retrieval logic propagates everywhere. The auth layer adapts to context: the CLI reads your local credentials, the HTTP server handles OAuth tokens, and the stdio server inherits the host process’s permissions. You get the CLI’s simplicity where a shell exists, and MCP’s universality where it doesn’t.
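The bridge pattern compresses to a small sketch: one core function, thin faces on top. The names here (search_kb, the response shape) are illustrative rather than a real MCP SDK; an actual server would register the handler with an MCP library, but the division of labor is the point.

```python
import argparse
import json

def search_kb(query: str, limit: int = 5) -> list[dict]:
    """Shared core: in a real system, authentication, retrieval, and
    re-ranking all live here, behind one function."""
    corpus = ["MCP spec", "CLI design notes", "OAuth delegation guide"]
    hits = [doc for doc in corpus if query.lower() in doc.lower()]
    return [{"title": title} for title in hits[:limit]]

def cli_main(argv: list[str]) -> None:
    """Face 1: a CLI for humans and shell-capable agents (pipe it to jq)."""
    parser = argparse.ArgumentParser(prog="kb")
    parser.add_argument("query")
    parser.add_argument("--limit", type=int, default=5)
    args = parser.parse_args(argv)
    print(json.dumps(search_kb(args.query, args.limit)))

def mcp_tool_call(params: dict) -> dict:
    """Face 2: the handler an MCP server (stdio or streaming HTTP) routes to."""
    results = search_kb(params["query"], params.get("limit", 5))
    return {"content": [{"type": "text", "text": json.dumps(results)}]}
```

Both faces are a few lines because neither contains logic; fix a retrieval bug in search_kb and every transport picks it up for free.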

This isn’t hypothetical. It’s what I’m already doing. My gemini-utils library is the shared core: it handles file uploads, deep research, audio transcription, and querying against Gemini’s APIs. It exposes all of that as a set of CLI commands (research, transcribe, query, upload) that I use directly from the terminal every day. But when I wanted those same research capabilities available to Gemini CLI as an agent tool, I built gemini-cli-deep-research, an extension that wraps the same underlying library as an MCP service. The core logic is shared. The CLI is for me at a terminal. The MCP server is for agents that need to invoke deep research as a tool in a larger workflow. Same capability, different transports, each suited to its context.

I think this is the pattern that tool developers should be building toward. The best agent tools of the next few years won’t be “MCP servers” or “CLI tools.” They’ll be capability libraries with multiple faces.

The Real Question

The CLI-vs-MCP debate, as Tobias Pfuetze argued, is the wrong fight. The question isn’t “which is better?” It’s “where does each one belong?”

For a developer in a terminal with their own credentials, driving a coding agent? Use the CLI. It’s faster, cheaper, and the agent already knows how. Holmes is right about that.

For an agent embedded in an application runtime without shell access? For a multi-tenant platform where the agent acts on behalf of users who will never open a terminal? For a system where you need one capability implementation discoverable by multiple heterogeneous agent hosts? That’s where MCP earns its complexity.

And for the tool developer who wants to serve all of these audiences? Build the core once, expose it three ways: CLI, stdio MCP, and streaming HTTP MCP. Let the runtime decide.

The mistake is assuming that because your agent has a shell, every agent has a shell. The terminal is one runtime among many. And as agents move from developer tools into products that serve non-technical users, the fraction of agents that can rely on a $PATH and a .bashrc is going to shrink rapidly.

MCP isn’t dead. It’s just not for you yet. But it might be soon.
