There is a specific kind of friction that every developer knows. It’s the friction of the “Alt-Tab.”
You’re deep in the code, holding a complex mental model of a system in your head, when you realize you need to check a requirement. That requirement lives in a Google Doc. Or maybe you need to see if you have time to finish a feature before your next meeting. That information lives in Google Calendar.
So you leave the terminal. You open the browser. You navigate the tabs. You find the info. And in those thirty seconds, the mental model you were holding starts to evaporate. The flow is broken.
But it’s not just the context switch that kills your momentum—it’s the ambush. The moment you open that browser window, the red dots appear. Chat pings, new emails, unresolved comments on a doc you haven’t looked at in two days—they all clamor for your attention. Before you know it, the quick thing you needed to look up has morphed into an hour of answering questions and putting out fires. You didn’t just lose your place in the code; you lost your afternoon.
I’ve been thinking a lot about this friction lately, especially as I’ve moved more of my workflow into the Gemini CLI. If we want AI to be a true partner in our development process, it can’t just live in a silo. It needs access to the context of our work—and for most of us, that context is locked away in the cloud, in documents, chats, and calendars.
That’s why I built the Google Workspace extension for Gemini CLI.
Giving the Agent “Senses”
We often talk about AI agents in the abstract, but their utility is defined by their boundaries. An agent that can only see your code is a great coding partner. An agent that can see your code and your design documents and your team’s chat history? That’s a teammate.
This extension connects the Gemini CLI to the Google Workspace APIs, effectively giving your terminal-based AI a set of digital senses and hands. It’s not just about reading data; it’s about integrating that data into your active workflow.
Here is what that looks like in practice:
1. Contextual Coding
Instead of copying and pasting requirements from a browser window, you can now ask Gemini to pull the context directly.
“Find the ‘Project Atlas Design Doc’ in Drive, read the section on API authentication, and help me scaffold the middleware based on those specs.”
2. Managing the Day
I often get lost in work and lose track of time. Now, I can simply ask my terminal:
“Check my calendar for the rest of the day. Do I have any blocks of free time longer than two hours to focus on this migration?”
3. Seamless Communication
Sometimes you just need to drop a quick note without leaving your environment.
“Send a message to the ‘Core Eng’ chat space letting them know the deployment is starting now.”
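To make the calendar question from the second example concrete, here is a minimal sketch of the gap search behind a request like “any blocks of free time longer than two hours.” It is not the extension’s actual code. The function name `free_blocks`, the event tuples, and the sample times are all assumptions made up for illustration, and a real run would get its events from the Calendar API rather than a hard-coded list.

```python
from datetime import datetime, timedelta


def free_blocks(events, window_start, window_end, min_length):
    """Return (start, end) gaps of at least min_length inside the window.

    events is a list of (start, end) datetime pairs; they may overlap
    and do not need to be sorted.
    """
    blocks = []
    cursor = window_start  # earliest moment not yet known to be busy
    for start, end in sorted(events):
        if end <= cursor:
            continue  # event is entirely before the cursor
        if start >= window_end:
            break  # remaining events start after the window
        if start - cursor >= min_length:
            blocks.append((cursor, start))
        cursor = max(cursor, end)
    # Trailing gap between the last event and the end of the window.
    if window_end - cursor >= min_length:
        blocks.append((cursor, window_end))
    return blocks


# Hypothetical afternoon: two meetings between 12:00 and 18:00.
day = datetime(2025, 1, 6)
events = [
    (day.replace(hour=13), day.replace(hour=13, minute=30)),
    (day.replace(hour=16), day.replace(hour=17)),
]
blocks = free_blocks(
    events,
    window_start=day.replace(hour=12),
    window_end=day.replace(hour=18),
    min_length=timedelta(hours=2),
)
for start, end in blocks:
    print(f"{start:%H:%M} to {end:%H:%M}")
```

With these sample meetings, the only gap of two hours or more is 13:30 to 16:00. The value of the agent is that it fetches the events and does this reasoning for you, so you never have to write or run anything like this by hand.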
The Accidental Product
Truth be told, I didn’t set out to build a product. When I first joined Google DeepMind, this was simply my “starter project.” My manager suggested I spend a few weeks experimenting with Google Workspace and our agentic capabilities, and the Gemini CLI seemed like the perfect sandbox for that kind of exploration.
I started building purely for myself, guided by my own daily friction. I wanted to see if I could check my calendar without leaving the terminal. Then I wanted to see if I could pull specs from a Doc. I followed the path of my own curiosity, adding tools one by one.
But when I shared this little experiment with a few colleagues, the reaction was immediate. They didn’t just think it was cool; they wanted to install it. That’s when I realized this wasn’t just a personal hack—it was a shared need. It snowballed from a few scripts into a full-fledged extension that we knew we had to ship.
Under the Hood
The extension is built as a Model Context Protocol (MCP) server that runs locally on your machine. It authenticates with your own OAuth credentials, so your data never passes through a third-party server. The only traffic is direct communication between your local CLI and the Google APIs.
It currently supports a wide range of tools across the Workspace suite:
- Docs & Drive: Search for files, read content, and even create new docs from Markdown.
- Calendar: List events, find free time, and schedule meetings.
- Gmail: Search threads, read emails, and draft replies.
- Chat: Send messages and list spaces.
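Because the extension is an MCP server, each of the tools above is invoked by the client through a JSON-RPC 2.0 `tools/call` request. The sketch below builds one such request to show the shape of the message. The tool name `drive_search` and its `query` argument are hypothetical placeholders, not the extension’s real tool names or schemas, which may differ.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, for illustration only.
request = tool_call_request("drive_search", {"query": "Project Atlas Design Doc"})
print(json.dumps(request))
```

In normal use you never write these messages yourself. The model decides which tool to call and the Gemini CLI sends the request to the local server on your behalf.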
Why This Matters
This goes back to the idea of “Small Tools, Big Ideas.” Individually, a command-line tool to read a calendar isn’t revolutionary. But when you combine that capability with the reasoning engine of a large language model, it becomes something else entirely.
It turns your terminal into a cockpit for your entire digital work life. It allows you to script interactions between your code and your company’s knowledge base. It reduces the friction of context switching, letting you stay where you are most productive.
If you want to try it out, the extension is open source and available now. You can install it directly into the Gemini CLI:
gemini extensions install https://github.com/gemini-cli-extensions/workspace
I’m curious to see how you all use this. Does it change your workflow? Does it keep you in the flow longer? Give it a spin and let me know.