
Bundled Skills in Gemini Scribe

The feature that became Bundled Skills started with a GitHub issues page.

I wrote and maintain Gemini Scribe, an Obsidian plugin that puts a Gemini-powered agent inside your vault. Thousands of people use it, and they have questions. People would open discussions and issues asking how to configure completions, how to set up projects, what settings were available. I was answering the same questions over and over, and it hit me: the agent itself should be able to answer these. It has access to the vault. It can read files. Why am I the bottleneck for questions about my own plugin?

So I built a skill. I took the same documentation source that powers the plugin’s website, packaged it up as a set of instructions the agent could load on demand, and suddenly users could just ask the agent directly. “How do I set up completions?” “What settings are available?” The agent would pull in the right slice of documentation and give a grounded answer. The docs on the web and the docs the agent reads are built from the same source. There is no separate knowledge base to keep in sync.

That first skill opened a door. I was already using custom skills in my own vault to improve how the agent worked with Bases and frontmatter properties. Once I had the bundled skills mechanism in place, I started looking at those personal skills differently. The ones I had built for myself around Obsidian-specific tasks were not just useful to me. They would be useful to anyone running Gemini Scribe. So I started migrating them from my vault into the plugin as built-in skills.

With the latest version of Gemini Scribe, the plugin now ships with four built-in skills. In a future post I will walk through how to create your own custom skills, but first I want to explain what ships out of the box and why this approach works.

Four Skills Out of the Box

That first skill became gemini-scribe-help, and it is still the one I am most proud of conceptually. The plugin’s own documentation lives inside the same skill system as everything else. No special case, no separate knowledge base. The agent answers questions about itself using the same mechanism it uses for any other task.

The second skill I built was obsidian-bases. I wanted the agent to be good at creating Bases (Obsidian’s take on structured data views), but it kept getting the configuration wrong. Filters, formulas, views, grouping: there is a lot of surface area and the syntax is particular. So I wrote a skill that guides the agent through creating and configuring Bases from scratch, including common patterns like task trackers and project dashboards. Instead of me correcting the agent’s output every time, I describe what I want and the agent builds it right the first time.

Next came audio-transcription. This one has a fun backstory. Audio transcription was one of the oldest outstanding bugs in the repo. People wanted to use it with Obsidian’s native audio recording, but the results were poor. In this release, fixes around binary file uploads meant the model could finally receive audio files properly. Once that was working, I realized I did not need to write any more code to get good transcriptions. I just needed to give the agent good instructions. The skill guides it through producing structured notes with timestamps, speaker labels, and summaries. It turns a messy audio file into a clean, searchable note, and the fix was not code but context.

The fourth is obsidian-properties. Working with note properties (the YAML frontmatter at the top of every Obsidian note) sounds trivial until you are doing it across hundreds of notes. The agent would make inconsistent choices about property types, forget to use existing property names, or create duplicates. This skill makes it reliable at creating, editing, and querying properties consistently, which matters enormously if you are using Obsidian as a serious knowledge management system.

The pattern behind all four is the same. I watched the agent struggle with something specific to Obsidian, and instead of accepting that as a limitation of the model, I wrote a skill to fix it.

Why Not Just Use the System Prompt

You might be wondering why I did not just shove all of this into the system prompt. I wrote about this problem in detail in Managing the Agent’s Attention, but the short version is that system prompts are a “just-in-case” strategy. You load up the agent with everything it might need at the start of the conversation, and as you add more instructions, they start competing with each other for the model’s attention. Researchers call this the “Lost in the Middle” problem: models pay disproportionate attention to the beginning and end of their context, and everything in between gets diluted. If I packed all four skills’ worth of instructions into the system prompt, each one would make the others less effective. Every new skill I add would degrade the ones already there.

Skills avoid this entirely. The agent always knows which skills are available (it gets a short name and description for each one), but only loads the full instructions when it actually needs them. When a skill activates, its instructions land in the most recent part of the conversation, right before the model starts reasoning. Only one skill’s instructions are competing for attention at a time, and they are sitting in the highest-attention position in the context window.
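The mechanics of that progressive disclosure can be sketched in a few lines. This is an illustrative sketch only, not Gemini Scribe’s actual code: the `Skill` shape, `skillCatalog`, and `activateSkill` names are hypothetical stand-ins for the pattern.

```typescript
// Illustrative sketch of progressive disclosure for skills.
// All names here are hypothetical, not Gemini Scribe's internals.

interface Skill {
  name: string;
  description: string;  // always visible to the model
  instructions: string; // loaded only when the skill is activated
}

const skills: Skill[] = [
  {
    name: "obsidian-bases",
    description: "Create and configure Obsidian Bases.",
    instructions: "Full multi-page guidance for Bases goes here...",
  },
  {
    name: "audio-transcription",
    description: "Transcribe audio into structured notes.",
    instructions: "Full transcription guidance goes here...",
  },
];

// The system prompt only ever carries this short catalog.
function skillCatalog(all: Skill[]): string {
  return all.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}

// When the model calls activate_skill, the full instructions land at
// the recent (highest-attention) end of the conversation.
function activateSkill(all: Skill[], name: string): string {
  const skill = all.find((s) => s.name === name);
  if (!skill) throw new Error(`Unknown skill: ${name}`);
  return skill.instructions;
}
```

The key property is that the catalog stays small no matter how many skills exist, so adding a fifth skill does not dilute the other four.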

There is a second benefit that surprised me. Because skills activate through the activate_skill tool call, you can watch the agent load them. In the agent session, you see exactly when a skill is activated and which one it chose. This gives you something that system prompts never do: observability. If the agent is not following your instructions, you can check whether it actually activated the skill. If it activated the skill but still got something wrong, you know the problem is in the skill’s instructions, not in the agent’s attention. That feedback loop is what lets you iterate and improve your skills over time. You are no longer guessing whether the agent read your instructions. You can see it happen.

Skills follow the open agentskills.io specification, and this matters more than it might seem. We have seen significant standardization around this spec across the industry in 2026. That means skills are portable. If you have been using skills with another agent, you can bring them into Gemini Scribe and they will work. If you build skills in Gemini Scribe, you can take them with you. They are not a proprietary format tied to one tool. They are Markdown files with a bit of YAML frontmatter, designed to be human-readable, version-controllable, and portable across any agent that supports the spec.
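Concretely, a skill is a Markdown file whose frontmatter carries the short name and description the agent always sees, with the full instructions in the body. Here is a hypothetical example of what one might look like; the frontmatter keys follow the name-and-description pattern described above, but check the agentskills.io spec for the exact schema.

```markdown
---
name: meeting-notes
description: Formats raw meeting transcripts into structured notes.
---

When processing a meeting transcript:

1. Extract the date, attendees, and agenda items.
2. Summarize each agenda item in two or three sentences.
3. End with a checklist of action items and their owners.
```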

What Comes Next

The four built-in skills are just the beginning. When I decide what to build next, I think about skills in four categories. First, there are skills that give the agent domain knowledge about Obsidian itself, things like Bases and properties where the model’s general training is not specific enough. Second, there are skills that help the agent use Gemini Scribe’s own tools effectively. The plugin has capabilities like deep research, image generation, semantic search, and session recall, and each of those benefits from a skill that teaches the agent when and how to use them well. Third, there are skills that bring entirely new capabilities to the agent, like audio transcription. And fourth, there is user support: the help skill that started this whole process, making sure people can get answers without leaving their vault.

The next version of Gemini Scribe will add built-in skills for semantic search, deep research, image generation, and session recall. The skills system is also designed to be extended by users. In a future post I will walk through creating your own custom skills, both by hand and by asking the agent to build them for you.

For now, the takeaway is simple. A general-purpose model knows a lot, but it does not know your tools. When I watched the agent struggle with Obsidian Bases or produce flat transcripts or make a mess of note properties, I could have accepted those as limitations. Instead, I wrote skills to close the gap. The model’s knowledge is broad. Skills make it deep.


4255 Contributions – A Year of Building in the Open

I was staring at my GitHub profile the other day when a number caught my eye. 4,255. That’s how many contributions GitHub has recorded for me over the past year. I sat with it for a moment, doing the quick mental math: that’s close to twelve contributions every single day, weekends included. The shape of the year looked just as striking. I showed up on 332 of the 366 days in the window, 91% of them, and at one point put together a 113-day streak without a gap. It felt like a lot. It felt like proof of something I hadn’t been able to articulate until I saw it rendered as a green heatmap on a screen.

About a year ago, I wrote about my decision to move back to individual contributor work after years in leadership roles. I talked about missing the flow state, the direct feedback loop of writing code and watching it work. What I didn’t know at the time was just how dramatically that shift would show up in the data. 4,255 contributions is the quantitative answer to the question I was trying to answer qualitatively in that post: what happens when you give a builder back the time to build?

The Shape of a Year

Numbers by themselves are just numbers. What makes them interesting is the shape they take when you zoom in. My year wasn’t a single monolithic effort on one project. It was a constellation of interconnected work, each project feeding into the next, each one teaching me something that made the others better.

The largest body of work was on Gemini CLI, Google’s open-source AI agent for the terminal. This project alone accounts for a significant chunk of those contributions, spanning everything from core feature development to building the Policy Engine that governs how the agent interacts with your system. But the contributions weren’t just code. A huge portion of my time went into code reviews, issue triage, and community engagement. Working on a repository with over 100,000 stars means that every merged PR has real impact, and every review is a conversation with developers around the world.

Then there was Gemini Scribe, my Obsidian plugin that started as a weekend experiment and grew into a tool with 302 stars and a community of writers who depend on it. Over the past year, I shipped a major 3.0 release, built agent mode, and iterated constantly on the rewrite features that make it useful for daily writing. In fact, this very blog post was drafted in the tool I built, which is a strange and satisfying loop.

Alongside these larger efforts, I shipped a handful of small, sharp tools that I needed for my own workflows. The GitHub Activity Reporter is one I’ve written about before, a utility that uses AI to transform raw GitHub data into narrative summaries for performance reviews and personal reflection. More recently, I built the Workspace extension for Gemini CLI and a deep research extension that lets you conduct multi-step research from the terminal. Each of these tools was born from a specific itch, and each turned out to be useful to more people than I expected. The Workspace extension alone has gathered 510 stars.

The Rhythm of Building

One thing the contribution graph doesn’t capture is the rhythm behind the numbers. My weeks developed a cadence over the year that I didn’t plan but that emerged naturally. Mornings were for deep work on Gemini CLI, the kind of focused system design and implementation that benefits from a fresh mind. Afternoons were for reviews and community work, responding to issues, providing feedback on PRs, and engaging with the developers building on top of our tools. Evenings and weekends were where the personal projects lived: Gemini Scribe, the extensions, and whatever new idea was rattling around in my head.

This rhythm is something I couldn’t have had in my previous role. When your calendar is stacked with meetings from nine to five, the creative work gets squeezed into the margins. Now, the creative work is the whole page. That’s the real story behind 4,255 contributions. It’s not about productivity metrics or GitHub gamification. It’s about what happens when you align your time with the work that energizes you.

What Surprised Me

A few things caught me off guard when I looked back at the year.

First, the ratio of code to “everything else” wasn’t what I expected. I assumed the majority of my contributions would be commits. In reality, a massive portion was reviews, comments, and issue management. On Gemini CLI alone I logged 205 reviews over the year. This was especially true as my role on that project evolved from pure contributor to something closer to a technical steward. Reviewing a complex PR, asking the right questions, and helping someone refine their approach takes just as much skill as writing the code yourself. Sometimes more.

Second, the personal projects had more reach than I anticipated. When I wrote about building personal software, I was mostly thinking about tools I built for myself. But Gemini Scribe has real users who file real bugs and request real features. The Workspace extension took off because it solved a problem that a lot of Gemini CLI users were hitting. Building in the open means you discover an audience you didn’t know was there.

Third, and this is the one I keep coming back to, the year felt shorter than 4,255 contributions would suggest. Flow state compresses time. When you’re deep in a problem, hours feel like minutes. I remember entire weekends spent in the codebase that felt like an afternoon. That compression is, for me, the clearest signal that I made the right call in going back to IC work.

Fourth, and this is the one I never would have predicted until I charted it out: the weekend, not the weekday, turned out to be my most productive window by a wide margin. Saturdays averaged 14.7 contributions, Sundays 14.5, and Thursday, the day I’d have guessed was safest, came in last at 8.3. The busiest single day of the entire year was a Saturday, December 20, when I shipped 89 contributions into podcast-rag, rebuilding the web upload flow, adding episode management to the admin dashboard, and migrating email delivery over to Resend, all in one afternoon. I didn’t plan for the weekends to become the engine. They just did, because that’s where the personal projects live, and the personal projects are where the work is loudest, most direct, and most free of interruption. A day with no meetings on it, I’ve come to realize, is worth more than I ever gave it credit for.

Looking Forward

I don’t know what next year’s number will be, and I’m not particularly interested in making it bigger. The number is a side effect, not a goal. What I care about is continuing to work on problems that matter, in the open, with people who push me to think more clearly. The AI-first developer model I wrote about over a year ago is now just how I work every day. The agents I’m building are the collaborators I’m building with, and both keep getting better.

If you’re someone who’s been thinking about a similar shift, whether it’s moving back to IC work, contributing to open source, or just carving out more time for the work that lights you up, I’d encourage you to try it. You might be surprised by what a year of focused building can produce. I certainly was.


Scoping AI Context with Projects in Gemini Scribe

My son has a friend who likes to say, “born to dilly dally, forced to lock in.” I’ve started to think that describes AI agents in a large Obsidian vault perfectly.

My vault is a massive, sprawling entity. It holds nearly two decades of thoughts, ranging from deep dives into LLM architecture to my kids’ school syllabi and the exact dimensions needed for an upcoming home remodeling project. When I first introduced Gemini Scribe, the agent’s ability to explore all of that was a feature. I could ask it to surface surprising connections across topics, and it would. But as I’ve leaned harder into Scribe as a daily partner, both at home and at work, the dilly dallying became a real problem. My work vault has thousands of files with highly overlapping topics. It’s not a surprise that the agent might jump from one topic to another, or get confused about what we’re working on at any given time. When I asked the agent to help me structure a paragraph about agentic workflows, I didn’t want it pulling in notes from my jazz guitar practice.

I could have created a new, isolated vault just for my blog writing. I tried that briefly, but I immediately found myself copying data back and forth. I was duplicating Readwise syncs, moving research papers, and fracturing my knowledge base. That wasn’t efficient, and it certainly wasn’t fun. The problem wasn’t that the agent could see too much. The problem was glare. I needed sunglasses, not blinders. I needed to force the agent to lock in.

So, I built Projects in Gemini Scribe.

A project defines scope without acting as a gatekeeper

Fundamentally, a project in Gemini Scribe is a way to focus the agent’s attention without locking it out of anything. It defines a primary area of work, but the rest of the vault is still there. Think of it like sitting at a desk in the engineering section of a library. Those are the shelves you browse by default, the ones within arm’s reach. But if you know the call number for a book in the history section, nobody stops you from walking over and grabbing it. You can even leave a stack of books from other sections on your desk ahead of time if you know you’ll need them. If you’ve followed along with the evolution of Scribe from plugin to platform, you’ll recognize this as a natural extension of the agent’s growing capabilities.

The core mechanism is remarkably simple. Any Markdown file in your vault can become a project by adding a specific tag to its YAML frontmatter.

```yaml
---
tags:
  - gemini-scribe/project
name: Letters From Silicon Valley
skills:
  - writing-coach
permissions:
  delete_file: deny
---
```

Once tagged, that file’s parent directory becomes the project root. From that point on, when an agent session is linked to the project, its discovery tools are automatically scoped to that directory and its subfolders. Under the hood, the plugin intercepts the agent’s calls to discovery tools like list_files and find_files_by_content, transparently prepending the project root to the search paths. The practical difference is immediate. Before projects, I could be working on a blog post about agent memory systems and the agent would surface notes from a completely unrelated project that happened to use similar terminology. Now I can load up a project and work with the agent hand in hand, confident it won’t get distracted by similar ideas or overlapping vocabulary from other corners of the vault.
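The path-rewriting step can be sketched in a few lines. This is a simplified illustration, not the plugin’s actual implementation; `scopeToProject` is a hypothetical name.

```typescript
// Illustrative sketch of project scoping: rewrite a discovery tool's
// search path so it stays inside the project root. Hypothetical code,
// not Gemini Scribe's actual internals.
import * as path from "node:path";

function scopeToProject(projectRoot: string, searchPath: string): string {
  const scoped = path.posix.join(projectRoot, searchPath);
  // Refuse paths that try to escape the project root via "..".
  if (scoped !== projectRoot && !scoped.startsWith(projectRoot + "/")) {
    throw new Error(`Path escapes project root: ${searchPath}`);
  }
  return scoped;
}
```

A search for `drafts/2026` inside a project rooted at `Blog` becomes a search of `Blog/drafts/2026`, while a relative path that climbs out of the project is rejected rather than silently widened.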

The project file serves as both configuration and context

The project file itself serves a dual purpose. It acts as both configuration and context. The frontmatter handles the configuration, allowing me to explicitly limit which skills the agent can use or override global permission settings. For example, denying file deletions for a critical writing project is a simple but effective safety net. But the real power is in customizing the agent’s behavior per project. For my creative writing, I actually don’t want the agent to write at all. I want it to read, critique, and discuss, but the words on the page need to be mine. Projects let me turn off the writing skill entirely for that context while leaving it fully enabled for my blog work. The same agent, shaped differently depending on what I’m working on.

Everything below the frontmatter is treated as context. Whatever I write in the body of the project note is injected directly into the agent’s system prompt, acting much like an additional, localized set of instructions. The global agent instructions are still respected, but the project instructions provide the specific context needed for that particular workspace. This is similar in spirit to how I’ve previously discussed treating prompts as code, where the instructions you give an agent deserve the same rigor and iteration as any other piece of software.
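The composition itself is simple enough to sketch. Again, this is an illustration of the idea rather than the plugin’s real code; `buildSystemPrompt` and the section header are hypothetical.

```typescript
// Illustrative sketch: compose the agent's system prompt from the
// global instructions plus the project note's body. Hypothetical
// names, not Gemini Scribe's actual implementation.
function buildSystemPrompt(
  globalInstructions: string,
  projectBody?: string,
): string {
  // With no linked project, the global instructions stand alone.
  if (!projectBody || projectBody.trim() === "") return globalInstructions;
  // Otherwise the project body rides along as a localized addendum.
  return [globalInstructions, "## Project instructions", projectBody.trim()].join("\n\n");
}
```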

This is where the sunglasses metaphor really holds. The agent’s discovery tools, things like list_files and find_files_by_content, are scoped to the project folder. That’s the glare reduction. But the agent’s ability to read files is completely unrestricted. If I am working on a technical post and need to reference a specific architectural note stored in my main Notes folder, I have two options. I can ask the agent to go grab it, or I can add a wikilink or embed to the project file’s body and the agent will have it available from the start. One is like walking to the history section yourself. The other is like leaving that book on your desk before you sit down. Either way, the knowledge is accessible. The project just keeps the agent from rummaging through every shelf on its own. This builds directly on the concepts of agent attention I explored in Managing AI Agent Attention.

Session continuity keeps the agent focused across your vault

One of the more powerful aspects of this system is how it interacts with session memory. When I start a new chat, Gemini Scribe looks at the active file. If that file lives within a project folder, the session is automatically linked to that project. This is a direct benefit of the supercharged chat history work that landed earlier in the plugin’s life.
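The lookup behind that linkage amounts to finding the project folder that contains the active file. A minimal sketch, with hypothetical names and the added assumption that nested projects resolve to the most specific root:

```typescript
// Illustrative sketch of session-to-project linking: pick the deepest
// project root that contains the active file. Hypothetical code, not
// Gemini Scribe's actual internals.
function findProjectRoot(
  activeFile: string,
  projectRoots: string[],
): string | null {
  const matches = projectRoots.filter(
    (root) => activeFile === root || activeFile.startsWith(root + "/"),
  );
  if (matches.length === 0) return null;
  // Prefer the most specific project when project folders nest.
  return matches.sort((a, b) => b.length - a.length)[0];
}
```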

This linkage is stable for the lifetime of the session. I can navigate around my vault, opening files completely unrelated to the project, and the agent will remain focused on the project’s context and instructions. This means I don’t have to constantly remind the agent of the rules of the road. The project configuration persists across the entire conversation.

Furthermore, session recall allows the agent to look back at past conversations. When I ask about prior work or decisions related to a specific project, the agent can search its history, utilizing the project linkage to find the most relevant past interactions. This creates a persistent working environment that feels much more like a collaboration than a simple transaction.

Structuring projects effectively requires a few simple practices

To get the most out of projects, I’ve found a few practices to be particularly effective.

First, lean into the folder-based structure. Place the project file at the root of the folder containing the relevant work. Everything underneath it is automatically in scope. This feels natural if you already organize your vault by topic or project, which many Obsidian users do.

Second, start from the defaults and adjust as the project demands. Out of the box, a new project inherits the agent’s standard skills and permissions, which is a sensible baseline for most work. From there, you tune. If you find the agent reaching for tools that don’t make sense in a given context, narrow the allowed skills in the frontmatter. If a project needs extra safety, tighten the permissions. The creative writing example I mentioned earlier came about exactly this way. I started with the defaults, realized I wanted the agent as a reader and critic rather than a co-writer, and adjusted accordingly. This aligns with the broader principle I’ve written about when discussing building responsible agents: the right guardrails are the ones shaped by the actual work.

Finally, treat the project body as a living document. As the project evolves, update the instructions and external links to ensure the agent always has the most current and relevant context. It’s a simple mechanism, but it fundamentally changes how I interact with an AI embedded in a large knowledge base. It allows me to keep my single, massive vault intact, while giving the agent the precise focus it needs to be genuinely helpful.


Gemini Scribe: From Agent to Platform

Six months ago, I wrote about building Agent Mode for Gemini Scribe from a hotel room in Fiji. That post ended with a sense of possibility. The agent could read your notes, search the web, and edit files. It was, by the standards of the time, pretty remarkable. I remember watching it chain together a sequence of tool calls for the first time and thinking I’d built something meaningful.

I had no idea it was just the beginning.

In the six months since that post, Gemini Scribe has gone through fifteen releases, from version 3.3 to 4.6. There have been over 400 commits, a complete architectural rethinking, and a transformation from “a chat plugin with an agent mode” into something I can only describe as a platform. The agent didn’t just get better. It got a memory, a research department, a set of extensible skills, and the ability to talk to external tools through the Model Context Protocol. If the vacation version was a clever assistant, this version is closer to a collaborator who actually understands your vault.

I want to walk through how we got here, because the journey reveals something I think is important about building with AI right now: the hardest problems aren’t the ones you set out to solve. They’re the ones that reveal themselves only after you ship the first version and start living with it.

The Agent Grows Up

The first big milestone after the vacation was version 4.0, released in November 2025. This was the release where I made a decision that felt risky at the time: I removed the old note-based chat entirely. No more dual modes, no more confusion about which interface to use. Everything became agent-first. Every conversation had tool calling built in. Every session was persistent.

It sounds simple in hindsight, but killing a feature that works is one of the hardest decisions in software. The old chat mode was comfortable. People used it. But it was holding back the entire plugin, because every new feature had to work in two completely different paradigms. Ripping it out was liberating. Suddenly I could focus all my energy on making one experience truly great instead of maintaining two mediocre ones.

Alongside 4.0, I built the AGENTS.md system, a persistent memory file that gives the agent an overview of your entire vault. When you initialize it, the agent analyzes your folder structure, your naming conventions, your tags, and the relationships between your notes. It writes all of this down in a file that persists across sessions. The result is that the agent doesn’t start every conversation from scratch. It already knows how your vault is organized, where you keep your research, and what projects you’re working on. It’s the difference between hiring a new intern every morning and having a colleague who’s been on the team for months.

Seeing and Searching

Version 4.1 brought something I’d wanted since the beginning: real thinking model support. When Google released Gemini 2.5 Pro and later Gemini 3 with extended thinking capabilities, I added a progress indicator that shows you the model’s reasoning in real time. You can watch it think through a problem, see it plan its approach, and understand why it chose a particular tool. It sounds like a small UI feature, but it fundamentally changes your relationship with the agent. You stop treating it like a black box and start treating it like a thinking partner whose process you can follow.

That same release added a stop button (which sounds trivial until you’re watching an agent go on a tangent and have no way to interrupt it), dynamic example prompts that are generated from your actual vault content, and multilingual support so the agent responds in whatever language you write in.

But the real game-changer came in version 4.2 with semantic vault search. I wrote about the magic of embeddings over a year ago, and this feature is that idea fully realized inside Obsidian. It uses Google’s File Search API to index your entire vault in the background. Once indexed, the agent can search by meaning, not just keywords. If you ask it to “find my notes about the trade-offs of microservices,” it will surface relevant notes even if they never use the word “microservices.” It understands that a note titled “Why We Split the Monolith” is probably relevant.
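The “search by meaning” idea boils down to comparing embedding vectors rather than keywords. The real feature delegates indexing and retrieval to Google’s File Search API; the toy sketch below, with made-up two-dimensional vectors, only illustrates the ranking principle.

```typescript
// Toy sketch of meaning-based ranking via cosine similarity. The real
// plugin uses Google's File Search API; these tiny vectors are made-up
// stand-ins for high-dimensional embeddings.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank note titles by similarity between the query embedding and each
// note's embedding, most similar first.
function rankNotes(
  query: number[],
  notes: { title: string; embedding: number[] }[],
): string[] {
  return [...notes]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .map((n) => n.title);
}
```

With real embeddings, a query about microservice trade-offs lands near “Why We Split the Monolith” in vector space even though the words never overlap.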

The indexing runs in the background, handles PDFs and attachments, and can be paused and resumed. Getting the reliability right was one of the more frustrating engineering challenges of the whole project. There were weeks of debugging race conditions, handling rate limits gracefully, and making sure a crash mid-index didn’t corrupt the cache. Version 4.2.1 was almost entirely dedicated to stabilizing the indexer, adding incremental cache saves and automatic retry logic. It’s the kind of work that nobody sees but everyone benefits from.

Images, Research, and the Expanding Toolbox

Version 4.3, released in January 2026, added multimodal image support. You can now paste or drag images directly into the chat, and the agent can analyze them, describe them, or reference them in notes it creates. The image generation tool, which I’d been building in the lead-up to 4.3, lets the agent create images on demand using Google’s Imagen models. There’s even an AI-powered prompt suggester that helps you describe what you want if you’re not sure how to phrase it.

That release also introduced two new selection-based actions: Explain Selection and Ask About Selection. These join the existing Rewrite feature to give you a full right-click menu for working with selected text. It sounds like a small addition, but in practice these micro-interactions are where people spend most of their time. Being able to highlight a paragraph, right-click, and ask “What’s the logical flaw in this argument?” without leaving your note is the kind of frictionless experience I’m always chasing.

Then came deep research in version 4.4. This is fundamentally different from the regular Google Search tool. Where a search returns quick snippets, deep research performs multiple rounds of investigation, reading and cross-referencing sources, synthesizing findings, and producing a structured report with inline citations. It can combine web sources with your own vault notes, so the output reflects both what the world knows and what you’ve already written. A single research request takes several minutes, but what you get back is closer to what a research assistant would produce after an afternoon in the library.

I built this on top of my gemini-utils library, which is a separate project I created to share common AI functionality across all of my TypeScript Gemini projects, including Gemini Scribe, my Gemini CLI deep research extension, and more. Having that shared foundation means deep research improvements benefit every project simultaneously.

Opening the Platform

If I had to pick the release that transformed Gemini Scribe from a plugin into a platform, it would be version 4.5. This is where MCP server support and the agent skills system arrived.

MCP, the Model Context Protocol, is an open standard that lets AI applications connect to external tool providers. In practical terms, it means Gemini Scribe can now talk to tools that I didn’t build. You can connect a filesystem server, a GitHub integration, a Brave Search provider, or anything else that speaks MCP. The plugin supports both local stdio transport (spawning a process on your desktop) and HTTP transport with full OAuth authentication, which means it works on mobile too. When you connect an MCP server, its tools appear alongside the built-in vault tools, with the same confirmation flow and safety features.
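For concreteness, here is what wiring up the two transports might look like as a configuration sketch. The field names are illustrative, not Gemini Scribe's actual settings schema; the stdio entry spawns a local process (here, the real `@modelcontextprotocol/server-github` package), while the HTTP entry points at a remote server behind OAuth.

```json
{
  "mcpServers": {
    "github": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    },
    "search": {
      "transport": "http",
      "url": "https://example.com/mcp",
      "auth": "oauth"
    }
  }
}
```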

This was the moment the plugin stopped being a closed system. Instead of me having to build every integration myself, the entire MCP ecosystem became available. Someone who needs to query a database from their notes can connect a database MCP server. Someone who wants to interact with their GitHub issues can connect the GitHub server. The plugin becomes a hub rather than a destination.

The agent skills system, which follows the open agentskills.io specification, takes a similar approach to extensibility but for knowledge rather than tools. A skill is a self-contained instruction package that gives the agent specialized expertise. You can create a “meeting-notes” skill that teaches it your preferred format for processing meetings, or a “code-review” skill with your team’s specific standards. Skills use progressive disclosure, so the agent always knows what’s available but only loads the full instructions when it activates one. This keeps conversations focused while making specialized knowledge available on demand.
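Concretely, a skill under the agentskills.io layout is (as I understand the spec) a folder whose SKILL.md carries YAML frontmatter: the name and description stay visible to the agent at all times, while the instruction body below loads only when the skill activates. A hypothetical meeting-notes skill might look like:

```markdown
---
name: meeting-notes
description: Formats and files meeting notes using my preferred template.
---

When asked to process a meeting note:

1. Extract the date, attendees, decisions, and action items.
2. Rewrite the note under those four headings.
3. Tag it #meeting and file it under the Meetings folder.
```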

Version 4.5 also migrated API key storage to Obsidian’s SecretStorage, which uses the OS keychain. Your API key is no longer sitting in a plain JSON file in your vault. It’s a small change that matters a lot for security, especially for people who sync their vaults to cloud storage or version control.

Managing the Conversation

The most recent release, version 4.6, tackles a problem that only becomes apparent after you’ve been using an agent for a while: conversations get long, and long conversations hit token limits.

The solution is automatic context compaction, a direct answer to the attention management challenge I explored in the Agentic Shift series. When a conversation approaches the model’s token limit, the plugin automatically summarizes older turns to make room for new ones. There’s also an optional live token counter that shows you exactly how much of the context window you’re using, with a breakdown of cached versus new tokens. It’s the kind of visibility that helps you understand why the agent might be “forgetting” things from earlier in the conversation and gives you the information to manage it.
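The mechanism is easier to see in code. Here is a deliberately simplified sketch of the idea, not the plugin's actual implementation: token counts are approximated by character length, and the "summary" is a placeholder where a real system would ask the model to summarize the turns being folded away.

```typescript
// Sketch of automatic context compaction (illustrative only).
type Turn = { role: "user" | "model"; text: string };

// Rough token estimate: ~4 characters per token.
const approxTokens = (t: Turn) => Math.ceil(t.text.length / 4);

// While the history exceeds `limit` tokens, fold the two oldest turns
// into a single synthetic summary turn, repeating until it fits.
function compact(history: Turn[], limit: number): Turn[] {
  let turns = [...history];
  while (
    turns.reduce((n, t) => n + approxTokens(t), 0) > limit &&
    turns.length > 2
  ) {
    const oldest = turns.slice(0, 2);
    const summary: Turn = {
      role: "model",
      text: `[summary of ${oldest.length} earlier turns]`,
    };
    turns = [summary, ...turns.slice(2)];
  }
  return turns;
}
```

The important property is that recent turns survive verbatim while older ones degrade gracefully into summaries, which is exactly the trade-off the token counter makes visible.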

This release also added a per-tool permission policy system, which is the practical realization of the guardrails philosophy I wrote about in the Agentic Shift series. Instead of the binary choice between “confirm everything” and “confirm nothing,” you can now set individual tools to allow, deny, or ask-every-time. There are presets too: Read Only, Cautious, Edit Mode, and (for the brave) YOLO mode, which lets the agent execute everything without asking. I use Cautious mode myself, which auto-approves reads and searches but asks before any file modifications. It strikes a balance between speed and safety that feels right for daily use.
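In spirit, the policy is just a map from tool name to decision, plus a fallback for tools the policy doesn't mention. A minimal sketch (the names are illustrative, not the plugin's actual API):

```typescript
// Sketch of a per-tool permission policy (illustrative names).
type Decision = "allow" | "deny" | "ask";
type Policy = Record<string, Decision>;

// Something like the "Cautious" preset: reads and searches are
// auto-approved, modifications require confirmation.
const cautious: Policy = {
  read_file: "allow",
  search_vault: "allow",
  write_file: "ask",
  move_file: "ask",
  delete_file: "deny",
};

// Unknown tools fall back to the safest interactive option.
function decide(policy: Policy, tool: string, fallback: Decision = "ask"): Decision {
  return policy[tool] ?? fallback;
}
```

The presets then reduce to different starting maps over the same decision function.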

What I’ve Learned

Building Gemini Scribe has taught me something I keep coming back to in this blog: the most interesting work happens at the intersection of AI capabilities and human workflows. The technical challenges (semantic indexing, MCP integration, context compaction) are real, but they’re in service of a simple goal: making the AI useful enough that you forget it’s there.

The plugin now has users like Paul O’Malley building entire self-organizing knowledge systems on top of it. Seeing that kind of creative adoption is what keeps me building. Every feature request, every bug report, every surprising use case reveals another facet of what’s possible when you give a capable AI agent the right set of tools and the right context.

If you’re curious, Gemini Scribe is available in the Obsidian Community Plugins directory. All you need is a free Google Gemini API key. I’d love to hear what you build with it.

Great Video on Gemini Scribe and Obsidian

I was recently looking through the feedback in the Gemini Scribe repository when I noticed a few insightful comments from a user named Paul O’Malley. Curiosity got the better of me (I love seeing who is actually pushing the boundaries of the tools I build), so I took a look at his YouTube channel. I quickly found myself deep in a walkthrough titled “I Built a Second Brain That Organises Itself.”

What caught my eye wasn’t just another productivity system; we’ve all seen the “shiny new app” cycle that leads to digital bankruptcy. It was seeing Gemini Scribe used as the engine for a fully automated Obsidian vault.

The Friction of Digital Maintenance

Paul hits on a fundamental truth: most systems fail because the friction of maintenance—the tagging, the filing, the constant admin—eventually outweighs the benefit. He argues that what we actually need is a system that “bridges the gap in our own executive function”.

In his setup, he uses Obsidian as the chassis because it relies on Markdown. I’ve long believed that Markdown is the native language of AI, and seeing it used here to create a “seamless bridge” between messy human thoughts and structured AI processing was incredibly satisfying.

Gemini Scribe as the Engine

It was a bit surreal to watch Paul walk through the installation of Gemini Scribe as the core engine for this self-organizing brain. He highlights a few features that I poured a lot of heart into:

  • Session History as Knowledge: By saving AI interactions as Markdown files, they become a searchable part of your knowledge base. You can actually ask the AI to reflect on past conversations to find patterns in your own thinking.
  • The Setup Wizard: He uses a “Setup Wizard” to convert the AI from a generic chatbot into a specialized system administrator. Through a conversational interview, the agent learns your profession and hobbies to tailor a project taxonomy (like the PARA method) specifically to you.
  • Agentic Automation: The video demonstrates the “Inbox Processor,” where the AI reads a raw note, gives it a proper title, applies tags, and physically moves it to the right folder.

Beyond the Tool: A Human in the Loop

One thing Paul emphasized that really resonated with my own philosophy of Guiding the Agent’s Behavior is the “Human in the Loop”. When the agent suggests a change or creates a new command, it writes to a staging file first.

As Paul puts it, you are the boss and the AI is the junior employee—it can draft the contract, but you have to sign it before it becomes official. You always remain in control of the files that run your life.

Small Tools, Big Ideas

Seeing the Gemini CLI mentioned as a “cleaner and slightly more powerful” alternative for power users was another nice nod. It reinforces the idea that small, sharp tools can be composed into something transformative.

Building tools in a vacuum is one thing, but seeing them live in the wild, helping someone clear their “mental RAM” and close their loop at the end of the day, is one of the reasons I do this. It’s a reminder that the best technology doesn’t try to replace us; it just makes the foundations a little sturdier.

A photorealistic image shows an old wooden-handled hammer on a cluttered workbench transforming into a small, multi-armed mechanical robot with glowing blue eyes, holding various miniature tools.

Everything Becomes an Agent

I’ve noticed a pattern in my coding life. It starts innocently enough. I sit down to write a simple Python script, maybe something to tidy up my Obsidian vault or a quick CLI tool to query an API. “Keep it simple,” I tell myself. “Just input, processing, output.”

But then, the inevitable thought creeps in: It would be cool if the model could decide which file to read based on the user’s question.

Two hours later, I’m not writing a script anymore. I’m writing a while loop. I’m defining a tools array. I’m parsing JSON outputs and handing them back to the model. I’m building memory context windows.

I’m building an agent. Again.

(For those keeping track: my working definition of an “agent” is simple: a model running in a loop with access to tools. I explored this in depth in my Agentic Shift series, but that’s the core of it.)
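That definition fits in a few lines. Here's a deliberately minimal sketch with a stubbed model; a real implementation would call the Gemini API and parse its function-call responses, and the `read_file` tool here is a placeholder.

```typescript
// Minimal "model in a loop with tools" sketch (illustrative).
type ToolCall = { name: string; args: Record<string, string> };
type ModelStep = { toolCall?: ToolCall; text?: string };

// The tool registry: names and implementations the model may invoke.
const tools: Record<string, (args: Record<string, string>) => string> = {
  read_file: (args) => `contents of ${args.path}`,
};

// The loop: ask the model for a step, execute any requested tool,
// feed the result back as context, repeat until it answers in text.
function runAgent(model: (history: string[]) => ModelStep): string {
  const history: string[] = [];
  for (let i = 0; i < 10; i++) {          // hard cap on iterations
    const step = model(history);
    if (step.text) return step.text;       // final answer
    if (step.toolCall) {
      const result = tools[step.toolCall.name](step.toolCall.args);
      history.push(result);                // tool result becomes context
    }
  }
  return "(gave up)";
}
```

Everything else, memory management, permissions, loop detection, is refinement layered on top of this loop.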

As I sit here writing this in January of 2026, I realize that almost every AI project I worked on last year ultimately became an agent. It feels like a law of nature: Every AI project, given enough time, converges on becoming an agent. In this post, I want to share some of what I’ve learned, and the cases where you might skip the intermediate steps and jump straight to building an agent.

The Gravitational Pull of Autonomy

This isn’t just feature creep. It’s a fundamental shift in how we interact with software. We are moving past the era of “smart typewriters” and into the era of “digital interns.”

Take Gemini Scribe, my plugin for Obsidian. When I started, it was a glorified chat window. You typed a prompt, it gave you text. Simple. But as I used it, the friction became obvious. If I wanted Scribe to use another note as context for a task, I had to take a specific action, usually creating a link to that note from the one I was working on, to make sure it was considered. I was managing the model’s context manually.

I was the “glue” code. I was the context manager.

The moment I gave Scribe access to the read_file tool, the dynamic changed. Suddenly, I wasn’t micromanaging context; I was giving instructions. “Read the last three meeting notes and draft a summary.” That’s not a chat interaction; that’s a delegation. And to support delegation, the software had to become an agent, capable of planning, executing, and iterating.

From Scripts to Sudoers

The Gemini CLI followed a similar arc. There were many of us on the team experimenting with Gemini on the command line. I was working on iterative refinement, where the model would ask clarifying questions to create deeper artifacts. Others were building the first agentic loops, giving the model the ability to run shell commands.

Once we saw how much the model could do with even basic tools, we were hooked. Suddenly, it wasn’t just talking about code; it was writing and executing it. It could run tests, see the failure, edit the file, and run the tests again. It was eye-opening how much we could get done as a small team.

But with great power comes great anxiety. As I explored in my Agentic Shift post on building guardrails and later in my post about the Policy Engine, I found myself staring at a blinking cursor, terrified that my helpful assistant might accidentally rm -rf my project.

This is the hallmark of the agentic shift: you stop worrying about syntax errors and start worrying about judgment errors. We had to build a “sudoers” file for our AI, a permission system that distinguishes between “read-only exploration” and “destructive action.” You don’t build policy engines for scripts; you build them for agents.

The Classifier That Wanted to Be an Agent

Last year, I learned to recognize a specific code smell: the AI classifier.

In my Podcast RAG project, I wanted users to search across both podcast descriptions and episode transcripts. Different databases, different queries. So I did what felt natural: I built a small classifier using Gemini Flash Lite. It would analyze the user’s question and decide: “Is this a description search or a transcript search?” Then it would call the appropriate function.

It worked. But something nagged at me. I had written a classifier to make a decision that a model is already good at making. Worse, the classifier was brittle. What if the user wanted both? What if their intent was ambiguous? I was encoding my assumptions about user behavior into branching logic, and those assumptions were going to be wrong eventually.

The fix was almost embarrassingly simple. I deleted the classifier and gave the agent two tools: search_descriptions and search_episodes. Now, when a user asks a question, the agent decides which tool (or tools) to use. It can search descriptions first, realize it needs more detail, and then dive into transcripts. It can do both in parallel. It makes the call in context, not based on my pre-programmed heuristics. (You can try it yourself at podcasts.hutchison.org.)

I saw the same pattern in Gemini Scribe. Early versions had elaborate logic for context harvesting, code that tried to predict which notes the user would need based on their current document and conversation history. I was building a decision tree for context, and it was getting unwieldy.

When I moved Scribe to a proper agentic architecture, most of that logic evaporated. The agent didn’t need me to pre-fetch context; it could use a read_file tool to grab what it needed, when it needed it. The complex anticipation logic was replaced by simple, reactive tool calls. The application got simpler and more capable at the same time.

Here’s the heuristic I’ve landed on: If you’re writing if/else logic to decide what the AI should do, you might be building a classifier that wants to be an agent. Deconstruct those branches into tools, give the agent really good descriptions of what those tools can do, and then let the model choose its own adventure.
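To make that concrete, here is roughly what the refactor looks like: the two branches of the old classifier become two tool declarations, and the routing logic disappears into the descriptions. The schema shape follows the general function-calling pattern; exact field names vary by SDK, so treat this as a sketch rather than a specific API.

```typescript
// Two tool declarations replacing an intent classifier (illustrative).
// The descriptions do the routing work the if/else used to do.
const toolDeclarations = [
  {
    name: "search_descriptions",
    description:
      "Search podcast descriptions. Best for finding shows about a topic.",
    parameters: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
  {
    name: "search_episodes",
    description:
      "Search episode transcripts. Best for finding specific quotes or details.",
    parameters: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
];
```

The quality of those description strings is the new engineering surface: a vague description recreates the brittleness the classifier had, just in a different place.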

You might be thinking: “What about routing queries to different models? Surely a classifier makes sense there.” I’m not so sure anymore. Even model routing starts to look like an orchestration problem, and a lightweight orchestrator with tools for accessing different models gives you the same flexibility without the brittleness. The question isn’t whether an agent can make the decision better than your code. It’s whether the agent, with access to the actual data in the moment, can make a decision at least as good as the one you were trying to predict when you wrote the code. The agent has context you don’t have at development time.

The “Human-on-the-Loop”

We are transitioning from Human-in-the-Loop (where we manually approve every step) to Human-on-the-Loop (where we set the goals and guardrails, but let the system drive).

This shift is driven by a simple desire: we want partners, not just tools. As I wrote back in April about waiting for a true AI coding partner, a tool requires your constant attention. A hammer does nothing unless you swing it. But an agent? An agent can work while you sleep.

This freedom comes with a new responsibility: clarity. If your agent is going to work overnight, you need to make sure it’s working on something productive. You need to be precise about the goal, explicit about the boundaries, and thoughtful about what happens when things go wrong. Without the right guardrails, an agent can get stuck waiting for your input, and you’ll lose that time. Or worse, it can get sidetracked and spend hours on something that wasn’t what you intended.

The goal isn’t to remove the human entirely. It’s to move us from the execution layer to the supervision layer. We set the destination and the boundaries; the agent figures out the route. But we have to set those boundaries well.

Embracing the Complexity (Or Lack Thereof)

Here’s the counterintuitive thing: building an agent isn’t always harder than building a script. Yes, you have to think about loops, tool definitions, and context window management. But as my classifier example showed, an agentic architecture can actually delete complexity. All that brittle branching logic, all those edge cases I was trying to anticipate: gone. Replaced by a model that can reason about what it needs in the moment.

The real complexity isn’t in the code; it’s in the trust. You have to get comfortable with a system that makes decisions you didn’t explicitly program. That’s a different kind of engineering challenge, less about syntax, more about guardrails and judgment.

But the payoff is a system that grows with you. A script does exactly what you wrote it to do, forever. An agent does what you ask it to do, and sometimes finds better ways to do it than you’d considered.

So, if you find yourself staring at your “simple script” and wondering if you should give it a tools definition… just give in. You’re building an agent. It’s inevitable. You might as well enjoy the company.

A cute cartoon purple bear mascot is on a golden ribbon with "Gemini Scribe" written on it. The background is a collage of two photos: the top half shows the Sydney Opera House at sunset, and the bottom half shows a laptop on a table by a pool with the ocean in the distance.

What I Did On My Summer Vacation

Every year, like clockwork, the first assignment back at school was the same: a short essay on what you did over the summer. It was a ritual of sorts, a gentle reentry into the world of homework and deadlines, usually accompanied by a gallery of crayon drawings of camping trips and beach outings.

My summer had all the makings of a classic entry. There was a trip to Australia and Fiji. I could write about the impossible blue of the water in the South Pacific, or the iconic silhouette of the Sydney Opera House against a setting sun. I have the photos to prove it. It was, by all accounts, a proper vacation.

But if I’m being honest, my most memorable trip wasn’t to a beach or a city. It was a two-week detour into the heart of my own code, building something that had been quietly nagging at me for months. While my family slept and the ocean hummed outside our window, I was on a different kind of adventure: one that took place entirely on my laptop, fueled by hotel coffee and a persistent idea I couldn’t shake. I was building an agent for Gemini Scribe.

The Genesis of an Idea

So why spend a vacation hunched over a keyboard? Because an idea was bothering me. The existing chat mode in Gemini Scribe was useful, but it was fundamentally limited. It operated on a simple, one-shot basis: you’d ask a question, and it would give you an answer. It was a powerful tool for quick queries or generating text, but it wasn’t a true partner in the writing process. It was like having a brilliant research assistant who had no short-term memory.

My work on the Gemini CLI was a huge part of this. As we described in our announcement post, we built the CLI to be a powerful, open-source AI agent for developers. It brings a conversational, tool-based experience directly to the terminal, and it’s brilliant at what it does. But its success made me wonder: what would an agent look like if it wasn’t built for a developer’s terminal, but for a writer’s notebook?

I imagined an experience that was less about executing discrete commands and more about engaging in a continuous, creative dialogue. The CLI is perfect for scripting and automation, but I wanted to build an agent that could handle the messy, iterative, and often unpredictable process of thinking and writing. I needed a sandbox to explore these ideas—a place to build and break things without disrupting the focused, developer-centric mission of the Gemini CLI.

Gemini Scribe was the perfect answer. It was my own personal lab. I wanted to be able to give it complex, multi-step tasks that mirrored how I actually work, like saying, “Read these three notes, find the common themes, and then use that to draft an outline in this new file.” With the old system, that was impossible. I was the human glue, copying and pasting, managing the context, and stitching together the outputs from a dozen different prompts. The AI was smart, but it couldn’t act.

It was this friction, this gap between what the tool was and what it could be, that I couldn’t let go of. It wasn’t just about adding a new feature; it was about fundamentally changing my relationship with the software. I didn’t want a tool I could command; I wanted a partner I could collaborate with. And so, with the Pacific as my backdrop, I started to build it.

A Creative Detour in Paradise

This wasn’t a frantic sprint. It was the opposite: a project defined by having the time and space to explore. Looking back at the commit history from July is like re-watching a time-lapse of a building being constructed, but one with very civilized hours. The work began in earnest on July 7th with the foundational architecture, built during the quiet early mornings in our Sydney hotel room while my family was still asleep.

A panoramic view of the Sydney skyline at sunset, featuring the Sydney Opera House and surrounding waterfront, with boats on the harbor and city lights beginning to illuminate.

By July 11th, the project had found its rhythm. That was the day the agent got its hands, with the first real tools like google_search and move_file. I remember a focused afternoon of debugging, patiently working through the stubborn formatting requirements of the Google AI SDK’s functionDeclarations. There was no rush, just the satisfying puzzle of getting it right.

Much of the user experience work happened during downtime. From a lounge chair by the beach in Fiji on July 15th, I implemented the @mention system to make adding files to the agent’s context feel more natural. I built a collapsible context panel and polished the session history, all with the freedom to put the laptop down whenever I got tired or frustrated.

A laptop displaying the word 'GEMINI' on its screen, placed on a wooden table with a view of the ocean and palm trees in the background.

Of course, some challenges required deeper focus. On July 16th, I had to build a LoopDetector—a crucial safety net to keep the agent from getting stuck in an infinite execution cycle. I remember wrestling with that logic while looking out over the ocean, a surreal but incredibly motivating environment. The following days were spent calmly adding session-level settings and permissions.
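The core idea of a loop detector is simple: fingerprint each tool call and flag the run when the same fingerprint repeats too many times in a row. A sketch of that idea, not the actual implementation:

```typescript
// Sketch of a loop detector for an agent's tool calls (illustrative).
// Flags the run if an identical call (name + serialized args) repeats
// more than `maxRepeats` times consecutively.
class LoopDetector {
  private last = "";
  private streak = 0;

  constructor(private readonly maxRepeats = 3) {}

  // Returns true when the agent appears stuck in a cycle.
  record(name: string, args: unknown): boolean {
    const key = `${name}:${JSON.stringify(args)}`;
    this.streak = key === this.last ? this.streak + 1 : 1;
    this.last = key;
    return this.streak > this.maxRepeats;
  }
}
```

A real version might also catch longer cycles (A, B, A, B) rather than only exact consecutive repeats, but even this simple check prevents the worst infinite loops.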

The final phase was about patiently testing and documenting. I wrote dozens of tests, updated the README, and fixed the small bugs that only reveal themselves through use. It was the process of turning a fun exploration into a polished, reliable feature. The first time I gave it a truly complex task—and watched it work, step-by-step, without a single hiccup—was the “aha!” moment. It felt like magic, born not from pressure, but from possibility.

What Agent Mode Really Is

So, what did all that creative exploration actually create? Agent Mode is a persistent, conversational partner for your writing. Instead of a one-off command, you now have a continuous session where the AI remembers what you’ve discussed and what it has done. It’s a research assistant and a writing partner rolled into one.

You can give it high-level goals, and it will figure out the steps to get there. It uses its tools to read your notes, search the web for new information, and even edit your files directly. When you give it a task, you can see its plan, watch it execute each step, and see the results in real-time.

It’s the difference between asking a librarian for a single book and having them join you at your table to help you research and write your entire paper. You can ask it to do things like, “Review my last three posts on AI, find the common threads, and draft an outline for a new post that combines those key themes.” Then you can watch it happen, all within your notes.

The Best Souvenirs

In the end, I came back with a tan and a camera roll full of beautiful photos. But the best souvenir from my trip was the one I built myself. For those of us who love to create, sometimes the most restorative thing you can do on a vacation is to find the time and space to build something you’re truly passionate about. It’s a reminder that the most exciting frontiers aren’t always on a map.

Agent Mode is now available in the latest version of Gemini Scribe. I’m incredibly excited about the new possibilities it opens up, and I can’t wait to see what you do with it. Please give it a try, and come join the conversation on GitHub to share your feedback and ideas. I’d love to hear what you think.

A cheerful, cartoon-style purple bear with a large head and big eyes is sitting at a desk, happily using a computer with a text editor open on the screen. A section of the text is highlighted.

A More Precise Way to Rewrite in Gemini Scribe

I’ve been remiss in posting updates, but I wanted to take a moment to highlight a significant enhancement to Gemini Scribe that streamlines the writing and editing process: the selection-based rewrite feature. This powerful tool replaced the previous full-file rewrite functionality, offering a more precise, intuitive, and safer way to collaborate with AI on your documents.

What’s New?

Instead of rewriting an entire file, you can now select any portion of your text and have the AI rewrite just that part based on your instructions. Whether you need to make a paragraph more concise, fix grammar in a sentence, or change the tone of a section, this new feature gives you surgical precision.

How It Works

Using the new feature is simple:

  1. Select the text you want to rewrite in your editor.
  2. Right-click on the selection and choose “Rewrite with Gemini” from the context menu, or trigger the command from the command palette.
  3. A dialog will appear showing you the selected text and asking for your instructions.
  4. Type in what you want to change (e.g., “make this more formal,” “simplify this concept,” or “fix spelling and grammar”), and the AI will get to work.
  5. The selected text is then replaced with the AI-generated version, while the rest of your document remains untouched.

Behind the scenes, the plugin sends the full content of your note to the AI for context, with special markers indicating the selected portion. This allows the AI to maintain the style, tone, and flow of your document, ensuring the rewritten text fits in seamlessly.
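A sketch of that prompt assembly, with illustrative marker strings rather than the plugin's actual format:

```typescript
// Sketch of building a selection-rewrite prompt (illustrative markers).
// The full note is sent for context; markers delimit the span to change.
function buildRewritePrompt(
  note: string,
  start: number,
  end: number,
  instruction: string
): string {
  const marked =
    note.slice(0, start) +
    "<<<SELECTION>>>" +
    note.slice(start, end) +
    "<<<END SELECTION>>>" +
    note.slice(end);
  return (
    "Rewrite only the text between the selection markers, " +
    "preserving the surrounding style and flow.\n" +
    `Instruction: ${instruction}\n\n${marked}`
  );
}
```

Sending the whole note rather than just the selection is what lets the model match tone and keep the rewritten span flowing into its neighbors.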

Why This is Better

The previous rewrite feature was an all-or-nothing affair, which could sometimes lead to unexpected changes or loss of content. This new selection-based approach is a major improvement for several reasons:

  • Precision and Control: You have complete control over what gets rewritten, down to a single word.
  • Safety: There’s no risk of accidentally overwriting parts of your document you wanted to keep.
  • Iterative Workflow: It encourages a more iterative and collaborative workflow. You can refine your document section by section, making small, incremental improvements.
  • Speed and Efficiency: It’s much faster to rewrite a small selection than an entire document, making the process more interactive and fluid.

This new feature is designed to feel like a natural extension of the editing process, making AI-assisted writing more of a partnership.

A Note on the ‘Rewrite’ Checkbox

I’ve received some feedback about the removal of the “rewrite” checkbox from the normal mode. I want to thank you for that feedback and address it directly. There are a couple of key reasons why I decided to remove this feature in favor of the new selection-based rewriting.

First, I found it difficult to get predictable results with the old mechanism. The model would sometimes overwrite the entire file unexpectedly, which made the feature unreliable and risky to use. I personally rarely used it for this reason.

Second, the new Agent Mode provides a much more reliable way to replicate the old functionality. If you want to rewrite an entire file, you can simply add the file to your Agent session and describe the changes you want the AI to make. The Agent will then edit the entire file for you, giving you a more controlled and predictable outcome.

While I understand that change can be disruptive, I’m confident that the new selection-based rewriting and the Agent Mode offer a superior and safer experience. I’m always looking for ways to improve the plugin, so please continue to share your thoughts and feedback on how you’re using the new features.

The Future is Agent-ic

Ultimately, over the next several iterations of Gemini Scribe, I’ll be moving more and more functionality to the Agent Mode and merging the experience from the existing Gemini Chat Mode into the Agent. I’m hoping that this addresses a lot of feedback I’ve received over the last nine months for this plugin and creates something that is even more powerful for interacting with your notes. More on Agent Mode in a coming post.

I’m really excited about this new direction for Gemini Scribe, and I believe it will make the plugin an even more powerful tool for writers and note-takers. Please give it a try and let me know what you think!

Gemini Scribe Supercharged: A Faster, More Powerful Workflow Awaits

It’s been a little while since I last wrote about Gemini Scribe, and that’s because I’ve been deep in the guts of the plugin, tearing things apart and putting them back together in ways that make the whole experience faster, smoother, and just plain better.

One of the first things that pushed me back into the code was the rhythm of the interaction itself. Every time I typed a prompt and hit enter, I found myself waiting—watching the spinner, watching the time pass, watching the thought in my head cool off while the AI gathered its response. It didn’t feel like a conversation. It felt like submitting a form.

That’s fixed now. As of version 2.2.0, Gemini Scribe streams responses in real-time. You see the words as they’re generated, line by line, without the long pause in between. It makes a difference. The back-and-forth becomes more fluid, more natural. It pulls you into the interaction rather than holding you at arm’s length. And once I started using it this way, I couldn’t go back.

But speed was only part of it. I also wanted more control. I’ve been using custom prompts more and more in my own workflow—not just as one-off instructions, but as reusable templates for different kinds of writing tasks. And the old prompt system, while functional, wasn’t built for that kind of use.

So I rewrote it.

Version 3.0.0 introduces a completely revamped custom prompt system. You can now create and manage your prompts right from the Command Palette. That means no more hunting through settings or copying from other notes—just hit the hotkey, type what you need, and move on. Prompts are now tracked in your chat history too, so you can always see exactly what triggered a particular response. It’s a small thing, but it brings a kind of transparency to the process that I’ve found surprisingly useful.

All of this is sitting on top of a much sturdier foundation than before. A lot of the internal work in these recent releases has been about making Gemini Scribe more stable and more integrated with the rest of the Obsidian ecosystem. Instead of relying on low-level file operations, the plugin now uses the official Obsidian APIs for everything. That shift makes it more compatible with other plugins and more resilient overall. The migration from the old system happens automatically in the background—you shouldn’t even notice it, except in the way things just work better.

There’s also a new “Advanced Settings” panel for those who like to tinker. In version 3.1.0, I added dynamic model introspection, which means Gemini Scribe now knows what the model it’s talking to is actually capable of. If you’re using a Gemini model that supports temperature or top-p adjustments, the plugin will surface those controls and tune their ranges appropriately. Defaults are shown, sliders are adjusted per-model, and you get more precise control without the guesswork.

None of these changes happened overnight. They came out of weeks of using the plugin, noticing friction, and wondering how to make things feel lighter. I’ve also spent a fair bit of time fixing bugs, adding retry logic for occasional API hiccups, and sanding off the rough edges that show up only after hours of use. This version is faster, smarter, and more comfortable to live in.

There’s still more to come. Now that the architecture is solid and the foundation is in place, I’m starting to explore ways to make Gemini Scribe even more integrated with your notes—tighter context handling, more intelligent follow-ups, and better tools for shaping long-form writing. But that’s a story for another day.

For now, if you’ve been using Gemini Scribe, update to the latest version from the community plugins tab and try out the new features. And if you’ve got ideas, feedback, or just want to follow along as things evolve, come join the conversation on GitHub. I’d love to hear what you think.

Gemini Scribe Update: Let’s Talk About How Your Chat History is Now Supercharged!

Hey everyone! If you’re using the Gemini Scribe plugin for Obsidian, you’re already experiencing the power of having Google’s Gemini AI right inside your notes. It’s a fantastic way to boost your note-taking and content creation. And guess what? I’ve just rolled out a major update that makes things even better!

Major Changes: A New Way to Store Your Chat History

The biggest change in this update is how Gemini Scribe handles your chat history. I’ve moved away from storing it in a database and switched to using Markdown files instead. This means your chat history now lives right alongside your notes, making your data more portable and easier to back up. I’m also introducing a new system where each note’s chat history is stored in its own separate file within the gemini-scribe folder, which will keep your Obsidian vault nice and tidy. These history files are automatically linked to the notes they came from, providing better context and making navigating your information a breeze.
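The one-file-per-note layout can be sketched as a simple path mapping. The `gemini-scribe` folder name comes from the post itself; the exact naming scheme below is my assumption, shown only to illustrate the idea of giving each note a unique, predictable history file.

```typescript
// Illustrative sketch: derive a history-file path for a given note.
// The folder name is from the post; the naming scheme is an assumption.

const HISTORY_FOLDER = "gemini-scribe";

function historyPathForNote(notePath: string): string {
  // Drop the .md extension and flatten folder separators into dashes,
  // so every note in the vault maps to exactly one history file.
  const base = notePath.replace(/\.md$/, "").replace(/[\\/]/g, "-");
  return `${HISTORY_FOLDER}/${base}-history.md`;
}
```

Because the mapping is deterministic, the plugin can always find (or recreate) the right history file for a note, which is also what makes automatic linking between a note and its history straightforward.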

Cool New Features and Improvements:

Don’t worry about a thing! The plugin will automatically move your existing chat history from the old database to the new Markdown files. This happens behind the scenes, requires no effort from you, and ensures that none of your existing chat history is lost in the process. I’ve also added two new commands to give you more control:

  • “Migrate Database History to Markdown” — manually trigger the migration of your chat history if you ever need to.
  • “Clear Old Database History” — safely remove the old database once you’ve confirmed that everything migrated successfully.
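The migration itself boils down to grouping old database entries by note and writing one Markdown document per note. The sketch below is a hedged illustration of that shape: the `OldHistoryEntry` interface and the rendered Markdown format are assumptions, not the plugin's actual schema.

```typescript
// Hedged sketch of a database-to-Markdown history migration.
// The entry shape and output format are assumptions for illustration.

interface OldHistoryEntry {
  notePath: string;       // note the conversation belongs to
  role: "user" | "model"; // who said it
  text: string;           // message body
}

// Render one entry as a Markdown section.
function renderEntry(e: OldHistoryEntry): string {
  return `## ${e.role}\n\n${e.text}\n\n`;
}

// Group old entries by note and produce one Markdown document per note,
// mirroring the "one history file per note" layout described above.
function migrate(entries: OldHistoryEntry[]): Map<string, string> {
  const files = new Map<string, string>();
  for (const e of entries) {
    const prev = files.get(e.notePath) ?? "";
    files.set(e.notePath, prev + renderEntry(e));
  }
  return files;
}
```

Note that this transformation never touches the source data: it only produces new files, which is exactly why the old database can safely stick around until you choose to clear it.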

I’ve also made some technical improvements to make everything run more smoothly: improved history file management, automatic cleanup of orphaned history files, more robust history file naming, and better error handling and recovery. I want this update to be as seamless as possible for you. You’ll get clear notifications about the migration status, so you’ll always know what’s going on. If the automatic migration doesn’t work for some reason, you can run it manually, and you can verify that everything migrated correctly before you clear out the old database.

I’ve also taken care to ensure backward compatibility: your old database data sticks around until you explicitly remove it, the manual migration option is always there as a fallback, and all of your existing chat history is preserved during the transition. Finally, I’ve fixed a few pesky bugs: history files are now handled correctly when you rename notes, error handling for history operations is improved, and those rare edge cases during history migration are better covered.

A Few Important Notes:

  • This update changes how your chat history is stored.
  • The migration process is automatic and safe.
  • Please verify that your history has been properly migrated before clearing the old database.
  • The old database will be preserved until you explicitly clear it using the new command.
  • This version also includes access to the new Gemini 2.5 Pro model!

I’m excited about these changes, and I encourage you to update to the latest version of Gemini Scribe to experience the improvements firsthand. As always, I value your feedback and suggestions as I continue to make the plugin even better. Let me know what you think!