Unlocking Gemini Scribe: Essential Built-In Skills Explained

The feature that became Bundled Skills started with a GitHub issues page.

I wrote and maintain Gemini Scribe, an Obsidian plugin that puts a Gemini-powered agent inside your vault. Thousands of people use it, and they have questions. People would open discussions and issues asking how to configure completions, how to set up projects, what settings were available. I was answering the same questions over and over, and it hit me: the agent itself should be able to answer these. It has access to the vault. It can read files. Why am I the bottleneck for questions about my own plugin?

So I built a skill. I took the same documentation source that powers the plugin’s website, packaged it up as a set of instructions the agent could load on demand, and suddenly users could just ask the agent directly. “How do I set up completions?” “What settings are available?” The agent would pull in the right slice of documentation and give a grounded answer. The docs on the web and the docs the agent reads are built from the same source. There is no separate knowledge base to keep in sync.

That first skill opened a door. I was already using custom skills in my own vault to improve how the agent worked with Bases and frontmatter properties. Once I had the bundled skills mechanism in place, I started looking at those personal skills differently. The ones I had built for myself around Obsidian-specific tasks were not just useful to me. They would be useful to anyone running Gemini Scribe. So I started migrating them from my vault into the plugin as built-in skills.

The latest version of Gemini Scribe ships with four built-in skills. In a future post I will walk through how to create your own custom skills, but first I want to explain what ships out of the box and why this approach works.

Four Skills Out of the Box

That first skill became gemini-scribe-help, and it is still the one I am most proud of conceptually. The plugin’s own documentation lives inside the same skill system as everything else. No special case, no separate knowledge base. The agent answers questions about itself using the same mechanism it uses for any other task.

The second skill I built was obsidian-bases. I wanted the agent to be good at creating Bases (Obsidian’s take on structured data views), but it kept getting the configuration wrong. Filters, formulas, views, grouping: there is a lot of surface area and the syntax is particular. So I wrote a skill that guides the agent through creating and configuring Bases from scratch, including common patterns like task trackers and project dashboards. Instead of me correcting the agent’s output every time, I describe what I want and the agent builds it right the first time.

Next came audio-transcription. This one has a fun backstory. Audio transcription was one of the oldest outstanding bugs in the repo. People wanted to use it with Obsidian’s native audio recording, but the results were poor. In this release, fixes around binary file uploads meant the model could finally receive audio files properly. Once that was working, I realized I did not need to write any more code to get good transcriptions. I just needed to give the agent good instructions. The skill guides it through producing structured notes with timestamps, speaker labels, and summaries. It turns a messy audio file into a clean, searchable note, and the fix was not code but context.

The fourth is obsidian-properties. Working with note properties (the YAML frontmatter at the top of every Obsidian note) sounds trivial until you are doing it across hundreds of notes. The agent would make inconsistent choices about property types, forget to use existing property names, or create duplicates. This skill makes it reliable at creating, editing, and querying properties consistently, which matters enormously if you are using Obsidian as a serious knowledge management system.
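To make the duplicate-property failure mode concrete, here is a rough Python sketch of the kind of audit that exposes it. It is not part of Gemini Scribe and not what the skill itself does (the skill is instructions, not code). The function names, the sample notes, and the assumption that frontmatter has already been parsed into dictionaries are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(key):
    """Collapse case, hyphens, underscores, and spaces so near-duplicates collide."""
    return re.sub(r"[-_\s]", "", key).lower()


def find_duplicate_properties(notes):
    """Group property names that differ only in case or separators.

    `notes` maps a note path to its already-parsed frontmatter dict
    (hypothetical input shape, chosen for illustration).
    """
    variants = defaultdict(set)
    for properties in notes.values():
        for key in properties:
            variants[normalize(key)].add(key)
    # Keep only the groups where more than one spelling is in use.
    return {k: sorted(v) for k, v in variants.items() if len(v) > 1}


# Three notes that each spell the "same" properties a little differently.
notes = {
    "a.md": {"due-date": "2025-01-10", "status": "open"},
    "b.md": {"due_date": "2025-02-01", "Status": "done"},
    "c.md": {"dueDate": "2025-03-05", "status": "open"},
}
duplicates = find_duplicate_properties(notes)
```

Across three notes the sample already contains three spellings of one date property and two of `status`, which is exactly the drift that becomes painful across hundreds of notes.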

The pattern behind all four is the same. I watched the agent struggle with something specific to Obsidian, and instead of accepting that as a limitation of the model, I wrote a skill to fix it.

Why Not Just Use the System Prompt

You might be wondering why I did not just shove all of this into the system prompt. I wrote about this problem in detail in “Managing the Agent’s Attention,” but the short version is that system prompts are a “just-in-case” strategy. You load up the agent with everything it might need at the start of the conversation, and as you add more instructions, they start competing with each other for the model’s attention. Researchers call this the “Lost in the Middle” problem: models pay disproportionate attention to the beginning and end of their context, and everything in between gets diluted. If I packed all four skills’ worth of instructions into the system prompt, each one would make the others less effective. Every new skill I added would degrade the ones already there.

Skills avoid this entirely. The agent always knows which skills are available (it gets a short name and description for each one), but only loads the full instructions when it actually needs them. When a skill activates, its instructions land in the most recent part of the conversation, right before the model starts reasoning. Only one skill’s instructions are competing for attention at a time, and they are sitting in the highest-attention position in the context window.
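The load-on-demand behavior can be sketched in a few lines of Python. This is an illustration of the idea, not Gemini Scribe's actual implementation: the `Skill` and `SkillRegistry` classes, the `catalog`, `activate`, and `build_context` names, and the placeholder instruction strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical in-memory skill: short metadata plus full instructions."""
    name: str
    description: str
    instructions: str


class SkillRegistry:
    """Illustrative registry; not the plugin's real API."""

    def __init__(self, skills):
        self._skills = {skill.name: skill for skill in skills}

    def catalog(self):
        """Short listing the agent always sees: one line per skill."""
        return "\n".join(
            f"- {s.name}: {s.description}" for s in self._skills.values()
        )

    def activate(self, name):
        """Return the full instructions for one skill, only when requested."""
        return self._skills[name].instructions


def build_context(registry, history, active_skill=None):
    """Assemble messages: catalog first, activated instructions last."""
    messages = [f"Available skills:\n{registry.catalog()}"]
    messages.extend(history)
    if active_skill is not None:
        # The full instructions land at the most recent position.
        messages.append(registry.activate(active_skill))
    return messages


registry = SkillRegistry([
    Skill("obsidian-bases", "Create and configure Bases.",
          "FULL BASES INSTRUCTIONS"),
    Skill("audio-transcription", "Transcribe audio into notes.",
          "FULL TRANSCRIPTION INSTRUCTIONS"),
])
context = build_context(
    registry, ["user: build me a task tracker"], "obsidian-bases"
)
```

In this sketch the catalog costs one line per skill no matter how long the instructions are, and only the activated skill's instructions are appended at the end of the context.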

There is a second benefit that surprised me. Because skills activate through the activate_skill tool call, you can watch the agent load them. In the agent session, you see exactly when a skill is activated and which one it chose. This gives you something that system prompts never do: observability. If the agent is not following your instructions, you can check whether it actually activated the skill. If it activated the skill but still got something wrong, you know the problem is in the skill’s instructions, not in the agent’s attention. That feedback loop is what lets you iterate and improve your skills over time. You are no longer guessing whether the agent read your instructions. You can see it happen.
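That debugging loop can be written down as a small check over a session's tool calls. The sketch below is in Python and rests on assumptions: only the `activate_skill` tool name comes from the plugin, while the shape of the log entries, the `name` argument, and the `diagnose` helper are hypothetical.

```python
def diagnose(tool_calls, expected_skill):
    """Classify a bad result using the session's tool-call log.

    `tool_calls` is a hypothetical list of dicts such as
    {"tool": "activate_skill", "args": {"name": "obsidian-bases"}}.
    """
    activated = [
        call["args"].get("name")
        for call in tool_calls
        if call["tool"] == "activate_skill"
    ]
    if expected_skill not in activated:
        # The agent never loaded the skill, so its instructions were not read.
        return "skill never activated: revisit its name and description"
    # The skill was loaded, so the instructions themselves need work.
    return "skill activated: revisit the skill's instructions"


session = [
    {"tool": "read_file", "args": {"path": "Projects/Tracker.md"}},
    {"tool": "activate_skill", "args": {"name": "obsidian-bases"}},
]
result = diagnose(session, "obsidian-bases")
```

The two branches correspond to the two cases above: either the agent never picked the skill, which points at its name and description, or it did and the instructions need another pass.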

Skills follow the open agentskills.io specification, and this matters more than it might seem. We have seen significant standardization around this spec across the industry in 2026. That means skills are portable. If you have been using skills with another agent, you can bring them into Gemini Scribe and they will work. If you build skills in Gemini Scribe, you can take them with you. They are not a proprietary format tied to one tool. They are Markdown files with a bit of YAML frontmatter, designed to be human-readable, version-controllable, and portable across any agent that supports the spec.
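To show how little there is to the format, here is a minimal Python sketch that splits a skill file into its frontmatter and its Markdown body. The `meeting-notes` skill is made up for illustration, the `name` and `description` fields are assumed from the short name and description mentioned earlier, and the parser only handles flat `key: value` lines; a real loader would use a full YAML parser and follow the spec itself.

```python
# A hypothetical skill file: YAML frontmatter followed by Markdown instructions.
SKILL_MD = """\
---
name: meeting-notes
description: Turn raw meeting notes into a structured summary.
---
# Meeting notes

1. Identify attendees and decisions.
2. List action items with owners.
"""


def parse_skill(text):
    """Split a skill file into (metadata, body).

    Handles only flat `key: value` frontmatter, which is an
    assumption made to keep the sketch dependency-free.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter")
    end = lines.index("---", 1)  # position of the closing delimiter
    metadata = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


metadata, body = parse_skill(SKILL_MD)
```

Because the whole skill is a text file, it diffs cleanly in version control and can be copied between tools without conversion.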

What Comes Next

The four built-in skills are just the beginning. When I decide what to build next, I think about skills in four categories. First, there are skills that give the agent domain knowledge about Obsidian itself, things like Bases and properties where the model’s general training is not specific enough. Second, there are skills that help the agent use Gemini Scribe’s own tools effectively. The plugin has capabilities like deep research, image generation, semantic search, and session recall, and each of those benefits from a skill that teaches the agent when and how to use it well. Third, there are skills that bring entirely new capabilities to the agent, like audio transcription. And fourth, there is user support: the help skill that started this whole process, making sure people can get answers without leaving their vault.

The next version of Gemini Scribe will add built-in skills for semantic search, deep research, image generation, and session recall. The skills system is also designed to be extended by users. In a future post I will walk through creating your own custom skills, both by hand and by asking the agent to build them for you.

For now, the takeaway is simple. A general-purpose model knows a lot, but it does not know your tools. When I watched the agent struggle with Obsidian Bases or produce flat transcripts or make a mess of note properties, I could have accepted those as limitations. Instead, I wrote skills to close the gap. The model’s knowledge is broad. Skills make it deep.

Letters from Silicon Valley

In “Letters from Silicon Valley,” I write about the convergence of technology, life, and creativity, sharing insights from my extensive experience in the tech industry along with my personal adventures in woodworking, music, and beyond.
