
How Drasi Uses AI Agents to Catch Documentation Bugs

Published: 2026-05-04 16:27:23 | Category: Open Source

Introduction

For early-stage open-source projects, the Getting started guide is often the first real interaction a developer has with the project. If a command fails, an output doesn't match, or a step is unclear, most users won't file a bug report—they will just move on. That's a huge risk for any project, especially one that wants to grow its community.

[Image: How Drasi Uses AI Agents to Catch Documentation Bugs. Source: azure.microsoft.com]

Drasi, a CNCF sandbox project that detects changes in your data and triggers immediate reactions, is maintained by a small team of four engineers in Microsoft Azure's Office of the Chief Technology Officer. The team moves fast: it maintains comprehensive tutorials, but it ships code faster than it can manually retest them.

The Incident That Sparked Change

The team didn't realize how big this gap was until late 2025, when GitHub updated its Dev Container infrastructure, bumping the minimum Docker version. The update broke the Docker daemon connection—and every single tutorial stopped working. Because they relied on manual testing, they didn't immediately know the extent of the damage. Any developer trying Drasi during that window would have hit a wall.

This incident forced a realization: with advanced AI coding assistants, documentation testing can be reframed as a monitoring problem.

The Problem: Why Does Documentation Break?

Documentation usually breaks for two main reasons:

The Curse of Knowledge

Experienced developers write documentation with implicit context. When they write "wait for the query to bootstrap", they know to run drasi list query and watch for the Running status, or even better—run the drasi wait command. A new user has no such context. Neither does an AI agent. They read the instructions literally and don't know what to do. They get stuck on the how, while the docs only document the what.
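
To see how much is packed into that one sentence, here is a minimal Python sketch of the step the tutorial leaves implicit. It is not from Drasi's docs or tooling: it assumes, purely for illustration, that drasi list query prints a line containing the query name and a Running status once bootstrapping finishes, and in practice the drasi wait command covers this case directly.

    # Illustrative only: the implicit "wait for the query to bootstrap" step made
    # explicit. Assumes `drasi list query` prints a line containing the query name
    # and the word "Running" once bootstrapping finishes; the real output format may
    # differ, and `drasi wait` handles this without any polling loop.
    import subprocess
    import time

    def wait_for_query(name: str, timeout_s: int = 120) -> None:
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            result = subprocess.run(
                ["drasi", "list", "query"], capture_output=True, text=True
            )
            if any(name in line and "Running" in line
                   for line in result.stdout.splitlines()):
                return
            time.sleep(5)
        raise TimeoutError(f"query {name!r} did not reach Running in {timeout_s}s")

Nothing in the original sentence tells a newcomer, or an agent, that this loop is what "wait" means.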

Silent Drift

Documentation doesn't fail loudly like code does. When you rename a configuration file in your codebase, the build fails immediately. But when your documentation still references the old filename, nothing happens. The drift accumulates silently until a user reports confusion.

This is compounded for tutorials like Drasi's, which spin up sandbox environments with Docker, k3d, and sample databases. When any upstream dependency changes—a deprecated flag, a bumped version, or a new default—the tutorials can break silently.

The Solution: Agents as Synthetic Users

To solve this, the team treated tutorial testing as a simulation problem. They built an AI agent that acts as a "synthetic new user."


This agent has three critical characteristics:

  • It is naïve: It has no prior knowledge of Drasi—it knows only what is explicitly written in the tutorial.
  • It is literal: It executes every command exactly as written. If a step is missing, it fails.
  • It is unforgiving: It verifies every expected output. If the doc says "You should see 'Success'" and the CLI returns silently, the agent flags the mismatch and fails fast (a minimal sketch of this check follows the list).
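
To make the "literal" and "unforgiving" behaviours concrete, here is a hypothetical Python sketch of the check performed on a single tutorial step. TutorialStep and run_step are invented names for illustration; they are not part of Drasi's tooling or of GitHub Copilot CLI.

    # Hypothetical sketch of the "literal" and "unforgiving" behaviours described
    # above; TutorialStep and run_step are invented names, not Drasi or Copilot APIs.
    import subprocess
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class TutorialStep:
        command: str                    # the command exactly as written in the doc
        expected_output: Optional[str]  # text the doc says the user "should see"

    def run_step(step: TutorialStep) -> None:
        # Literal: run the command verbatim, with no attempt to fill in missing context.
        result = subprocess.run(
            step.command, shell=True, capture_output=True, text=True
        )
        if result.returncode != 0:
            raise RuntimeError(f"step failed: {step.command}\n{result.stderr}")
        # Unforgiving: if the doc promises an output, it must actually appear.
        if step.expected_output and step.expected_output not in result.stdout:
            raise AssertionError(
                f"expected {step.expected_output!r} in output of {step.command!r}"
            )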

The Stack: GitHub Copilot CLI and Dev Containers

The team built a solution using GitHub Copilot CLI and Dev Containers. They set up a continuous integration pipeline that runs the agent against the tutorials on every code change. The agent reads the tutorial markdown, translates steps into commands, runs them in a clean Dev Container environment, and compares expected outputs with actual results. Any mismatch triggers a build failure, notifying the team immediately.
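
The article does not publish the pipeline itself, so the following is a simplified, deterministic sketch of its core loop: pull the fenced shell blocks out of a tutorial file and run them in order, failing the build on the first error. Drasi's actual setup drives GitHub Copilot CLI inside a clean Dev Container so the agent can follow prose instructions as well as code blocks; the script structure and file path here are hypothetical.

    # Simplified sketch of the CI check's core loop; not Drasi's actual pipeline.
    # It only runs fenced shell blocks, whereas the real agent also interprets prose.
    import re
    import subprocess
    import sys

    SHELL_BLOCK = re.compile(r"```(?:bash|sh|shell)\n(.*?)```", re.DOTALL)

    def check_tutorial(path: str) -> int:
        text = open(path, encoding="utf-8").read()
        for block in SHELL_BLOCK.findall(text):
            for command in block.strip().splitlines():
                print(f"$ {command}")
                if subprocess.run(command, shell=True).returncode != 0:
                    print(f"tutorial step failed: {command}", file=sys.stderr)
                    return 1  # any failure breaks the build and notifies the team
        return 0

    if __name__ == "__main__":
        # e.g. python check_tutorial.py docs/getting-started.md (hypothetical path)
        sys.exit(check_tutorial(sys.argv[1]))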

Results and Lessons Learned

Since implementing this approach, Drasi has caught several documentation bugs before they reached users. The agent identified missing steps, incorrect commands, and outdated assumptions. The team now treats documentation as code—versioned, tested, and monitored.

The key takeaway: AI agents can act as tireless reviewers for your documentation. By simulating a new user, they surface bugs that manual reviews miss. For fast-moving open-source projects, this approach turns documentation maintenance from a reactive chore into a proactive, data-driven process.

If you're maintaining guides for an evolving project, consider building your own synthetic user. It's like having a QA team that never sleeps—and never assumes prior knowledge.