Custom Error Taxonomies with Percival
For developers and product teams working with agentic systems, one of the biggest challenges is observability: teams need to know how and when their agents are failing. Often these failures are domain-specific, requiring a custom taxonomy of errors.
Patronus provides Percival, an agentic debugger with a flexible error taxonomy. In this walkthrough, we’ll build a simple mock sales Q&A chatbot with a custom error taxonomy. By the end of this walkthrough, you’ll know how to:
- Define a custom error taxonomy with Percival
- Trace an agent and log to the Patronus platform
- Use Percival to catch errors in your taxonomy
0. Define an Error Taxonomy
In the Traces tab, select your relevant project. Then navigate to the Taxonomy tab.
From here, you can view the base taxonomy Percival uses to categorize errors. You can extend this by clicking “Define New.”
Within a taxonomy you can define:
- Errors: specific ways your agent may fail that you intend to catch
- Categories: groups of similar errors. These don’t affect how Percival analyzes your trace, but help keep the taxonomy organized (similar to folders in a file system).
When defining an error, provide a short description, similar to how you would describe pass criteria for a judge. You can view examples by expanding the Patronus AI Error Taxonomy.
Let’s define a custom error for our sales Q&A agent. We wouldn’t want the agent making promises about the roadmap or unreleased features, so we’ll add a custom error to catch this:
- Name: Roadmap Promise
- Description: The final output discloses product roadmap details or unreleased features, makes commitments about future functionality, or promises functionality that does not yet exist.
Once saved, the new error is ready to use on real traces!
1. Set Up Mock Agent Environment
Now we can develop our agent. We start by importing the necessary packages.
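A minimal setup sketch, assuming the Patronus Python SDK's `patronus.init()` and `traced` decorator; the project name below is hypothetical, and the exact init parameters may differ across SDK versions:

```python
# Initialize the Patronus SDK so traces land in the project that holds
# our custom taxonomy. The API key can also be supplied via the
# PATRONUS_API_KEY environment variable.
import patronus
from patronus import traced

patronus.init(
    project_name="sales-qa-chatbot",  # hypothetical project name
)
```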
We’ll also define a simple one-element golden dataset. Notice that the gold answer for this question does not commit to a specific roadmap date. Instead, it defers to the sales team and avoids roadmap commitments.
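A sketch of what that dataset might look like; the question, gold answer, and field names (`task_input`, `gold_answer`) are illustrative and should match whatever schema your experiment expects:

```python
# One-element golden dataset. The gold answer deliberately avoids any
# roadmap commitment and defers to the sales team instead.
golden_dataset = [
    {
        "task_input": "Will the analytics dashboard support custom reports next quarter?",
        "gold_answer": (
            "Custom reporting is not part of the current release. For questions about "
            "upcoming functionality, please reach out to your sales representative."
        ),
    }
]
```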
2. Build Mock Agent and Run a Trace
Next, we'll build a mock agent with a couple of tools to emulate answering the user's question. Each tool is decorated with the Patronus @traced decorator, and we wrap them in a simple task function.
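Here is one way that mock agent could look. The tool names and canned strings are made up for illustration, and the drafted answer intentionally makes a roadmap promise so Percival has something to flag:

```python
@traced()
def lookup_product_docs(question: str) -> str:
    # Mock retrieval tool: returns a canned documentation snippet.
    return "Current release: standard dashboards, CSV export, SSO."

@traced()
def draft_answer(question: str, context: str) -> str:
    # Mock generation tool: a real agent would call an LLM here.
    # The canned reply intentionally promises an unreleased feature.
    return "Yes! Custom reports are planned for next quarter."

@traced()
def sales_qa_agent(question: str) -> str:
    # Simple task function that chains the tools end to end.
    context = lookup_product_docs(question)
    return draft_answer(question, context)
```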
With our mock agent defined, we can run an experiment and log the trace to the Patronus UI.
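A sketch of running it through the Patronus experiments API, assuming `run_experiment` and a row object exposing `task_input`; check your SDK version's docs for the exact signature:

```python
from patronus.experiments import run_experiment

def task(row, **kwargs):
    # Adapter between an experiment row and our mock agent.
    return sales_qa_agent(row.task_input)

# Runs the agent over the golden dataset and logs the resulting trace
# to the Patronus UI for analysis.
run_experiment(
    dataset=golden_dataset,
    task=task,
)
```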
3. Get Percival Error Insights
From within the UI, click into the trace and select “Analyze with Percival.” We can see that Percival successfully surfaces our custom error:
4. Improve the Agent
Percival also provides specific prompt fixes to improve the agent. These prompt recommendations can be applied directly to prevent future roadmap promises.
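For example, a fix along these lines might tighten the agent's system prompt. This is a hypothetical guardrail for illustration, not Percival's literal output, which will be specific to your trace:

```python
# Hypothetical system-prompt guardrail applied after reviewing Percival's suggestion.
SYSTEM_PROMPT = (
    "You are a sales Q&A assistant. Answer only from currently released, "
    "documented functionality. Never discuss the product roadmap, unreleased "
    "features, or future commitments; direct questions about upcoming "
    "functionality to the customer's sales representative."
)
```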
For more details on implementing these fixes, see the related guide: Tracing and Debugging Agents with Percival.
Wrap Up
This flow — define taxonomy → trace agent → analyze with Percival → improve prompts — is the standard loop for catching domain-specific failures with Patronus.
- Custom error taxonomies let you encode the exact failure modes your team cares about.
- Tracing ensures you have visibility into how your agent is behaving in practice.
- Percival analysis surfaces errors and provides actionable fixes.
- Prompt improvements close the loop, making agents safer and more reliable over time.