Concepts
Understanding the key capabilities of Percival
What is Percival?
Percival is Patronus's intelligent agent debugger that automatically analyzes traces from agent workflows to identify issues, suggest optimizations, and help you build better evaluators. It combines automated error detection with conversational AI to make agent development faster and more reliable.
Three core capabilities
Percival offers three distinct but complementary ways to work with your agent traces:
Percival for debugging
Percival analyzes agent traces to detect errors, cluster failures, and recommend prompt fixes automatically. When you click "Analyze with Percival" on any trace, it processes the full execution to identify 20+ failure modes across reasoning errors, system execution issues, and planning problems.
How it works:
- Trace collection: Your agent workflows are traced using OpenTelemetry or the Patronus SDK
- Analysis: Percival ingests the trace and processes all spans
- Error detection: Identifies specific failure modes using its error taxonomy
- Clustering: Groups similar errors together for systematic resolution
- Optimization: Suggests concrete prompt improvements to fix detected issues
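The detection and clustering steps above can be pictured with a small, self-contained sketch. It is not Percival's implementation or API: the `Span` shape, the two detection rules, and the failure-mode names (`tool_call_failure`, `excessive_retries`) are hypothetical stand-ins, chosen only to show how detection against a taxonomy feeds into clustering.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical, simplified span; real traces carry far more detail."""
    name: str
    status: str  # "ok" or "error"
    attributes: dict = field(default_factory=dict)


def detect_failure_modes(span):
    """Toy detection rules standing in for a real error taxonomy."""
    found = []
    if span.status == "error" and span.attributes.get("kind") == "tool":
        found.append("tool_call_failure")
    if span.attributes.get("retries", 0) >= 3:
        found.append("excessive_retries")
    return found


def cluster_by_failure_mode(spans):
    """Group span names under each detected failure mode."""
    clusters = defaultdict(list)
    for span in spans:
        for mode in detect_failure_modes(span):
            clusters[mode].append(span.name)
    return dict(clusters)


# Illustrative trace: four spans, three of which trip at least one rule.
trace = [
    Span("plan", "ok"),
    Span("search_tool", "error", {"kind": "tool", "retries": 3}),
    Span("fetch_tool", "error", {"kind": "tool"}),
    Span("summarize", "ok", {"retries": 4}),
]

clusters = cluster_by_failure_mode(trace)
for mode, span_names in sorted(clusters.items()):
    print(f"{mode}: {span_names}")
```

Grouping by failure mode rather than by span is what makes a systematic fix possible: one prompt change can then be checked against every span in the same cluster.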
Common use cases:
- Debug why agent workflows fail
- Identify inefficient tool usage patterns
- Find coordination issues between multiple agents
- Get actionable prompt fixes for detected errors
See the complete workflow: Tracing and debugging agents with Percival
Percival Chat
Percival Chat is an interactive AI assistant that lets you explore traces and build evaluators through natural conversation. Instead of clicking through spans and logs, you can ask questions like "Why did this trace fail?" or "Help me build an eval for hallucinations" and get immediate, actionable answers.
How it works:
Percival Chat operates at three levels of analysis:
- High-level: Insight-powered analysis across multiple traces, pattern recognition, and trend analysis
- Mid-level: Span analysis with error detection and performance tracking
- Low-level: Pinpoint inspection of individual attributes, logs, and raw data
The system automatically selects the right analysis depth based on your question and switches between specialized agents to provide the most relevant answers.
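To make the three levels concrete, here is a minimal sketch of the kind of question each one answers over the same data. The trace records, field names, and helper functions are hypothetical and exist only for illustration. Percival Chat picks the level for you from a natural-language question, so none of this is code you would write to use it.

```python
from statistics import mean

# Hypothetical trace records; field names are illustrative only.
traces = [
    {"id": "t1", "spans": [
        {"name": "plan", "error": None, "latency_ms": 120,
         "attributes": {"model": "m-small"}},
        {"name": "search", "error": "timeout", "latency_ms": 900,
         "attributes": {"tool": "web"}},
    ]},
    {"id": "t2", "spans": [
        {"name": "plan", "error": None, "latency_ms": 100,
         "attributes": {"model": "m-small"}},
    ]},
]


def high_level(traces):
    """Across traces: what fraction contain at least one failing span?"""
    failed = [t for t in traces if any(s["error"] for s in t["spans"])]
    return len(failed) / len(traces)


def mid_level(trace):
    """Within one trace: which spans failed, and what was the mean latency?"""
    errors = {s["name"]: s["error"] for s in trace["spans"] if s["error"]}
    return errors, mean(s["latency_ms"] for s in trace["spans"])


def low_level(trace, span_name, key):
    """Within one span: read a single raw attribute."""
    for s in trace["spans"]:
        if s["name"] == span_name:
            return s["attributes"].get(key)
    return None


failure_rate = high_level(traces)
span_errors, avg_latency = mid_level(traces[0])
tool = low_level(traces[0], "search", "tool")
print(failure_rate, span_errors, avg_latency, tool)
```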
Common use cases:
- Build domain-specific evaluators by describing success criteria
- Query traces using natural language
- Understand patterns across multiple traces
- Iterate on eval criteria collaboratively
Access Percival Chat:
- Direct: https://chat.patronus.ai/
- From navigation: Click "Chat with Percival" in the main menu
- From a trace: Select a trace → Click "Chat with Percival" in the side panel
See the complete workflow: Build evals with Percival Chat
Custom error taxonomy
Percival detects 20+ failure modes out of the box, and you can extend its error taxonomy with domain-specific ones. Custom taxonomies let you encode the exact failures your team cares about, so that Percival checks traces against your application's own requirements.
How it works:
- Define taxonomy: Navigate to Traces → Taxonomy tab → Click "Define New"
- Create errors: Add specific error types with clear descriptions (write each one the way you would write pass criteria for a judge)
- Organize categories: Group similar errors for better organization
- Trace agent: Run your agent with tracing enabled
- Analyze: Click "Analyze with Percival" to detect your custom errors
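Taxonomies are defined in the UI as described above. Before entering one, it can help to draft it as plain data so the team can review the names and descriptions first. The sketch below is one way to do that, under stated assumptions: the `ErrorType` and `Category` classes, the example errors, and the JSON layout are all hypothetical and do not represent a Patronus import format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ErrorType:
    name: str
    description: str  # written like pass criteria for a judge


@dataclass
class Category:
    name: str
    errors: list = field(default_factory=list)


def validate(categories):
    """Return a list of problems: blank descriptions or duplicate error names."""
    problems, seen = [], set()
    for cat in categories:
        for err in cat.errors:
            if not err.description.strip():
                problems.append(f"{err.name}: missing description")
            if err.name in seen:
                problems.append(f"{err.name}: duplicate name")
            seen.add(err.name)
    return problems


# Hypothetical draft: two categories, one error type each.
taxonomy = [
    Category("Compliance", [
        ErrorType(
            "phi_disclosure",
            "The agent reveals protected health information to an unverified user.",
        ),
    ]),
    Category("Policy", [
        ErrorType(
            "unapproved_discount",
            "The agent offers a discount that is not in the approved price list.",
        ),
    ]),
]

problems = validate(taxonomy)
draft = json.dumps([asdict(c) for c in taxonomy], indent=2)
print(problems or "taxonomy draft looks consistent")
print(draft)
```

Descriptions phrased as observable behavior ("the agent reveals...", "the agent offers...") tend to be easier to detect consistently than abstract labels.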
When to use custom taxonomies:
- Domain-specific applications (medical, legal, financial, sales)
- Compliance requirements (HIPAA, PCI-DSS, regulatory constraints)
- Product-specific failure modes
- Company policy violations
See the complete workflow: Custom error taxonomies with Percival
To view the complete list of errors Percival can detect, see: Error taxonomy
Framework integrations
Percival works with popular agent frameworks through OpenTelemetry and OpenInference tracing conventions:
- LangChain - Full tracing support
- CrewAI - Multi-agent coordination tracking
- OpenAI Agents - Assistant API integration
- Pydantic AI - Structured output tracking
- smolagents - Lightweight agent monitoring
- Custom - Agents built directly on the OpenAI or Anthropic clients (compatible with OpenAIInstrumentor and AnthropicInstrumentor)
