Understanding datasets in Patronus AI
What are datasets?
A dataset is a collection of test examples used to evaluate your LLM application. Each example typically includes an input (such as a question or prompt) and, optionally, an expected output, context, or metadata.
Datasets let you systematically test your AI system against real-world scenarios to measure quality, safety, and performance.
Dataset schema
Datasets in Patronus use a standard schema with these fields:
- task_input: The main input to your LLM (required)
- task_output: The expected or actual output from your LLM
- task_context: Additional context like retrieved documents
- gold_answer: Reference answer for comparison
- system_prompt: System message or instructions
- tags: Labels for organizing examples
- sid: Sample ID for tracking specific examples
- task_metadata: Additional custom fields
You can include any of these fields in your dataset, and you can also add custom fields that your tasks or evaluators can access.
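As a concrete illustration, a single dataset record using the fields above might look like the following sketch. The exact value types (e.g. whether `tags` is a mapping or a list) are assumptions here, and the `difficulty` key is a hypothetical custom field:

```python
# A minimal sketch of one dataset record using the Patronus schema.
# Value types are illustrative assumptions, not a strict spec.
record = {
    "sid": "example-001",                       # sample ID for tracking
    "system_prompt": "You are a helpful financial assistant.",
    "task_context": ["Q3 revenue was $4.2M, up 12% year over year."],
    "task_input": "What was the revenue growth in Q3?",   # required field
    "gold_answer": "Revenue grew 12% year over year in Q3.",
    "tags": {"domain": "finance"},              # labels for organization
    "task_metadata": {"difficulty": "easy"},    # hypothetical custom field
}
```

Only `task_input` is required; the other keys can be omitted per record.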
How to get datasets
There are two main ways to create or obtain datasets:
Upload existing data
Upload your own datasets via the UI or load them programmatically:
- Supported formats: CSV, JSONL, pandas DataFrames, or Python lists
- Size limit: 30,000 rows via UI upload
- Field mapping: Map your column names to Patronus schema fields
See uploading datasets for details.
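For programmatic preparation, the field-mapping step can be sketched with pandas: rename your own column names onto the schema fields, then export to JSONL (one record per line) for upload. The source column names (`question`, `answer`) are hypothetical:

```python
import pandas as pd

# Hypothetical source data whose column names don't match the schema.
df = pd.DataFrame({
    "question": ["What is the capital of France?"],
    "answer": ["Paris"],
})

# Map your column names onto the Patronus schema fields.
mapping = {"question": "task_input", "answer": "gold_answer"}
df = df.rename(columns=mapping)

# Write JSONL: one JSON object per line, ready for upload.
df.to_json("dataset.jsonl", orient="records", lines=True)
```

The same rename-then-export pattern works for CSV sources loaded with `pd.read_csv`.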
Use off-the-shelf datasets
Patronus provides pre-built datasets for common use cases:
- Safety datasets: PII detection, toxic content, OWASP security tests
- Benchmarks: FinanceBench, HaluBench
- Domain-specific: Financial, legal, and more
Each dataset contains curated examples you can download and use immediately.
See off-the-shelf datasets for the complete list.
Next steps
- Learn how to upload datasets
- Understand using datasets in experiments
- Explore off-the-shelf datasets
- Work with large datasets
