Managing Datasets
Datasets are the entry point to LLM evaluation. A dataset can consist of inputs (prompts), gold labels (correct answers), and additional metadata.
There are multiple ways to use datasets in Patronus:
- Patronus datasets: These are off-the-shelf datasets you can use to get started, covering general use cases such as toxic inputs.
- Upload datasets: You can upload your own datasets to the platform and start using them in evaluations.
- Local datasets: Our experimentation framework supports ingesting data locally.
Upload Your Own Datasets
You can use Patronus to manage your suite of test datasets. We accept .jsonl files that include the following string fields:
- evaluated_model_input: The prompt provided to your LLM agent, or the input to a task defined in the experimentation framework
- evaluated_model_gold_answer (optional): The gold answer or expected output
Below is an example JSONL file you might want to upload for a medical help chatbot:
{"evaluated_model_input": "How do I get a better night's sleep?"}
{"evaluated_model_input": "What are common over the counter sleep aids?"}
{"evaluated_model_input": "What are common soccer injuries?"}
{"evaluated_model_input": "Does exercising before sleep help you sleep better?"}
{"evaluated_model_input": "Should I drink boiling water for a sore throat?"}
{"evaluated_model_input": "What does sleep insonmia mean?"}
{"evaluated_model_input": "Where can I find a good doctor?"}
{"evaluated_model_input": "What is the different cycles of sleep?"}
Once you upload the dataset, you'll see it in the Datasets view along with our off-the-shelf datasets.
You can now use this dataset in the experimentation framework or in evaluation runs on the platform. To run an experiment with this dataset, reference it by its id field, e.g.:
# `cli` is an initialized Patronus client and `task` is your task function.
medical_chatbot_dataset = cli.remote_dataset("d-jxrisvlp1hgf786h")

cli.experiment(
    "Project Name",
    data=medical_chatbot_dataset,
    task=task,
    evaluators=[evaluator],  # Replace with your evaluators
)
See working with datasets for more information.
Patronus Managed Datasets
We provide a number of off-the-shelf sample datasets that have been vetted for quality. These test datasets consist of 10-100 samples and assess agents on general use cases, including PII leakage and performance in real-world domains.
We currently support the following off-the-shelf datasets:
- pii-questions-1.0.0: PII-eliciting prompts
- toxic-prompts-1.0.0: Toxic prompts that an LLM might respond offensively to
- legal-confidentiality-1.0.0: Legal prompts that check whether an LLM understands the concept of confidentiality in legal document clauses
- model-origin-1.0.0-small: OWASP security assessment checking whether LLMs leak information about model origins
- prompt-injections-1.0.0-small: Prompt injection tests
You can download any of these datasets with Actions -> Download Dataset.
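These managed datasets can also be referenced from the experimentation framework in the same way as an uploaded dataset. A minimal sketch, assuming remote_dataset accepts the dataset's identifier (copy the actual id or name from the Datasets view) and reusing the cli, task, and evaluator objects from the example above:

pii_dataset = cli.remote_dataset("pii-questions-1.0.0")  # Identifier is illustrative

cli.experiment(
    "Project Name",
    data=pii_dataset,
    task=task,
    evaluators=[evaluator],  # Replace with your evaluators
)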
We are actively working on providing more datasets for additional use cases. If there are off-the-shelf datasets you'd like to see added to this list, please reach out to us!
Local Datasets
You can import and use locally stored datasets with the Patronus Python SDK. Local datasets can be stored in .csv or .jsonl format, or downloaded from HuggingFace or S3. To use a locally stored dataset, map its columns to the following fields:
- evaluated_model_system_prompt (optional): The system prompt provided to the model, setting the context or behavior for the model's response.
- evaluated_model_retrieved_context: A list of context strings (list[str]) retrieved and provided to the model as additional information. This field is typically used in a Retrieval-Augmented Generation (RAG) setup, where the model's response depends on external context or supporting information fetched from a knowledge base or similar source.
- evaluated_model_input: Typically a user input provided to the model that it must respond to.
- evaluated_model_output: The output generated by the model.
- evaluated_model_gold_answer: The expected or correct answer that the model output is compared against during evaluation. This field is used to assess the accuracy and quality of the model's response.
For the full set of accepted parameters and examples of how to use local datasets in the Python SDK, see the Datasets section in the Experimentation Framework.
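As a rough illustration of the mapping, the sketch below reads a local CSV with pandas, renames its columns to the field names above, and passes the resulting records to an experiment. It assumes the experiment's data parameter accepts a list of dictionaries keyed by these fields; the file path and column names are illustrative, and cli, task, and evaluator are the same objects used in the earlier example.

import pandas as pd

# Load a locally stored dataset; file and column names are illustrative.
df = pd.read_csv("my_local_dataset.csv")

# Map your own column names onto the fields Patronus expects.
records = df.rename(columns={
    "question": "evaluated_model_input",
    "reference_answer": "evaluated_model_gold_answer",
}).to_dict(orient="records")

cli.experiment(
    "Project Name",
    data=records,
    task=task,
    evaluators=[evaluator],  # Replace with your evaluators
)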