Working with Large Datasets

For larger datasets (more than 30k rows), we recommend importing and using locally stored datasets with the Patronus Python SDK. Local datasets can be stored in .csv or .jsonl format, or downloaded from HuggingFace or S3. To use a locally stored dataset, map its fields to the following schema (see the sketch after this list):

  • evaluated_model_system_prompt (optional): The system prompt provided to the model, setting the context or behavior for the model's response.
  • evaluated_model_retrieved_context: A list of context strings (list[str]) retrieved and provided to the model as additional information. This field is typically used in a Retrieval-Augmented Generation (RAG) setup, where the model's response depends on external context fetched from a knowledge base or similar source.
  • evaluated_model_input: The user input, such as a question or prompt, that the model must respond to.
  • evaluated_model_output: The output generated by the model.
  • evaluated_model_gold_answer: The expected or correct answer that the model output is compared against during evaluation. This field is used to assess the accuracy and quality of the model's response.

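As a minimal sketch of this mapping, the snippet below loads a local .csv file with pandas, renames its columns to the field names listed above, and converts each row into a dictionary. The file name (support_tickets.csv) and the original column names (question, model_answer, reference_answer, context) are hypothetical placeholders; substitute your own. The resulting records can then be passed to the SDK as described in the Datasets section of the Experimentation Framework.

```python
import pandas as pd

# Hypothetical local file and column names; adjust these to your own dataset.
df = pd.read_csv("support_tickets.csv")

# Map your columns onto the field names the SDK expects.
df = df.rename(
    columns={
        "question": "evaluated_model_input",
        "model_answer": "evaluated_model_output",
        "reference_answer": "evaluated_model_gold_answer",
    }
)

# evaluated_model_retrieved_context must be a list of strings per row,
# so wrap a single context column into a one-element list.
df["evaluated_model_retrieved_context"] = df["context"].apply(lambda c: [c])
df = df.drop(columns=["context"])

# A list of dicts keyed by the expected field names, ready to hand to the SDK.
records = df.to_dict(orient="records")
print(records[0])
```

The same approach works for .jsonl files by swapping `pd.read_csv(...)` for `pd.read_json(..., lines=True)`.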
For the full set of accepted parameters and examples of how to use local datasets in the Python SDK, see the Datasets section in the Experimentation Framework.