
Working with Large Datasets

Guidelines for efficiently processing datasets with more than 30k rows

For larger datasets (> 30k rows), we recommend importing and using locally stored datasets with the Patronus Python SDK. Local datasets can be stored in .csv or .jsonl format, or downloaded from HuggingFace or S3 storage. To use a locally stored dataset, map its fields to the following standard fields (a short example follows the list):

  • system_prompt (optional): The system prompt provided to the model, setting the context or behavior for the model's response.
  • task_context: Additional information or context provided to the model. This can be a string or list of strings, typically used in Retrieval-Augmented Generation (RAG) setups, where the model's response depends on external information that has been fetched from a knowledge base or similar source.
  • task_input: The primary input to the model or task, typically a user query or instruction.
  • task_output: The output generated by the model or task being evaluated.
  • gold_answer: The expected or correct answer that the model output is compared against during evaluation. This field is used to assess the accuracy and quality of the model's response.
  • tags (optional): Key-value pairs for categorizing and filtering samples.
  • task_metadata (optional): Additional structured information about the task.
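
For example, a single line of a .jsonl dataset that already uses these standard field names could look like the following (the values are purely illustrative); files shaped this way load without any extra mapping:

{"system_prompt": "You are a helpful assistant.", "task_input": "What is the capital of France?", "task_context": ["Paris is the capital of France."], "task_output": "The capital of France is Paris.", "gold_answer": "Paris", "tags": {"split": "test"}}

If your file uses different column names, map them at load time, as in the CSV example below.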

Here's a simple example of mapping fields when loading a large CSV file:

from patronus.datasets import read_csv
from patronus.experiments import run_experiment
 
# Load a large CSV file with custom field mapping
dataset = read_csv(
    "path/to/large_dataset.csv",
    task_input_field="user_query",
    task_output_field="model_response",
    gold_answer_field="reference_answer",
    task_context_field="retrieved_documents"
)
 
# Use the dataset in an experiment
experiment = run_experiment(
    dataset=dataset,
    task=my_task,
    evaluators=[my_evaluator],
    experiment_name="Large Dataset Evaluation"
)
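
The same mapping approach applies to datasets pulled from HuggingFace or S3: download the data, write it to a local .csv or .jsonl file, and load it as above. The sketch below assumes the standard HuggingFace datasets library; the repository name and column names are placeholders:

from datasets import load_dataset
from patronus.datasets import read_csv

# Download a dataset from the HuggingFace Hub (repository name is a placeholder)
hf_dataset = load_dataset("your-org/your-dataset", split="test")

# Export it to a local CSV file so it can be loaded with the Patronus SDK
hf_dataset.to_csv("path/to/local_dataset.csv")

# Map the dataset's column names to the Patronus standard fields
dataset = read_csv(
    "path/to/local_dataset.csv",
    task_input_field="question",
    task_output_field="model_answer",
    gold_answer_field="reference_answer",
)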

For the full set of accepted parameters and examples of how to use local datasets in the Python SDK, see the Using Datasets section in the Experimentation Framework documentation.
