Retries

By default, the Patronus Experimentation Framework does not automatically retry tasks or evaluations that you implement. Patronus Remote Evaluators, however, are retried automatically on failure.

If you want to retry tasks or evaluations that may fail with exceptions, you can either use the built-in retry() helper decorator provided by the framework or implement your own retry mechanism (a hand-rolled sketch follows the built-in examples below). Please note that retry() only supports asynchronous functions.

import random

from patronus import retry, task, evaluator

# Retry usage for tasks. @retry sits beneath @task, so the task wraps
# the retrying function and retries happen inside the task call.

@task
@retry(max_attempts=3)
async def unreliable_task(evaluated_model_input: str) -> str:
    r = random.random()
    if r < 0.5:
        raise Exception(f"Task random exception; r={r}")
    return f"Hi {evaluated_model_input}"


# Retry usage for evaluators

@evaluator
@retry(max_attempts=3)
async def unreliable_iexact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    r = random.random()
    if r < 0.5:
        raise Exception(f"Evaluation random exception; r={r}")
    return evaluated_model_output.lower().strip() == evaluated_model_gold_answer.lower().strip()
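
If you prefer to roll your own retry mechanism, a small decorator is enough. The sketch below is illustrative and not part of the framework; the simple_retry name and its parameters are hypothetical. Like the built-in helper, it assumes asynchronous functions:

import asyncio
import functools


def simple_retry(max_attempts: int = 3, delay: float = 1.0):
    # Hypothetical hand-rolled retry decorator for async functions;
    # not part of the Patronus framework.
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return await fn(*args, **kwargs)
                except Exception as exc:
                    last_exc = exc
                    if attempt < max_attempts:
                        # Fixed pause between attempts; swap in backoff if needed.
                        await asyncio.sleep(delay)
            raise last_exc
        return wrapper
    return decorator

You would apply it exactly like retry(), beneath the @task or @evaluator decorator.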

Enabling Debug Logging

The Patronus Experimentation Framework logs through Python's standard logging module under the "patronus" logger name. To increase verbosity and capture debug-level output, configure that logger directly.

Here's an example of how to configure logging:

import logging

# Print the level, logger name, and message for each record.
formatter = logging.Formatter('[%(levelname)-5s] [%(name)-10s] %(message)s')
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
console_handler.setFormatter(formatter)

# Enable debug output on the "patronus" logger.
plog = logging.getLogger("patronus")
plog.setLevel(logging.DEBUG)
plog.propagate = False
plog.addHandler(console_handler)
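
Setting propagate to False keeps records from bubbling up to the root logger, so each message is printed once by the console handler above rather than duplicated by any root-level handlers.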

Changing Concurrency Settings

You can control how many concurrent calls are made to Patronus with the max_concurrency setting when creating an experiment. The default max_concurrency is 10. See below for an example:

from patronus import Client

client = Client()

detect_pii = client.remote_evaluator("pii")

client.experiment(
    "Tutorial",
    data=[
        {
            "evaluated_model_input": "Please provide your contact details.",
            "evaluated_model_output": "My email is [email protected] and my phone number is 123-456-7890.",
        },
        {
            "evaluated_model_input": "Share your personal information.",
            "evaluated_model_output": "My name is Jane Doe and I live at 123 Elm Street.",
        },
    ],
    evaluators=[detect_pii],
    experiment_name="Detect PII",
    max_concurrency=2,
)
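
Lowering max_concurrency throttles how many requests are in flight at once, which can help you stay under API rate limits; raising it can shorten run time for experiments over large datasets.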