Evaluator weights
Assigning weights to evaluators in experiments
Evaluator weights are only supported when using evaluators within the experiment framework. This feature is not available for standalone evaluator usage.
You can assign weights to evaluators to indicate their relative importance in your evaluation strategy. Weights can be provided as either strings or floats representing valid decimal numbers and are automatically stored as experiment metadata.
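As a minimal sketch of what "a valid decimal number" can mean in practice, the hypothetical helper below checks a weight before it is passed to an evaluator. It is not part of the Patronus SDK: the name `parse_weight` and the decision to reject negative and non-finite values are assumptions made for illustration, and the SDK's own validation rules may differ.

```python
from decimal import Decimal, InvalidOperation
from typing import Union


def parse_weight(weight: Union[str, float]) -> Decimal:
    """Hypothetical helper: convert a string or float weight to a Decimal.

    Raises ValueError if the value is not a finite, non-negative decimal number.
    """
    try:
        # Converting via str() avoids binary float artifacts (0.3 stays 0.3).
        value = Decimal(str(weight))
    except InvalidOperation as exc:
        raise ValueError(f"not a valid decimal number: {weight!r}") from exc
    # Assumption: negative, NaN, and infinite weights are treated as invalid.
    if not value.is_finite() or value < 0:
        raise ValueError(f"weight must be finite and non-negative: {weight!r}")
    return value
```

With this sketch, `parse_weight("0.3")` and `parse_weight(0.3)` produce the same `Decimal` value, while a string such as `"abc"` raises `ValueError`.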
Weights work consistently across all evaluator types but are configured differently depending on whether you're using remote evaluators, function-based evaluators, or class-based evaluators.
Here's a comprehensive example demonstrating weighted evaluators of all three types:
from patronus.experiments import FuncEvaluatorAdapter, run_experiment
from patronus import RemoteEvaluator, EvaluationResult, StructuredEvaluator, evaluator
from patronus.datasets import Row


class DummyEvaluator(StructuredEvaluator):
    def evaluate(self, task_output: str, gold_answer: str, **kwargs) -> EvaluationResult:
        return EvaluationResult(score_raw=1, pass_=True)


@evaluator
def exact_match(row: Row, **kwargs) -> bool:
    return row.task_output.lower().strip() == row.gold_answer.lower().strip()


experiment = run_experiment(
    project_name="Weighted Evaluation Example",
    dataset=[
        {
            "task_input": "Please provide your contact details.",
            "task_output": "My email is john.doe@example.com and my phone number is 123-456-7890.",
            "gold_answer": "My email is john.doe@example.com and my phone number is 123-456-7890.",
        },
        {
            "task_input": "Share your personal information.",
            "task_output": "My name is Jane Doe and I live at 123 Elm Street.",
            "gold_answer": "My name is Jane Doe and I live at 123 Elm Street.",
        },
    ],
    evaluators=[
        RemoteEvaluator("pii", "patronus:pii:1", weight="0.3"),  # Remote evaluator with string weight
        FuncEvaluatorAdapter(exact_match, weight="0.3"),  # Function evaluator with string weight
        DummyEvaluator(weight="0.4"),  # Class evaluator with string weight
    ],
    experiment_name="Weighted Evaluators Demo"
)
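The SDK records weights as experiment metadata; this page does not describe whether or how they are combined into a single score. If you want to aggregate results yourself, one possible approach is a weighted average, sketched below. The function `weighted_score` and the evaluator names used as dictionary keys are hypothetical and not part of the Patronus SDK.

```python
from decimal import Decimal
from typing import Mapping, Union


def weighted_score(
    scores: Mapping[str, float],
    weights: Mapping[str, Union[str, float]],
) -> float:
    """Hypothetical helper: weighted average of per-evaluator scores.

    Both mappings are keyed by evaluator name; weights may be strings or floats.
    """
    # Convert through str() so that "0.3" and 0.3 are treated identically.
    total_weight = sum(Decimal(str(weights[name])) for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(
        Decimal(str(scores[name])) * Decimal(str(weights[name])) for name in scores
    )
    # Dividing by the total means the weights need not sum to exactly 1.
    return float(weighted_sum / total_weight)
```

For instance, with the weights from the example above and illustrative scores of 1.0, 1.0, and 0.5 for the three evaluators, this sketch yields 0.3 + 0.3 + 0.2 = 0.8.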