Retrieval Evaluators Script

We recommend reading the Retrieval Evaluators section before using this script.

import requests

API_KEY = "INSERT_YOUR_API_KEY_HERE"

retrieval_samples = [
    {
        "evaluated_model_input": "What is one of the biggest benefits of Gen AI?",
        "evaluated_model_retrieved_context": [
            "One of AI's biggest benefits, some believe, is that it will free up our precious time to pursue higher ideals. Although deep generative models are very promising, their objective is to mimic a dataset, and as we know, similarity isn't enough if you truly want to innovate. Just because we'll be using the same systems doesn't mean we'll be generating the same outputs. AI, in fact, suggests that we should challenge ourselves to do otherwise and make each result our own.",
            "As engineers and designers, we often don't want to rehash a design that's already out there. What is exciting is that everyone will use generative AI differently, which means each person's experiments can generate unique innovations or value.",
            "ChatGPT holds up the mirror to humanity. It might be considered clever and be able to produce incredible art, literature and music – but only we can burst into tears at the sight of sheer beauty or brilliance. We will always be the best at being humans than any other machine, computer or robot that we could ever create. If this works out as planned, our species could be prompted into being the best versions of ourselves. Imagine that.",
        ],
        "evaluated_model_output": "Gen AI will free up humanity's time so that we humans can focus on more purposeful ideals.",
    },
    {
        "evaluated_model_input": "What is one of the biggest benefits of Gen AI?",
        "evaluated_model_retrieved_context": [
            "One of AI's biggest benefits, some believe, is that it will free up our precious time to pursue higher ideals. Although deep generative models are very promising, their objective is to mimic a dataset, and as we know, similarity isn't enough if you truly want to innovate. Just because we'll be using the same systems doesn't mean we'll be generating the same outputs. AI, in fact, suggests that we should challenge ourselves to do otherwise and make each result our own.",
            "As engineers and designers, we often don't want to rehash a design that's already out there. What is exciting is that everyone will use generative AI differently, which means each person's experiments can generate unique innovations or value.",
            "ChatGPT holds up the mirror to humanity. It might be considered clever and be able to produce incredible art, literature and music – but only we can burst into tears at the sight of sheer beauty or brilliance. We will always be the best at being humans than any other machine, computer or robot that we could ever create. If this works out as planned, our species could be prompted into being the best versions of ourselves. Imagine that.",
        ],
        "evaluated_model_output": "Gen AI will generate massive amounts of capital for any company clever enough to hop on the train early.",
    },
    {
        "evaluated_model_input": "What is one of the biggest benefits of Gen AI?",
        "evaluated_model_retrieved_context": [
            "One of AI's biggest benefits, some believe, is that it will free up our precious time to pursue higher ideals. Although deep generative models are very promising, their objective is to mimic a dataset, and as we know, similarity isn't enough if you truly want to innovate. Just because we'll be using the same systems doesn't mean we'll be generating the same outputs. AI, in fact, suggests that we should challenge ourselves to do otherwise and make each result our own.",
            "As engineers and designers, we often don't want to rehash a design that's already out there. What is exciting is that everyone will use generative AI differently, which means each person's experiments can generate unique innovations or value.",
            "ChatGPT holds up the mirror to humanity. It might be considered clever and be able to produce incredible art, literature and music – but only we can burst into tears at the sight of sheer beauty or brilliance. We will always be the best at being humans than any other machine, computer or robot that we could ever create. If this works out as planned, our species could be prompted into being the best versions of ourselves. Imagine that.",
        ],
        "evaluated_model_output": "Gen AI will make engineers more artistic and give them more free time to pursue art.",
    },
    {
        "evaluated_model_input": "What is one of the biggest benefits of Gen AI?",
        "evaluated_model_retrieved_context": [
            "One of AI's biggest benefits, some believe, is that it will free up our precious time to pursue higher ideals. Although deep generative models are very promising, their objective is to mimic a dataset, and as we know, similarity isn't enough if you truly want to innovate. Just because we'll be using the same systems doesn't mean we'll be generating the same outputs. AI, in fact, suggests that we should challenge ourselves to do otherwise and make each result our own.",
            "As engineers and designers, we often don't want to rehash a design that's already out there. What is exciting is that everyone will use generative AI differently, which means each person's experiments can generate unique innovations or value.",
            "ChatGPT holds up the mirror to humanity. It might be considered clever and be able to produce incredible art, literature and music – but only we can burst into tears at the sight of sheer beauty or brilliance. We will always be the best at being humans than any other machine, computer or robot that we could ever create. If this works out as planned, our species could be prompted into being the best versions of ourselves. Imagine that.",
        ],
        "evaluated_model_output": "Generative artificial intelligence (generative AI) is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. AI technologies attempt to mimic human intelligence in nontraditional computing tasks like image recognition, natural language processing (NLP), and translation. Generative AI is the next step in artificial intelligence. You can train it to learn human language, programming languages, art, chemistry, biology, or any complex subject matter. It reuses training data to solve new problems. For example, it can learn English vocabulary and create a poem from the words it processes. Your organization can use generative AI for various purposes, like chatbots, media creation, and product development and design.",
    },
]

headers = {
    "Content-Type": "application/json",
    "X-API-KEY": API_KEY,
}

for i, sample in enumerate(retrieval_samples):
    data = {
        "evaluators": [
            {"evaluator": "retrieval-answer-relevance"},
            {"evaluator": "retrieval-context-relevance"},
            {
                "evaluator": "retrieval-hallucination-lynx-large",
                "explain_strategy": "always",
            },
        ],
        "evaluated_model_input": sample["evaluated_model_input"],
        "evaluated_model_retrieved_context": sample["evaluated_model_retrieved_context"],
        "evaluated_model_output": sample["evaluated_model_output"],
        "app": "demo_retrieval_with_explanations",
    }
    response = requests.post(
        "https://api.patronus.ai/v1/evaluate",
        headers=headers,
        json=data,
        timeout=60,  # assumed value; stops the script from hanging if the API never responds
    )
    response.raise_for_status()

    results = response.json()["results"]
    print("------------------------------------")
    print(f"Evaluated Model Input : {sample['evaluated_model_input']}")
    print(f"Evaluated Model Retrieved Context : {sample['evaluated_model_retrieved_context']}")
    print(f"Evaluated Model Output: {sample['evaluated_model_output']}")
    print("------------------------------------")
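    for result in results:
        # Skip entries that carry no evaluation result instead of failing on None.
        evaluation_result = result.get("evaluation_result")
        if evaluation_result is None:
            print("No evaluation result returned for this entry.")
            print("------------------------------------")
            continue
        evaluator_id = evaluation_result.get("evaluator_id")
        passed = bool(evaluation_result["pass"])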

        print(f"{evaluator_id}: {'PASS' if passed else 'FAIL'}")
        print("------------------------------------")
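When the script runs over many samples, a per-evaluator tally can be easier to scan than the line-by-line output. The sketch below is not part of the original script. `tally_results` is a hypothetical helper, and it assumes only the result shape the script already relies on: each entry holds an `evaluation_result` dict with `evaluator_id` and `pass`. The sample data is made up for illustration and is not real API output.

```python
from collections import Counter


def tally_results(results):
    """Count PASS/FAIL per evaluator for one response's "results" list.

    Hypothetical helper. Assumes each entry looks like
    {"evaluation_result": {"evaluator_id": ..., "pass": ...}},
    which is the shape the script above reads. Entries without an
    evaluation result are counted under ("unknown", "MISSING").
    """
    tally = Counter()
    for result in results:
        evaluation_result = result.get("evaluation_result")
        if evaluation_result is None:
            tally[("unknown", "MISSING")] += 1
            continue
        evaluator_id = evaluation_result.get("evaluator_id") or "unknown"
        status = "PASS" if evaluation_result.get("pass") else "FAIL"
        tally[(evaluator_id, status)] += 1
    return tally


# Made-up results for illustration only, not real API output.
example_results = [
    {"evaluation_result": {"evaluator_id": "retrieval-answer-relevance", "pass": True}},
    {"evaluation_result": {"evaluator_id": "retrieval-context-relevance", "pass": False}},
    {"evaluation_result": None},
]

for (evaluator_id, status), count in sorted(tally_results(example_results).items()):
    print(f"{evaluator_id}: {status} x{count}")
```

To aggregate across all samples, one option is to keep a single `Counter` outside the loop and add each call's return value to it with `+=`.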