Evaluations for Medical & Pharmaceutical LLM Systems

We provide evaluators to detect failures in medical and pharmaceutical applications. We can evaluate the following types of LLM systems:

  • Patient-facing medical assistants and chatbots
  • Multimodal-based clinical transcripts
  • Summaries of diseases, drugs, clinical research
  • Physician-facing assistants

Evaluation Suggestions

Guardrails

  • PHI detection
  • Prompt injections
  • Explicit content
  • Harmful advice
  • Sensitive topics
  • Hate speech
  • Racial bias
  • Toxicity

Clinical

  • Clinical appropriateness
  • Clinical relevance
  • Clinical safety
  • Clinical / pharmaceutical accuracy
  • Conciseness + helpfulness of agent
  • Tone control

RAG-based

  • Hallucination
  • Answer relevance
  • Context relevance
  • Context sufficiency