Architecture Overview

Overview of Patronus AI's Kubernetes-based architecture, including core services, storage components, and authentication integration for self-hosted deployments.

Architecture Diagram

Patronus AI runs in Kubernetes. The core application consists of the following components:

Application Services

  • Patronus App (Frontend)
  • Patronus Backend API
  • Evaluation API
  • Accounts API
  • Evaluators
  • Patronus OpenTelemetry Collector
  • Patronus Trace Insights
  • Admin Portal
  • Containerized Models (optional, requires GPU resources)

Storage Services

  • PostgreSQL (can be external for production)
  • Redis (can be external for production)
  • ClickHouse (optional, recommended for tracing features)
  • Weaviate (optional, recommended for Patronus Trace Insights)
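The component inventory above can be captured as a small data structure, which is handy when checking a deployment plan against the list. This is only an illustrative sketch: the `Component` class, its field names, and `required_components` are hypothetical and not part of any Patronus tooling; the component names and the optional/external flags are taken from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One entry of the architecture inventory (hypothetical model)."""
    name: str
    kind: str                      # "application" or "storage"
    optional: bool = False         # marked "optional" in the lists above
    can_be_external: bool = False  # may run outside the cluster in production


COMPONENTS = [
    Component("Patronus App (Frontend)", "application"),
    Component("Patronus Backend API", "application"),
    Component("Evaluation API", "application"),
    Component("Accounts API", "application"),
    Component("Evaluators", "application"),
    Component("Patronus OpenTelemetry Collector", "application"),
    Component("Patronus Trace Insights", "application"),
    Component("Admin Portal", "application"),
    Component("Containerized Models", "application", optional=True),
    Component("PostgreSQL", "storage", can_be_external=True),
    Component("Redis", "storage", can_be_external=True),
    Component("ClickHouse", "storage", optional=True),
    Component("Weaviate", "storage", optional=True),
]


def required_components(components):
    """Return the names of components that are not marked optional."""
    return [c.name for c in components if not c.optional]


if __name__ == "__main__":
    for name in required_components(COMPONENTS):
        print(name)
```

Filtering on `optional` gives the minimum set of services a deployment needs; filtering on `can_be_external` gives the candidates for managed services in production.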

Service Details

Core Services

Patronus App (Frontend)

  • The entry point to the application. The frontend serves the Patronus platform and routes API requests to other services.

Backend API

  • The primary API gateway that routes calls to internal services. It enforces authentication and authorization, handles requests from the frontend application and Python SDK, and exposes public /v1/... routes.

Evaluation API

  • Provides the /evaluate API that powers evaluation requests in LLM monitoring, experiments, and evaluation runs. It also maintains the evaluator registry, including the management of evaluator profiles and prompt templates.

Evaluators

  • Microservices that provide evaluations such as PII detection, hallucination detection, and more. All Patronus evaluators follow a standardized API schema for use by the Evaluation API.
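To illustrate why a standardized evaluator schema matters, here is a minimal sketch of a dispatcher calling any evaluator through a single interface. The actual Patronus schema is not described on this page, so everything below is an assumption made for illustration: `EvaluationRequest`, `EvaluationResult`, the `Evaluator` protocol, the `REGISTRY` dictionary, and the toy email detector are all hypothetical names and do not reflect the real API.

```python
import re
from dataclasses import dataclass
from typing import Protocol


@dataclass
class EvaluationRequest:
    evaluator: str  # hypothetical: name of the evaluator to run
    output: str     # hypothetical: the LLM output to evaluate


@dataclass
class EvaluationResult:
    evaluator: str
    passed: bool
    explanation: str


class Evaluator(Protocol):
    """Shared interface that every evaluator in this sketch implements."""
    name: str

    def evaluate(self, request: EvaluationRequest) -> EvaluationResult: ...


class ToyEmailDetector:
    """Stand-in for a PII evaluator: fails if the output contains an email."""
    name = "toy-email-detector"
    _EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def evaluate(self, request: EvaluationRequest) -> EvaluationResult:
        found = self._EMAIL.findall(request.output)
        explanation = f"found {len(found)} email address(es)"
        return EvaluationResult(self.name, passed=not found, explanation=explanation)


# Hypothetical registry: maps evaluator names to implementations.
REGISTRY: dict[str, Evaluator] = {ToyEmailDetector.name: ToyEmailDetector()}


def dispatch(request: EvaluationRequest) -> EvaluationResult:
    """Look up the requested evaluator and run it through the shared interface."""
    evaluator = REGISTRY.get(request.evaluator)
    if evaluator is None:
        raise KeyError(f"unknown evaluator: {request.evaluator}")
    return evaluator.evaluate(request)
```

Because every evaluator accepts the same request type and returns the same result type, the dispatcher needs no evaluator-specific code; adding a new evaluator only means adding a registry entry.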

Accounts API

  • Manages organizations, accounts and users. Integrates with the authentication provider.

Admin Portal

  • Administrative interface for managing the Patronus platform configuration and settings.

Observability Services

Patronus OpenTelemetry Collector

  • Collects, processes, and exports telemetry data (traces, metrics, and logs) from LLM applications. This service enables observability for monitoring application performance and behavior.

Patronus Trace Insights

  • Analyzes and visualizes trace data collected by the OpenTelemetry Collector. Provides insights into LLM application execution patterns, latency, and dependencies to help identify performance bottlenecks and issues.

Optional Services

Containerized Models

  • Containerized model services that require GPU resources. Installation is optional and depends on your security requirements and infrastructure capabilities.

Storage Components

PostgreSQL

  • Patronus uses PostgreSQL to store data across all workloads, including evaluation logs and associated metadata.

Note: PostgreSQL can be configured as an external service. This is recommended for production environments.

Redis

  • Used for caching operations and storing state in evaluation runs.

Note: Redis can be configured as an external service. This is recommended for production environments.

ClickHouse

  • Used for storing evaluation logs and associated metadata for tracing features. This is an optional service that can be replaced by PostgreSQL, but ClickHouse is recommended for better performance.

Weaviate

  • Vector database used by Patronus Trace Insights for efficient similarity search and semantic analysis. This is an optional service that can be replaced by PostgreSQL with the pgvector extension, but Weaviate is recommended for better performance.
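The fallback rules for the optional storage services can be summarized in a few lines. This is a sketch under stated assumptions: `StorageOptions`, its flags, `resolve_backends`, and the returned labels are hypothetical and are not actual Patronus configuration keys. Only the fallback logic itself (ClickHouse falls back to PostgreSQL, Weaviate falls back to PostgreSQL with pgvector) comes from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOptions:
    """Hypothetical deployment flags for the optional storage services."""
    clickhouse_enabled: bool = False
    weaviate_enabled: bool = False


def resolve_backends(options: StorageOptions) -> dict[str, str]:
    """Map each storage role to the backend that would serve it."""
    return {
        # PostgreSQL and Redis are always part of the deployment.
        "primary": "postgresql",
        "cache_and_run_state": "redis",
        # Optional services fall back to PostgreSQL when they are not installed.
        "tracing": "clickhouse" if options.clickhouse_enabled else "postgresql",
        "vector_search": (
            "weaviate" if options.weaviate_enabled else "postgresql+pgvector"
        ),
    }


if __name__ == "__main__":
    print(resolve_backends(StorageOptions(clickhouse_enabled=True)))
```

With both flags left at their defaults, every role resolves to PostgreSQL or Redis, which matches the minimal deployment described above.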

Authentication Provider

  • For production environments, Patronus integrates with vouch-proxy so that customers can use their own Identity Provider (IdP) for authentication. Vouch-proxy supports most OIDC and OAuth2 providers out of the box. Learn more about supported SSO providers in the vouch-proxy documentation.

Note: For POC environments, simple username and password authentication is also supported.

