Architecture Overview

Overview of Patronus AI's Kubernetes-based architecture, including core services, storage components, and authentication integration for self-hosted deployments.

Architecture Diagram

Patronus AI runs in Kubernetes. The core application consists of the following components:

Application Services

Patronus App (Frontend)
Patronus Backend API
Evaluation API
Accounts API
Evaluators
Patronus OpenTelemetry Collector
Patronus Trace Insights
Admin Portal
Containerized Models (optional, requires GPU resources)

Storage Services

PostgreSQL (can be external for production)
Redis (can be external for production)
ClickHouse (optional, recommended for tracing features)
Weaviate (optional, recommended for Patronus Trace Insights)
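The inventory above can be captured as a small data structure, which is handy when checking a deployment plan against the required and optional components. This is only a sketch: the `Component` class and `required_components` helper are illustrative names, not part of any Patronus tooling, and the component names are copied from the lists above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Component:
    """One entry from the component lists above (hypothetical helper type)."""
    name: str
    kind: str  # "application" or "storage"
    optional: bool = False


# Names and optional flags mirror the lists in this section.
COMPONENTS = [
    Component("Patronus App (Frontend)", "application"),
    Component("Patronus Backend API", "application"),
    Component("Evaluation API", "application"),
    Component("Accounts API", "application"),
    Component("Evaluators", "application"),
    Component("Patronus OpenTelemetry Collector", "application"),
    Component("Patronus Trace Insights", "application"),
    Component("Admin Portal", "application"),
    Component("Containerized Models", "application", optional=True),
    Component("PostgreSQL", "storage"),
    Component("Redis", "storage"),
    Component("ClickHouse", "storage", optional=True),
    Component("Weaviate", "storage", optional=True),
]


def required_components(kind: Optional[str] = None) -> List[str]:
    """Return the names of non-optional components, optionally filtered by kind."""
    return [
        c.name
        for c in COMPONENTS
        if not c.optional and (kind is None or c.kind == kind)
    ]


if __name__ == "__main__":
    print(required_components("storage"))
```

Calling `required_components("storage")` returns only PostgreSQL and Redis, which matches the list above: ClickHouse and Weaviate are recommended but optional.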

Service Details

Core Services

Patronus App (Frontend)

The frontend is the entry point to the application. It serves the Patronus platform and routes API requests to other services.

Patronus Backend API

The primary API gateway that routes calls to internal services. It enforces authentication and authorization, handles requests from the frontend application and Python SDK, and exposes public /v1/... routes.

Evaluation API

Provides the /evaluate API that powers evaluation requests in LLM monitoring, experiments, and evaluation runs. It also supports the evaluator registry, including management of evaluator profiles and prompt templates.

Evaluators

Microservices that provide evaluations such as PII detection, hallucination detection, and more. All Patronus evaluators follow a standardized API schema for use by the Evaluation API.
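To illustrate why a standardized schema matters, the sketch below shows how a caller can dispatch one request to any number of evaluators when they all share the same request and result shapes. Everything here is hypothetical: the field names (`input`, `output`, `passed`, `score`), the `Evaluator` protocol, and the toy email detector are assumptions made for illustration and do not reflect the actual Patronus evaluator schema.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass
class EvaluationRequest:
    """Hypothetical request shape shared by every evaluator."""
    input: str
    output: str
    context: Optional[str] = None


@dataclass
class EvaluationResult:
    """Hypothetical result shape shared by every evaluator."""
    evaluator: str
    passed: bool
    score: float
    explanation: str = ""


class Evaluator(Protocol):
    """Any object with a name and an evaluate() method fits the schema."""
    name: str

    def evaluate(self, request: EvaluationRequest) -> EvaluationResult: ...


class ToyEmailPIIEvaluator:
    """Toy stand-in for a PII evaluator: flags email-like strings only."""
    name = "toy-email-pii"
    _pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def evaluate(self, request: EvaluationRequest) -> EvaluationResult:
        matches = self._pattern.findall(request.output)
        return EvaluationResult(
            evaluator=self.name,
            passed=not matches,
            score=0.0 if matches else 1.0,
            explanation=f"{len(matches)} email-like string(s) found",
        )


def run_all(evaluators: List[Evaluator], request: EvaluationRequest) -> List[EvaluationResult]:
    """Dispatch one request to every evaluator; possible because the schema is uniform."""
    return [e.evaluate(request) for e in evaluators]
```

Because every evaluator accepts and returns the same types, adding a new evaluator means adding one more object to the list passed to `run_all`; the dispatching code does not change.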

Accounts API

Manages organizations, accounts, and users, and integrates with the authentication provider.

Admin Portal

Administrative interface for managing the Patronus platform configuration and settings.

Observability Services

Patronus OpenTelemetry Collector

Collects, processes, and exports telemetry data (traces, metrics, and logs) from LLM applications. This service enables observability for monitoring application performance and behavior.
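Applications instrumented with OpenTelemetry are typically pointed at a collector through the standard `OTEL_*` environment variables. The sketch below builds those variables for a given collector host. It assumes the Patronus collector listens on the default OTLP ports (4317 for gRPC, 4318 for HTTP) over plain HTTP inside the cluster, which this page does not confirm; the host name in the usage note is a placeholder, and `otlp_env` is a hypothetical helper.

```python
from typing import Dict

# Default OTLP ports from the OpenTelemetry specification.
# Assumption: the Patronus collector exposes these defaults.
_DEFAULT_OTLP_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def otlp_env(collector_host: str, service_name: str, protocol: str = "grpc") -> Dict[str, str]:
    """Build standard OpenTelemetry SDK environment variables for an OTLP exporter."""
    if protocol not in _DEFAULT_OTLP_PORTS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    port = _DEFAULT_OTLP_PORTS[protocol]
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
        # Assumption: plain HTTP inside the cluster; use https:// if TLS is enabled.
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{collector_host}:{port}",
    }


if __name__ == "__main__":
    # "collector.example.internal" is a placeholder, not a real Patronus host name.
    for key, value in otlp_env("collector.example.internal", "my-llm-app").items():
        print(f"{key}={value}")
```

If your deployment requires authentication headers or TLS on the collector endpoint, those settings would need to be added separately; check your installation's configuration for the actual host, port, and protocol.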

Patronus Trace Insights

Analyzes and visualizes trace data collected by the OpenTelemetry Collector. Provides insights into LLM application execution patterns, latency, and dependencies to help identify performance bottlenecks and issues.

Optional Services

Containerized Models

Containerized model services that require GPU resources. Installation is optional and depends on your security requirements and infrastructure capabilities.

Storage Components

PostgreSQL

Patronus uses PostgreSQL to store data across all workloads, including evaluation logs and associated metadata.

Note: PostgreSQL can be configured as an external service. This is recommended for production environments.

Redis

Redis is used for caching and for storing state during evaluation runs.

Note: Redis can be configured as an external service. This is recommended for production environments.

ClickHouse

ClickHouse stores evaluation logs and associated metadata for tracing features. It is optional and can be replaced by PostgreSQL, but ClickHouse is recommended for better performance.

Weaviate

Weaviate is the vector database used by Patronus Trace Insights for efficient similarity search and semantic analysis. It is optional and can be replaced by PostgreSQL with the pgvector extension, but Weaviate is recommended for better performance.
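The fallback rules described in this section can be summarized in a few lines of code. The sketch below maps each storage role to the backend that would serve it, depending on whether the optional services are installed. The function name, the role keys, and the boolean flags are hypothetical and do not correspond to actual Patronus configuration options.

```python
from typing import Dict


def resolve_storage(clickhouse_enabled: bool, weaviate_enabled: bool) -> Dict[str, str]:
    """Map each storage role to the backend that serves it.

    PostgreSQL and Redis are always present; ClickHouse and Weaviate are
    optional and fall back to PostgreSQL (with pgvector for vector search).
    """
    return {
        "primary_data": "PostgreSQL",
        "cache_and_run_state": "Redis",
        "tracing_logs": "ClickHouse" if clickhouse_enabled else "PostgreSQL",
        "vector_search": "Weaviate" if weaviate_enabled else "PostgreSQL with pgvector",
    }


if __name__ == "__main__":
    # Minimal installation: both optional stores disabled.
    print(resolve_storage(clickhouse_enabled=False, weaviate_enabled=False))
```

With both flags set to `False`, every role except caching resolves to PostgreSQL, which is the minimal footprint described above; enabling either optional store moves only its own role.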

Authentication Provider

For production environments, we integrate with vouch-proxy so that customers can use their own Identity Provider (IdP) for authentication. Vouch-proxy supports most OIDC and OAuth2 providers out of the box. Learn more about supported SSO providers in the vouch-proxy documentation.

Note: For POC environments, simple username and password authentication is also supported.
