
Databricks

1. What the integration does

When you run MLflow inside Databricks, every experiment emits OpenTelemetry‑compatible traces. By directing those traces to Patronus AI’s managed OTel Collector, you gain real‑time visibility, evaluation metrics, and alerting—with zero extra code beyond three environment variables.

2. One-minute setup (3 env-vars)

# Cluster / Job environment variables
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otel.patronus.ai:4317"
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=<YOUR_PATRONUS_API_KEY>,pat-project-name=<YOUR_PROJECT_NAME>"
export OTEL_SERVICE_NAME="<YOUR_SERVICE_NAME>"
Variable guide
  • OTEL_EXPORTER_OTLP_ENDPOINT (required) — Tells MLflow where to ship traces (Patronus Collector).
  • OTEL_EXPORTER_OTLP_HEADERS (required) — Carries your API key (x-api-key) for authentication; append pat-project-name=... only if you want to group traces under a Patronus project.
  • OTEL_SERVICE_NAME (required) — Human‑readable label (e.g., feature-store, recs-model-train) shown in Patronus UI.

  • Databricks tip: Set these under Cluster → Configuration → Environment Variables or inject them at job runtime via spark_env_vars.
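For programmatic cluster or job creation, the same three variables go under spark_env_vars. Below is a minimal sketch of a job-cluster spec expressed as a Python dict (the spark_version, node_type_id, and worker count are illustrative placeholders, not recommendations):

# Sketch: new_cluster spec for the Databricks Jobs API; the OTel
# variables are injected via spark_env_vars. <...> values are yours.
new_cluster = {
    "spark_version": "<YOUR_RUNTIME_VERSION>",  # placeholder
    "node_type_id": "<YOUR_NODE_TYPE>",         # placeholder
    "num_workers": 1,
    "spark_env_vars": {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "https://otel.patronus.ai:4317",
        "OTEL_EXPORTER_OTLP_HEADERS": "x-api-key=<YOUR_PATRONUS_API_KEY>,pat-project-name=<YOUR_PROJECT_NAME>",
        "OTEL_SERVICE_NAME": "<YOUR_SERVICE_NAME>",
    },
}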

3. Install the OTLP exporter (once per cluster)

%pip install --quiet opentelemetry-exporter-otlp

Databricks caches the wheel, so subsequent jobs start fast.
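
To verify the install, try importing the gRPC span exporter class the package provides (this import path is the standard one in the OpenTelemetry Python distribution):

# Sanity check: the OTLP gRPC span exporter should now be importable.
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

print(OTLPSpanExporter.__name__)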

4. Smoke-test the connection (optional)

import time

import mlflow

with mlflow.start_span(name="databricks-smoke-test") as span:
    span.set_inputs({"check": "otel"})  # record what the span received
    time.sleep(1)                       # simulate a short unit of work
    span.set_outputs({"status": "ok"})  # record what the span produced

Open Patronus AI → Traces and confirm the span appears under the service name you set.
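
You can also check from the notebook side that MLflow recorded the trace; mlflow.search_traces returns recent traces as a DataFrame (a sketch; exact columns vary by MLflow version):

# Notebook-side check: list the most recent traces MLflow recorded.
import mlflow

traces = mlflow.search_traces(max_results=5)
print(traces.head())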

5. How it works under the hood

1. MLflow generates OTel traces as experiments run.
2. The OTLP exporter sends those traces to Patronus’s public endpoint, https://otel.patronus.ai:4317.
3. The Collector authenticates with your API key, tags data with pat-project-name (if provided), and forwards it to Patronus ingestion.
4. Patronus AI renders dashboards, stores history, and triggers alerts—no on‑prem infra required.
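
For intuition, this is roughly the wiring those environment variables produce if you were to configure the stock OpenTelemetry Python SDK by hand (a sketch; MLflow and the exporter do the equivalent for you):

# Manual equivalent of the three environment variables.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://otel.patronus.ai:4317",      # OTEL_EXPORTER_OTLP_ENDPOINT
    headers=(
        ("x-api-key", "<YOUR_PATRONUS_API_KEY>"),  # OTEL_EXPORTER_OTLP_HEADERS
        ("pat-project-name", "<YOUR_PROJECT_NAME>"),
    ),
)
provider = TracerProvider(
    resource=Resource.create({"service.name": "<YOUR_SERVICE_NAME>"})  # OTEL_SERVICE_NAME
)
provider.add_span_processor(BatchSpanProcessor(exporter))  # batches and ships spans
trace.set_tracer_provider(provider)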

6. Advanced tuning (optional)

Common tweaks
  • Switch protocol: OTEL_EXPORTER_OTLP_PROTOCOL=grpc (default) or http/protobuf (notebook example at the end of this section).
  • Custom headers: OTEL_EXPORTER_OTLP_HEADERS="x-api-key=…,pat-project-name=…,team=mlops".
  • Separate endpoints: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=… and OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=…

  • Note on mlflow.autolog() behavior
    MLflow ≥ 2.12 no longer logs underlying model inputs/outputs when you call mlflow.autolog(). If your Databricks notebooks depend on those logs, enable the OpenAI‑specific helper instead:

# Re-enable input/output logging for OpenAI calls explicitly
import mlflow.openai

mlflow.openai.autolog()

A WARN line is printed to the console when this change is detected, but the snippet above is the quickest fix.


All OpenTelemetry configuration options are supported. See the OpenTelemetry spec for the full list.
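
For example, from a notebook you could switch transports before any spans are created (a sketch; confirm with Patronus which protocols and ports the collector accepts):

# Sketch: flip the exporter to HTTP/protobuf before tracing starts.
import os

os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"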

7. Troubleshooting quick fixes

Common fixes
  • No spans in Patronus after 5 min — Ensure the Databricks cluster has outbound access on ports 443 / 4317 (see the connectivity check below).
  • “UNAUTHENTICATED” error — Verify your API key is correct and active in Patronus AI → Settings → API Keys.
  • High latency or dropped spans — Batch traces by adding OTEL_BSP_SCHEDULE_DELAY=5000 (ms).
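
For the first item, a quick way to test outbound reachability from a notebook (a hypothetical check; it only verifies a TCP connection, not authentication):

# Quick TCP reachability check for the collector endpoint.
import socket

with socket.create_connection(("otel.patronus.ai", 4317), timeout=5):
    print("outbound 4317 reachable")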
