
Databricks

1. What the integration does

When you run MLflow inside Databricks, every experiment emits OpenTelemetry‑compatible traces. By directing those traces to Patronus AI’s managed OTel Collector, you gain real‑time visibility, evaluation metrics, and alerting—with zero extra code beyond three environment variables.

2. One-minute setup (3 env-vars)

# Cluster / Job environment variables
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otel.patronus.ai:4317"
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=<YOUR_PATRONUS_API_KEY>,pat-project-name=<YOUR_PROJECT_NAME>"
export OTEL_SERVICE_NAME="<YOUR_SERVICE_NAME>"
Variable guide
  • OTEL_EXPORTER_OTLP_ENDPOINT (required) — Tells MLflow where to ship traces (Patronus Collector).
  • OTEL_EXPORTER_OTLP_HEADERS (required) — Carries your API key (x-api-key) for authentication; append pat-project-name=... only if you want to group traces under a Patronus project.
  • OTEL_SERVICE_NAME (required) — Human‑readable label (e.g., feature-store, recs-model-train) shown in Patronus UI.

  • Databricks tip: Set these under Cluster → Configuration → Environment Variables or inject them at job runtime via spark_env_vars.
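For programmatic cluster or job creation, the same three variables go under spark_env_vars. Below is a minimal sketch of a job-cluster spec expressed as a Python dict (the spark_version, node_type_id, and worker count are illustrative placeholders, not recommendations):

# Sketch: new_cluster spec for the Databricks Jobs API; the OTel
# variables are injected via spark_env_vars. <...> values are yours.
new_cluster = {
    "spark_version": "<YOUR_RUNTIME_VERSION>",  # placeholder
    "node_type_id": "<YOUR_NODE_TYPE>",         # placeholder
    "num_workers": 1,
    "spark_env_vars": {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "https://otel.patronus.ai:4317",
        "OTEL_EXPORTER_OTLP_HEADERS": "x-api-key=<YOUR_PATRONUS_API_KEY>,pat-project-name=<YOUR_PROJECT_NAME>",
        "OTEL_SERVICE_NAME": "<YOUR_SERVICE_NAME>",
    },
}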

3. Install the OTLP exporter (once per cluster)

%pip install --quiet opentelemetry-exporter-otlp

Databricks caches the wheel, so subsequent jobs start fast.
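
To verify the install, try importing the gRPC span exporter class the package provides (this import path is the standard one in the OpenTelemetry Python distribution):

# Sanity check: the OTLP gRPC span exporter should now be importable.
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

print(OTLPSpanExporter.__name__)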

4. Smoke-test the connection (optional)

import time

import mlflow

with mlflow.start_span(name="databricks-smoke-test") as span:
    span.set_inputs({"check": "otel"})  # record what the span received
    time.sleep(1)                       # simulate a short unit of work
    span.set_outputs({"status": "ok"})  # record what the span produced

Open Patronus AI → Traces and confirm the span appears under the service name you set.
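
You can also check from the notebook side that MLflow recorded the trace; mlflow.search_traces returns recent traces as a DataFrame (a sketch; exact columns vary by MLflow version):

# Notebook-side check: list the most recent traces MLflow recorded.
import mlflow

traces = mlflow.search_traces(max_results=5)
print(traces.head())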

5. How it works under the hood

1. MLflow generates OTel traces as experiments run.
2. The OTLP exporter sends those traces to Patronus’s public endpoint, https://otel.patronus.ai:4317.
3. The Collector authenticates with your API key, tags data with pat-project-name (if provided), and forwards it to Patronus ingestion.
4. Patronus AI renders dashboards, stores history, and triggers alerts—no on‑prem infra required.
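
For intuition, this is roughly the wiring those environment variables produce if you were to configure the stock OpenTelemetry Python SDK by hand (a sketch; MLflow and the exporter do the equivalent for you):

# Manual equivalent of the three environment variables.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://otel.patronus.ai:4317",      # OTEL_EXPORTER_OTLP_ENDPOINT
    headers=(
        ("x-api-key", "<YOUR_PATRONUS_API_KEY>"),  # OTEL_EXPORTER_OTLP_HEADERS
        ("pat-project-name", "<YOUR_PROJECT_NAME>"),
    ),
)
provider = TracerProvider(
    resource=Resource.create({"service.name": "<YOUR_SERVICE_NAME>"})  # OTEL_SERVICE_NAME
)
provider.add_span_processor(BatchSpanProcessor(exporter))  # batches and ships spans
trace.set_tracer_provider(provider)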

6. Advanced tuning (optional)

Common tweaks
  • Switch protocol: OTEL_EXPORTER_OTLP_PROTOCOL=grpc (default) or http/protobuf (notebook example at the end of this section).
  • Custom headers: OTEL_EXPORTER_OTLP_HEADERS="x-api-key=…,pat-project-name=…,team=mlops".
  • Separate endpoints: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=… and OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=…

  • Note on mlflow.autolog() behavior
    MLflow ≥ 2.12 no longer logs underlying model inputs/outputs when you call mlflow.autolog(). If your Databricks notebooks depend on those logs, enable the OpenAI‑specific helper instead:

# Re-enable input/output logging for OpenAI calls explicitly
import mlflow.openai

mlflow.openai.autolog()

A WARN line is printed to the console when this change is detected, but the snippet above is the quickest fix.


All OpenTelemetry configuration options are supported. See the OpenTelemetry spec for the full list.
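
For example, from a notebook you could switch transports before any spans are created (a sketch; confirm with Patronus which protocols and ports the collector accepts):

# Sketch: flip the exporter to HTTP/protobuf before tracing starts.
import os

os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"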

7. Troubleshooting quick fixes

Common fixes
  • No spans in Patronus after 5 min — Ensure the Databricks cluster has outbound access on ports 443 / 4317 (see the connectivity check below).
  • “UNAUTHENTICATED” error — Verify your API key is correct and active in Patronus AI → Settings → API Keys.
  • High latency or dropped spans — Batch traces by adding OTEL_BSP_SCHEDULE_DELAY=5000 (ms).
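
For the first item, a quick way to test outbound reachability from a notebook (a hypothetical check; it only verifies a TCP connection, not authentication):

# Quick TCP reachability check for the collector endpoint.
import socket

with socket.create_connection(("otel.patronus.ai", 4317), timeout=5):
    print("outbound 4317 reachable")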
