Azure OpenAI Enrichment Layer

This layer extends the Kafka → Delta Lake → MLflow pipeline with Azure OpenAI-powered anomaly explanations. Instead of stopping at anomaly scores, the platform now generates structured, analyst-readable intelligence for each flagged event.

AI Enrichment Flow

After anomalous events are written to gold_anomaly_predictions, a Databricks notebook calls a deployed Azure OpenAI gpt-4.1-mini model to generate structured explanations, risk levels, and confidence scores.

Confluent Kafka → Bronze Delta → Silver Delta → MLflow Isolation Forest → gold_anomaly_predictions → Azure OpenAI GPT-4.1-mini → Pydantic Validation → gold_events_enriched → MLflow AI Evaluation → ai_enrichment_eval_metrics

Azure OpenAI Deployment

The GPT-4.1-mini model is deployed through Azure AI Foundry and used as the language model behind the anomaly enrichment step.

Azure OpenAI GPT-4.1-mini Deployment

Azure OpenAI deployment showing the GPT-4.1-mini model used for structured anomaly explanations.

Databricks to Azure OpenAI Integration

The enrichment notebook runs in Databricks and sends inference requests to Azure OpenAI. This confirms that the Databricks analytics environment can call the AI model directly as part of the downstream processing workflow.

Databricks Azure OpenAI Integration

Databricks successfully connects to Azure OpenAI and receives a model response.

Structured Enrichment Output

The AI output is validated with Pydantic and written to the gold_events_enriched Delta table. This creates a structured intelligence layer on top of the existing Gold anomaly predictions.

Gold Events Enriched Delta Table

AI-generated summaries and inferred user intent persisted to the Gold enrichment table.

AI Risk Assessment

The enrichment layer generates risk levels and confidence scores so downstream analysts can prioritize which anomalies require review.

AI Risk Assessment Output

Structured risk levels and confidence scores produced by the Azure OpenAI enrichment step.

Structured Output Validation

The notebook validates model responses before writing to Delta. In the initial sample enrichment run, all generated outputs passed structured validation.

Structured Output Validity Metric

Validation query showing 100% structured output validity across the initial enrichment sample.

AI Evaluation and Prompt Tracking

Building AI features is only part of the challenge. Production systems also need visibility into response quality, confidence levels, prompt versions, and model behavior over time.

To support observability and future prompt experimentation, the project includes a dedicated AI evaluation workflow that validates structured outputs and records enrichment quality metrics to both Delta Lake and MLflow.

gold_events_enriched

Structured Output Validation

MLflow Experiment Tracking

ai_enrichment_eval_metrics

Delta-Based Evaluation History

Evaluation results are written to a dedicated Delta table, creating a persistent audit trail of AI enrichment quality across runs. Metrics include structured output validity, confidence scores, model version, prompt version, and runtime metadata.

AI Evaluation Metrics Delta Table

AI enrichment evaluation metrics persisted to Delta Lake for historical tracking and reporting.

MLflow Experiment Tracking

MLflow is used to track AI evaluation runs, allowing prompt versions, model versions, confidence metrics, and validation rates to be compared over time.

MLflow AI Evaluation Run

MLflow experiment tracking for AI enrichment quality, confidence metrics, prompt versions, and evaluation metadata.

Key Evaluation Metrics
  • Structured output validity percentage
  • Average confidence score
  • Prompt version tracking
  • Model version tracking
  • Runtime measurements
  • Historical evaluation audit trail

Why This Layer Matters

The original pipeline identified anomalous events using MLflow-tracked Isolation Forest scoring. The Azure OpenAI layer turns those model outputs into operational intelligence by generating summaries, inferred user intent, risk explanations, confidence scores, validation metadata, and MLflow-tracked AI evaluation metrics.

```