This layer extends the Kafka → Delta Lake → MLflow pipeline with Azure OpenAI-powered anomaly explanations. Instead of stopping at anomaly scores, the platform now generates structured, analyst-readable intelligence for each flagged event.
After anomalous events are written to gold_anomaly_predictions, a Databricks notebook calls a deployed Azure OpenAI
gpt-4.1-mini model to generate structured explanations, risk levels, and confidence scores.
The GPT-4.1-mini model is deployed through Azure AI Foundry and used as the language model behind the anomaly enrichment step.
Azure OpenAI deployment showing the GPT-4.1-mini model used for structured anomaly explanations.
The enrichment notebook runs in Databricks and sends inference requests to Azure OpenAI. This confirms that the Databricks analytics environment can call the AI model directly as part of the downstream processing workflow.
Databricks successfully connects to Azure OpenAI and receives a model response.
The AI output is validated with Pydantic and written to the gold_events_enriched Delta table.
This creates a structured intelligence layer on top of the existing Gold anomaly predictions.
AI-generated summaries and inferred user intent persisted to the Gold enrichment table.
The enrichment layer generates risk levels and confidence scores so downstream analysts can prioritize which anomalies require review.
Structured risk levels and confidence scores produced by the Azure OpenAI enrichment step.
The notebook validates model responses before writing to Delta. In the initial sample enrichment run, all generated outputs passed structured validation.
Validation query showing 100% structured output validity across the initial enrichment sample.
Building AI features is only part of the challenge. Production systems also need visibility into response quality, confidence levels, prompt versions, and model behavior over time.
To support observability and future prompt experimentation, the project includes a dedicated AI evaluation workflow that validates structured outputs and records enrichment quality metrics to both Delta Lake and MLflow.
Evaluation results are written to a dedicated Delta table, creating a persistent audit trail of AI enrichment quality across runs. Metrics include structured output validity, confidence scores, model version, prompt version, and runtime metadata.
AI enrichment evaluation metrics persisted to Delta Lake for historical tracking and reporting.
MLflow is used to track AI evaluation runs, allowing prompt versions, model versions, confidence metrics, and validation rates to be compared over time.
MLflow experiment tracking for AI enrichment quality, confidence metrics, prompt versions, and evaluation metadata.
The original pipeline identified anomalous events using MLflow-tracked Isolation Forest scoring. The Azure OpenAI layer turns those model outputs into operational intelligence by generating summaries, inferred user intent, risk explanations, confidence scores, validation metadata, and MLflow-tracked AI evaluation metrics.