ML Inference & MLflow Integration

This page walks through the ML inference logic applied to the Silver Delta table using a registered Isolation Forest model. MLflow handles tracking, versioning, and evaluation, which keeps the pipeline reproducible and production-ready.

Step 1: Registering the Model in MLflow

The Isolation Forest model was trained on Silver table features and logged to MLflow. It was registered to enable versioned inference.

MLflow Model Registry Screenshot

Model registry showing version control and model ownership.
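The training and registration step could look roughly like the sketch below. It is not the project's actual training code: the registry name `isolation_forest_events`, the hyperparameters, and the helper names are assumptions made for illustration. It fits an Isolation Forest, logs it to MLflow with `registered_model_name` so a new registry version is created, and shows how a `models:/<name>/<version>` URI pins later inference to one version.

```python
MODEL_NAME = "isolation_forest_events"  # hypothetical registry name


def model_uri(name: str, version: int) -> str:
    """Build the registry URI that pins inference to one model version."""
    return f"models:/{name}/{version}"


def train_and_register(features, model_name: str = MODEL_NAME) -> str:
    """Fit an Isolation Forest, log it, and register it; returns the run ID."""
    # Imported lazily so the URI helper above works without these packages.
    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import IsolationForest

    params = {"n_estimators": 100, "random_state": 42}  # assumed settings
    model = IsolationForest(**params)
    model.fit(features)

    with mlflow.start_run(run_name="train_isolation_forest") as run:
        mlflow.log_params(params)
        # registered_model_name creates the registry entry on first use
        # and adds a new version on every later call.
        mlflow.sklearn.log_model(model, "model", registered_model_name=model_name)
    return run.info.run_id
```

A batch job would then load a fixed version with `mlflow.pyfunc.load_model(model_uri(MODEL_NAME, 1))`, so a retrained model only takes effect once the version number is changed deliberately.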

Step 2: MLflow Run Metadata

The batch inference run logs metrics, parameters, and artifacts, such as predictions and evaluation results, to MLflow.

MLflow Run Screenshot

Overview of MLflow experiment run tracking all pipeline metadata.
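As a minimal sketch of how a batch inference run might record this metadata, the example below writes the predictions to a CSV file and attaches it to an MLflow run together with parameters and a metric. The function names, the run name `batch_inference`, and the file name `predictions.csv` are assumptions, not taken from the pipeline itself.

```python
import csv
import tempfile
from pathlib import Path


def write_predictions(rows: list, path: Path) -> Path:
    """Write scored rows (a list of dicts) to a CSV file."""
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def log_inference_run(rows: list, params: dict) -> str:
    """Log parameters, a metric, and the predictions file; returns the run ID."""
    import mlflow  # imported lazily; only needed when a run is recorded

    with tempfile.TemporaryDirectory() as tmp:
        predictions_file = write_predictions(rows, Path(tmp) / "predictions.csv")
        with mlflow.start_run(run_name="batch_inference") as run:
            mlflow.log_params(params)
            mlflow.log_metric("events_scored", len(rows))
            # The file is copied into the run's artifact store during this call,
            # so the temporary directory can be removed afterwards.
            mlflow.log_artifact(str(predictions_file), artifact_path="predictions")
    return run.info.run_id
```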

Step 3: Tracked Metrics & Artifacts

MLflow automatically tracked the anomaly score distribution and the total number of events scored, for both audit and monitoring.

MLflow Metrics Visualization

MLflow tracked metrics: 1020 events scored, 0.80 average anomaly score.
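Metrics like the event count and average anomaly score can be derived from the raw scores before they are logged. The sketch below is one possible way to do that; the metric names and the choice of summary statistics are assumptions, and the pipeline may rely on different ones.

```python
import statistics


def summarize_scores(scores: list) -> dict:
    """Reduce a list of anomaly scores to a flat dict of float metrics."""
    if not scores:
        raise ValueError("no scores to summarize")
    return {
        "events_scored": float(len(scores)),
        "anomaly_score_mean": statistics.fmean(scores),
        "anomaly_score_median": statistics.median(scores),
        "anomaly_score_min": min(scores),
        "anomaly_score_max": max(scores),
    }


def log_score_metrics(scores: list) -> dict:
    """Log the summary to the active MLflow run (one is started if none is active)."""
    import mlflow  # imported lazily; summarize_scores works without it

    metrics = summarize_scores(scores)
    mlflow.log_metrics(metrics)
    return metrics
```

Logging the minimum, median, and maximum alongside the mean gives a coarse view of the score distribution directly in the MLflow UI, without opening the artifacts.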

Step 4: Output to Delta Gold Table

Scored records are written to `gold_events_scored` along with metadata such as the scoring timestamp, run ID, and prediction flag.

Gold Table Output Preview

Preview of enriched Delta table showing scored events with anomaly scores.
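The enrichment and write could be sketched as follows. The column names `anomaly_score`, `is_anomaly`, `scored_at`, and `mlflow_run_id`, the threshold value, and the convention that higher scores are more anomalous are all assumptions; only the table name `gold_events_scored` comes from the pipeline. The `spark` argument is an existing SparkSession with Delta Lake available.

```python
from datetime import datetime, timezone

GOLD_TABLE = "gold_events_scored"


def enrich_scored(records, run_id, threshold, scored_at=None):
    """Attach the scoring timestamp, run ID, and anomaly flag to each record."""
    scored_at = scored_at or datetime.now(timezone.utc)
    return [
        {
            **record,
            # Assumed convention: a score at or above the threshold is an anomaly.
            "is_anomaly": record["anomaly_score"] >= threshold,
            "scored_at": scored_at,
            "mlflow_run_id": run_id,
        }
        for record in records
    ]


def write_gold(spark, rows, table=GOLD_TABLE):
    """Append the enriched rows to the Gold Delta table."""
    df = spark.createDataFrame(rows)  # schema is inferred from the dicts
    df.write.format("delta").mode("append").saveAsTable(table)
```

Appending rather than overwriting keeps earlier inference runs in the table, so rows from different runs can be told apart by their run ID.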

Step 5: Anomaly Score Distribution

Visualizations highlight the distribution of prediction scores and flagged anomalies.

Anomaly Evaluation Visualization

Confusion matrix and KDE plot used to define the decision threshold for scoring.
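One way a confusion matrix can inform the decision threshold is to compute it at several candidate thresholds and keep the one with the best F1 score. The sketch below illustrates that idea only. It assumes labeled events are available, that higher scores mean more anomalous, and that F1 is the selection criterion; none of these are confirmed by the pipeline.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP/FP/TN/FN when events scoring at or above threshold are flagged."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, is_anomaly in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_anomaly:
            counts["tp"] += 1
        elif flagged:
            counts["fp"] += 1
        elif is_anomaly:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def f1_score(counts):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when nothing is positive."""
    denominator = 2 * counts["tp"] + counts["fp"] + counts["fn"]
    return 2 * counts["tp"] / denominator if denominator else 0.0


def pick_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest F1 score."""
    return max(candidates, key=lambda t: f1_score(confusion_counts(scores, labels, t)))
```

The KDE plot plays a complementary role: it shows where the score densities of normal and anomalous events separate, which helps in choosing a sensible range of candidates to evaluate.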

Step 6: 📊 Summary of Inference Run

Step 7: Delta Lake Audit History

Running `DESCRIBE HISTORY` on the Gold table makes each ML run auditable through versioned Delta Lake metadata.

Delta Gold History Screenshot

Delta Lake history logs every inference write, including schema version and run ID.
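A small sketch of how the audit query might be issued and filtered from Python is shown below. It assumes the run ID was attached to each commit through Delta's `userMetadata` write option (for example `.option("userMetadata", run_id)` on the writer); how the pipeline actually records the run ID is not shown on this page. The helper names are hypothetical, and `spark` is an existing SparkSession.

```python
import re
from typing import Optional


def history_query(table: str, limit: Optional[int] = None) -> str:
    """Build a DESCRIBE HISTORY statement for a plain table identifier."""
    if not re.fullmatch(r"[A-Za-z_][\w.]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    query = f"DESCRIBE HISTORY {table}"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


def fetch_history(spark, table: str, limit: Optional[int] = None) -> list:
    """Run the query and return one dict per table version."""
    return [row.asDict() for row in spark.sql(history_query(table, limit)).collect()]


def writes_for_run(history_rows: list, run_id: str) -> list:
    """Keep the commits whose userMetadata matches the given run ID (assumed convention)."""
    return [row for row in history_rows if row.get("userMetadata") == run_id]
```

For example, `writes_for_run(fetch_history(spark, "gold_events_scored"), run_id)` would list the table versions produced by a single inference run, which can then be read back with Delta time travel.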