
Tool 13 · Algorithm Deep Dive

Confidence & Explainability

Platt Scaling + Isotonic Regression + SHAP + Integrated Gradients

ECE: < 0.03
Issues Caught: 97%
Human Threshold: 0.75
Tools Covered: 13/13

🎯 Why This Layer

📋 Problem Statement

An AI that says "91% confident" when it's right 70% of the time is worse than one that says nothing. Modern neural networks are notoriously overconfident. In SAP consulting, a wrong "high confidence" compliance classification can trigger SOX deficiencies and financial restatements.

✅ Solution

The calibration layer sits downstream of every scoring model. Platt Scaling (for classifiers) and Isotonic Regression (for regressors) align predicted confidence with actual accuracy. Predictions below the 0.75 confidence threshold are routed to human experts; SHAP and Integrated Gradients provide explanations.

🧩 What It Comprises

📏 Platt Scaling

Logistic regression on model outputs for classifiers. Maps raw logits to calibrated probabilities. Refit weekly on validation data.
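As a sketch of how the Platt parameters A and B might be fitted, here is plain gradient descent on the logistic negative log-likelihood over raw logits. In practice a library routine (e.g. scikit-learn's `LogisticRegression` on the one-dimensional logit) would do this; the loop below is only illustrative.

```python
import math

def fit_platt(logits, labels, lr=0.01, epochs=2000):
    """Fit Platt scaling parameters A, B by gradient descent on the
    negative log-likelihood of P = 1 / (1 + exp(A*z + B))."""
    A, B = -1.0, 0.0  # A is typically negative: higher logit -> higher P
    n = len(logits)
    for _ in range(epochs):
        gA = gB = 0.0
        for z, y in zip(logits, labels):
            p = 1.0 / (1.0 + math.exp(A * z + B))  # calibrated probability
            # d(NLL)/dA and d(NLL)/dB, since p = sigmoid(-(A*z + B))
            gA += (p - y) * (-z)
            gB += (p - y) * (-1.0)
        A -= lr * gA / n
        B -= lr * gB / n
    return A, B

def platt_prob(z, A, B):
    """Calibrated probability for a raw logit z."""
    return 1.0 / (1.0 + math.exp(A * z + B))
```

Fitting is done on a held-out validation fold, never on training data, so the calibration reflects true out-of-sample accuracy.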

📈 Isotonic Regression

Non-parametric calibration for regressors. Fits monotonic function minimizing MSE on held-out predictions.
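The monotonic least-squares fit can be computed with the Pool Adjacent Violators algorithm (PAVA), which is what standard isotonic-regression implementations use under the hood. A minimal pure-Python sketch:

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit to the sequence y via the
    Pool Adjacent Violators algorithm (PAVA)."""
    blocks = []  # each block: [sum, count]; block mean is the fitted level
    for v in y:
        blocks.append([float(v), 1])
        # merge backwards while the monotonicity constraint is violated
        while len(blocks) > 1 and \
                blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted
```

For example, `isotonic_fit([1, 3, 2, 4])` pools the violating pair (3, 2) into its mean 2.5, giving `[1.0, 2.5, 2.5, 4.0]`.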

🔮 SHAP (TreeSHAP)

Game-theoretic explanations for LightGBM/XGBoost models. Exact Shapley values for feature attribution.
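To make the Shapley weighting concrete, here is a brute-force enumeration matching the formula in the Mathematical Explanation section. `value_fn` is a hypothetical stand-in for the model evaluated on a feature subset; this is exponential in the number of features, which is exactly why TreeSHAP's polynomial-time exact algorithm matters for real tree ensembles.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every coalition S of N\{i}.
    value_fn(frozenset_of_features) -> model output on that subset.
    Toy sizes only; TreeSHAP gives the same result efficiently for trees."""
    n = len(features)
    phis = {}
    for i in features:
        rest = [f for f in features if f != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(rest, k):
                # weight |S|! (|N| - |S| - 1)! / |N|!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi += w * (value_fn(frozenset(S) | {i}) - value_fn(frozenset(S)))
        phis[i] = phi
    return phis
```

A quick sanity check: for a purely additive value function, each feature's Shapley value equals its own contribution, and the values sum to f(N) - f(∅) (the efficiency axiom).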

🧬 Integrated Gradients

Path-integral explanations for neural models (DeBERTa, LayoutLMv3). Axiomatically sound attributions.
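The path integral is approximated numerically in practice. A minimal sketch using a midpoint Riemann sum, where `grad_fn` is a hypothetical callable returning the gradient of the model F at a point (real implementations, e.g. Captum's, batch this over the interpolation steps):

```python
def integrated_gradients(x, baseline, grad_fn, steps=200):
    """Midpoint Riemann-sum approximation of
    IG_i = (x_i - x'_i) * integral_0^1 dF(x' + a(x - x'))/dx_i da."""
    n = len(x)
    avg_grad = [0.0] * n
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint of each sub-interval
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        g = grad_fn(point)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    return [(x[i] - baseline[i]) * avg_grad[i] for i in range(n)]
```

The completeness axiom is a useful self-test: the attributions should sum to F(x) - F(x'), which holds exactly for smooth F as steps grows.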

👤 Human-in-the-Loop Routing

⚠️ Confidence < 0.75 → Route to Human Expert

Confidence ≥ 0.75 → Auto-approve with audit log

Human review catches 97% of remaining issues.
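The routing rule itself is a one-line comparison; a minimal sketch (the result field names are illustrative, not the platform's actual schema):

```python
AUTO_APPROVE_THRESHOLD = 0.75

def route(prediction, calibrated_confidence):
    """Route a calibrated prediction: auto-approve at or above the
    threshold, otherwise queue for human review."""
    if calibrated_confidence >= AUTO_APPROVE_THRESHOLD:
        return {"decision": "auto_approve",
                "prediction": prediction,
                "confidence": calibrated_confidence}
    return {"decision": "human_review",
            "prediction": prediction,
            "confidence": calibrated_confidence}
```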

🔄 How It Runs — Step by Step

┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                    CONFIDENCE & EXPLAINABILITY LAYER PIPELINE                              │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                           │
│   ┌─────────────────────────────────────────────────────────────────────────────────┐   │
│   │                              INPUT: Raw Model Output                               │   │
│   │                                                                                   │   │
│   │   From any A²AI tool:                                                             │   │
│   │   • Classifier: Raw logits / probabilities (Tools 02, 03, 07, 08, 11)             │   │
│   │   • Regressor: Raw scores (Tools 04, 05, 09)                                       │   │
│   └─────────────────────────────────────────────────────────────────────────────────┘   │
│          │                                                                                 │
│          ▼                                                                                 │
│   ┌─────────────────────────────────────────────────────────────────────────────────┐   │
│   │                    STEP 1: CALIBRATION                                             │   │
│   │                                                                                   │   │
│   │   ┌─────────────────────────────────┐  ┌─────────────────────────────────────┐   │   │
│   │   │   CLASSIFIERS                   │  │   REGRESSORS                          │   │   │
│   │   │                                 │  │                                       │   │   │
│   │   │   Platt Scaling:                │  │   Isotonic Regression:                │   │   │
│   │   │   P_cal = 1/(1+e^(A·logit+B))   │  │   min Σ(y_i - ŷ_i)²                   │   │   │
│   │   │                                 │  │   s.t. ŷ_1 ≤ ŷ_2 ≤ ... ≤ ŷ_n          │   │   │
│   │   │   Fitted on validation fold     │  │                                       │   │   │
│   │   │   Refit weekly                  │  │   Fitted on held-out predictions       │   │   │
│   │   └─────────────────────────────────┘  └─────────────────────────────────────┘   │   │
│   │                                                                                   │   │
│   │   Output: Calibrated Confidence ∈ [0, 1]                                          │   │
│   └─────────────────────────────────────────────────────────────────────────────────┘   │
│          │                                                                                 │
│          ▼                                                                                 │
│   ┌─────────────────────────────────────────────────────────────────────────────────┐   │
│   │                    STEP 2: THRESHOLD ROUTING                                       │   │
│   │                                                                                   │   │
│   │   ┌─────────────────────────────────────────────────────────────────────────┐    │   │
│   │   │                                                                          │    │   │
│   │   │   IF Confidence ≥ 0.75:                                                  │    │   │
│   │   │       → AUTO-APPROVE                                                     │    │   │
│   │   │       → Log decision with calibrated confidence                           │    │   │
│   │   │       → Generate explanation (SHAP/IG)                                    │    │   │
│   │   │                                                                          │    │   │
│   │   │   ELSE:                                                                   │    │   │
│   │   │       → ROUTE TO HUMAN QUEUE                                              │    │   │
│   │   │       → Show AI prediction + confidence + explanation                      │    │   │
│   │   │       → Human reviews and confirms/corrects                                │    │   │
│   │   │       → Correction logged for model improvement                            │    │   │
│   │   │                                                                          │    │   │
│   │   └─────────────────────────────────────────────────────────────────────────┘    │   │
│   └─────────────────────────────────────────────────────────────────────────────────┘   │
│          │                                                                                 │
│          ▼                                                                                 │
│   ┌─────────────────────────────────────────────────────────────────────────────────┐   │
│   │                    STEP 3: EXPLANATION GENERATION                                   │   │
│   │                                                                                   │   │
│   │   Model-type specific explanation:                                                 │   │
│   │                                                                                   │   │
│   │   ┌──────────────────────────┬────────────────────────────────────────────────┐  │   │
│   │   │ Model Type               │ Explanation Method                               │  │   │
│   │   ├──────────────────────────┼────────────────────────────────────────────────┤  │   │
│   │   │ Tree-based (LGBM, XGB)   │ TreeSHAP — Exact Shapley values                  │  │   │
│   │   │ Neural (DeBERTa, Layout) │ Integrated Gradients — Path integral              │  │   │
│   │   │ Embedding (SBERT)        │ Nearest Neighbor citation                         │  │   │
│   │   │ Rules                    │ Rule trace + firing conditions                     │  │   │
│   │   └──────────────────────────┴────────────────────────────────────────────────┘  │   │
│   └─────────────────────────────────────────────────────────────────────────────────┘   │
│          │                                                                                 │
│          ▼                                                                                 │
│   ┌─────────────────────────────────────────────────────────────────────────────────┐   │
│   │   OUTPUT:                                                                         │   │
│   │   {                                                                               │   │
│   │     "prediction": "Enhancement (E)",                                              │   │
│   │     "raw_confidence": 0.87,                                                       │   │
│   │     "calibrated_confidence": 0.82,                                                │   │
│   │     "auto_approved": true,                                                        │   │
│   │     "explanation": {                                                              │   │
│   │       "method": "TreeSHAP",                                                       │   │
│   │       "top_features": [{"name": "Z_CDS_ prefix", "shap": 0.42}, ...]              │   │
│   │     }                                                                             │   │
│   │   }                                                                               │   │
│   └─────────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                           │
└─────────────────────────────────────────────────────────────────────────────────────────┘
                    

🏗️ Architecture & Integration

Cross-Cutting Layer — Wraps All 13 Tools

TOOL 01
TOOL 02
TOOL 03
TOOL 04
TOOL 05
🔒 TOOL 13
Confidence Layer
TOOL 06
TOOL 07
TOOL 08
TOOL 09
TOOL 10
TOOL 11
TOOL 12
Calibrated Output
Human Queue
Audit Log

Tool 13 ensures the entire platform remains trustworthy and auditable.

📊 Expected Calibration Error (ECE)

ECE measures the gap between predicted confidence and actual accuracy:

ECE = Σ_{m=1}^M (|B_m|/N) × |acc(B_m) - conf(B_m)|

Where predictions are binned into M = 10 equal-width confidence intervals. Our ECE < 0.03 means predicted confidence and observed accuracy differ by less than 3 percentage points on average.
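The binned ECE above translates directly into code. A minimal sketch assuming M = 10 equal-width bins over [0, 1]:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum_m (|B_m|/N) * |acc(B_m) - conf(B_m)| over equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # conf == 1.0 -> last bin
        bins[idx].append((c, ok))
    ece = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(acc - avg_conf)
    return ece
```

A perfectly calibrated bin (80% confidence, 80% accuracy) contributes zero; a bin of 90%-confidence predictions that is only 50% accurate contributes its full 0.4 gap, weighted by its share of predictions.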

📐 Mathematical Explanation

Platt Scaling (Classifier Calibration):

P(y=1 | f(x)) = 1 / (1 + e^{A·f(x) + B})

Where A, B are learned on validation set via logistic regression.

Isotonic Regression (Regressor Calibration):

min_{ŷ_1 ≤ ŷ_2 ≤ ... ≤ ŷ_n} Σ (y_i - ŷ_i)²

Subject to monotonicity constraint.

TreeSHAP (Exact for Tree Ensembles):

φ_i = Σ_{S⊆N\{i}} [|S|!(|N|-|S|-1)! / |N|!] × [f(S∪{i}) - f(S)]

Integrated Gradients (Neural Networks):

IG_i(x) = (x_i - x'_i) × ∫_{α=0}^1 ∂F(x' + α(x-x'))/∂x_i dα

Where x' is a baseline (e.g., zero embedding).

📊 Measured Performance

Metric                             Value    Benchmark
ECE (Expected Calibration Error)   < 0.03   Across all classifiers
Human Review Catch Rate            97%      Of remaining model errors
Auto-Approval Rate                 82%      Predictions above 0.75 threshold
SHAP Explanation Fidelity          0.94     Correlation with actual feature impact
Calibration Refresh                Weekly   Rolling 90-day validation window

📚 Training & Calibration Set

  • Calibration Data: Rolling 90-day window of predictions with ground truth outcomes
  • Platt Scaling: 3-fold cross-validation on validation folds
  • Isotonic Regression: Held-out predictions from last 30 days
  • Threshold Tuning: 0.75 selected to balance automation vs. accuracy
  • Human Feedback: All corrections logged and used for model improvement
  • Audit Trail: Every decision (auto or human) logged with timestamp and rationale
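The audit-trail bullet can be sketched as a structured log entry with a timestamp and rationale; every field name below is an illustrative assumption, not the platform's actual schema:

```python
import json
from datetime import datetime, timezone

def audit_record(tool, prediction, confidence, decision, rationale):
    """Serialize one audit-trail entry (auto or human decision)
    as a JSON line suitable for an append-only log."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "prediction": prediction,
        "calibrated_confidence": confidence,
        "decision": decision,
        "rationale": rationale,
    })
```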

🎬 End-to-End Example

Scenario: Low-Confidence Compliance Classification

  1. Tool 11 Output: Predicts "SOX 404" compliance required with raw confidence 0.72
  2. Platt Scaling: Calibrates to 0.68 (below threshold)
  3. Routing: Flagged for human review (confidence < 0.75)
  4. Human Review: Consultant reviews the evidence and confirms that SOX 404 does apply
  5. Logging: The confirmed decision is logged with full rationale; the example will be used to fine-tune the model

Result: Critical compliance requirement correctly identified; audit trail complete.
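The calibrate-then-route flow in this scenario can be sketched end to end. The Platt parameters A and B below are hypothetical placeholders (the production values come from the weekly refit), so the exact calibrated numbers will differ from the scenario's:

```python
import math

def calibrate(raw_p, A=-0.85, B=0.12):
    """Map a raw model probability through Platt scaling with
    hypothetical pre-fitted A, B: logit first, then 1/(1+e^(A*z+B))."""
    z = math.log(raw_p / (1.0 - raw_p))
    return 1.0 / (1.0 + math.exp(A * z + B))

def decide(raw_p, threshold=0.75):
    """Calibrate, then route by the 0.75 threshold."""
    conf = calibrate(raw_p)
    route = "auto_approve" if conf >= threshold else "human_review"
    return route, round(conf, 2)
```

With these placeholder parameters, a raw 0.72 calibrates below the threshold and lands in the human queue, mirroring the scenario above, while a decisively high raw score still auto-approves.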