Demystifying the Black Box: A Hands-On Guide to Explainable AI (XAI)


  • TL;DR
  • Introduction
  • The Core Engine: XGBoost on Heart Disease Data
  • Pillar 1: Permutation Importance (What does the model value?)
  • Pillar 2: Partial Dependence Plots (How do features affect predictions?)
  • Pillar 3: SHAP Values (Local & Global Explanations)
  • Closing Thoughts: Trust is the New Metric

    TL;DR

    🧠 Explainability is Trust – High-performance models (like XGBoost) are often “black boxes”. Explainable AI (XAI) peels back the layers to reveal how and why they make decisions.
    🚀 Streamlit in action – We build and explain a heart disease classifier, showing XAI in a clinical context through a live, interactive Streamlit application.
    🧩 Feature importance (ELI5) – Permutation Importance identifies which patient attributes (e.g. major vessels, cholesterol) carry the most predictive weight.
    📉 Partial Dependence (PDP) – Visualises the global relationship between specific inputs (like max heart rate) and heart disease risk.
    🔬 SHAP (Game Theory) – Provides both hyper-local explanations (why patient #1 is high-risk) and bird's-eye global insights using coalition game theory.

    Introduction

    We are living in an era where machine learning models make increasingly critical decisions. From approving credit cards to diagnosing life-threatening conditions, AI is driving automation at an unprecedented scale. But as models such as XGBoost, deep neural networks, and large language models become more powerful, they also become more complex. The underlying algorithms act like “black boxes”: systems where we feed in inputs and receive highly accurate predictions, but have little visibility into the logic under the hood.

    In low-stakes scenarios (like movie recommendations), this opacity is acceptable. In high-stakes fields like healthcare and self-driving cars, it is dangerous and can have serious real-world consequences. A doctor cannot simply trust an AI that says, “This patient has a 90% chance of heart disease,” without asking why.

    This is where Explainable AI (XAI) steps in. XAI is the collection of tools and frameworks that make machine learning models transparent, interpretable, and accountable.

    To see this in action, we built an interactive Heart Disease Explainable AI Streamlit App.

    Let’s dive into how we can use three core pillars of explainability to dissect an XGBoost classifier trained on Cleveland heart disease data.

    If algorithms meet explainability requirements, they provide a basis for justifying decisions, tracking and thereby verifying them, improving the algorithms, and exploring new facts.

    The Core Engine: XGBoost on Heart Disease Data

    Our application uses the popular Kaggle Heart Disease dataset, which contains 14 patient attributes, including age, sex, chest pain type, max heart rate, and the number of major vessels.

    We trained an XGBoost Classifier on this dataset. XGBoost is a decision-tree-based ensemble algorithm known for high accuracy but notorious for its non-linear, complex decision boundaries.

    Once trained, the model achieves strong classification metrics on validation data. But to transition this model from a research experiment into a trustworthy clinical tool, we must explain its predictions. We do this using three distinct techniques.

    Pillar 1: Permutation Importance (What does the model value?)

    Before we look at individual patients, we want to know: what features does the model care about most across the entire dataset?

    One of the fastest and most intuitive ways to calculate this is Permutation Importance (rendered in our app via the eli5 library).

    How it works

  • We measure the model’s accuracy on a test dataset.
  • We select a single feature (e.g., cholestrol) and shuffle (permute) its values across the test rows, breaking the relationship between that feature and the target outcome.
  • We re-measure the model’s accuracy. If the accuracy drops significantly, it means the model relied heavily on that feature. If the accuracy barely changes, the feature is relatively unimportant.

    In code, using eli5:

        from eli5.sklearn import PermutationImportance

        # Fit Permutation Importance on test data
        perm = PermutationImportance(model, random_state=1).fit(X_test, y_test)
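
    The shuffle-and-rescore procedure described above can also be written out by hand. The following is a minimal sketch in plain Python, not the eli5 implementation: `toy_predict`, `X_toy`, and `y_toy` are hypothetical stand-ins for a trained classifier and a test set, chosen so that only the first column matters.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=1):
    """Return baseline accuracy and the mean accuracy drop per shuffled column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column only, leaving every other column untouched.
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(rows, shuffled)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical stand-in for a trained classifier: it only looks at column 0.
def toy_predict(row):
    return 1 if row[0] > 0.5 else 0


data_rng = random.Random(0)
X_toy = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y_toy = [toy_predict(row) for row in X_toy]

baseline_acc, toy_importances = permutation_importance(toy_predict, X_toy, y_toy)
print("baseline accuracy:", baseline_acc)
print("importance per column:", toy_importances)
```

    Shuffling the column the toy model ignores leaves accuracy unchanged, so its importance is zero, while shuffling the column it depends on produces a large drop. Averaging over several shuffles reduces the noise from any single permutation.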

    In our heart disease prediction model, the three most important features identified are number_of_major_vessels, cholestrol, and st_depression_rt_rest.

    This is consistent with clinical literature: the number of clear major blood vessels and high cholesterol levels are leading physiological indicators of cardiovascular health.

    Pillar 2: Partial Dependence Plots (How do features affect predictions?)

    While Permutation Importance tells us which features are important, it doesn’t tell us the direction of the relationship. Does a higher heart rate increase or decrease the risk of heart disease?

    To answer this, we use Partial Dependence Plots (PDP), implemented using scikit-learn’s PartialDependenceDisplay.

    How it works

    PDPs isolate a single feature and show its marginal effect on the predicted outcome. The algorithm takes the dataset, forces the target feature to a specific value for all rows (e.g., setting every patient’s max_heart_rate to 140), and calculates the average predicted probability. It repeats this across a grid of values.
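
    To make the "force one feature, average the predictions" idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions rather than scikit-learn's implementation: `toy_proba` is a hypothetical model whose risk falls as column 0 rises, and `X_toy` is a made-up four-row dataset.

```python
import math


def partial_dependence(predict_proba, rows, col, grid):
    """Average predicted probability with column `col` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen column for every row; keep the other columns as-is.
        forced = [row[:col] + [value] + row[col + 1:] for row in rows]
        curve.append(sum(predict_proba(r) for r in forced) / len(forced))
    return curve


# Hypothetical stand-in model: risk falls with column 0 and rises with column 1.
def toy_proba(row):
    score = -1.5 * row[0] + 0.8 * row[1]
    return 1.0 / (1.0 + math.exp(-score))


X_toy = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
grid = [0, 1, 2, 3]

pd_curve = partial_dependence(toy_proba, X_toy, col=0, grid=grid)
for value, avg in zip(grid, pd_curve):
    print(f"column 0 = {value}: average predicted probability = {avg:.3f}")
```

    Each point on the resulting curve is an average over the whole dataset, which is why a PDP describes the model's global behaviour and can hide differences between individual patients.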

    In our Streamlit app, we render these plots dynamically using Matplotlib:

        from sklearn.inspection import PartialDependenceDisplay
        import matplotlib.pyplot as plt

        fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
        pdp_iso = PartialDependenceDisplay.from_estimator(
            model,
            X_test,
            features=[feature_index],
            feature_names=list(features_list),
            ax=ax,
        )

    Clinical Insights from PDP:

    • max_heart_rate: As the patient’s maximum heart rate increases, the PDP line rises, indicating a clear increase in heart disease probability.
    • number_of_major_vessels: Conversely, as the number of clear major vessels increases from 0 to 3, the risk curve drops sharply. More clear vessels mean better blood flow, reducing cardiovascular risk.

    Pillar 3: SHAP Values (Local & Global Explanations)

    If a doctor is sitting with a specific patient, general dataset-level statistics aren’t enough. They need to know: why did the AI predict that this specific patient has a high risk of heart disease?

    For local, patient-level explanations, we use SHAP (SHapley Additive exPlanations).

    Based on coalition game theory, SHAP treats each feature value of a patient as a “player” in a game, where the model’s prediction is the payout. It calculates the fair contribution of each feature to the difference between the actual prediction and the average baseline prediction.

        import shap

        explainer = shap.TreeExplainer(model)
        shap_values = explainer.shap_values(patient_data)
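
    The "fair contribution" idea can be illustrated with a brute-force Shapley computation on a tiny example. This is a sketch under stated assumptions, not how TreeExplainer works internally (SHAP uses a much faster tree-specific algorithm, whereas this version enumerates every coalition and is only feasible for a handful of features). `toy_model`, `background`, and `patient` are hypothetical, and features outside a coalition are filled in from the background rows.

```python
from itertools import combinations
from math import factorial


def coalition_value(model, x, background, coalition):
    """Average model output when features in `coalition` take the patient's
    values and all other features keep each background row's values."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in coalition else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley value of each feature, by enumerating all coalitions."""
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                with_i = coalition_value(model, x, background, s | {i})
                without_i = coalition_value(model, x, background, s)
                phi += weight * (with_i - without_i)
        phis.append(phi)
    return phis


# Hypothetical model with an interaction between features 0 and 2.
def toy_model(row):
    return 0.2 * row[0] + 0.5 * row[1] + 0.3 * row[0] * row[2]


background = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
patient = [1, 1, 1]

baseline = coalition_value(toy_model, patient, background, set())
prediction = toy_model(patient)
phis = shapley_values(toy_model, patient, background)

print("baseline:", baseline)
print("prediction:", prediction)
print("contributions:", phis)
print("baseline + sum of contributions:", baseline + sum(phis))
```

    The contributions add up to the gap between the patient's prediction and the baseline. That additivity is what lets a force plot stack the red and blue segments from the baseline to the final score.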

    Local Interpretability (Individual Patients)

    In our app, you can select individual patients and see their Force Plots:

    • Features that push the risk higher are shown in red/pink.
    • Features that pull the risk lower are shown in blue.

    For example, for a high-risk patient, SHAP highlights that being male (sex_male = 1) and having a thalassemia defect (thalassemia_reversible_defect = 1) are the dominant drivers pushing their score above the baseline.

    Global Interpretability (Summary Plot)

    SHAP also provides a Summary Plot that combines local explanations across all patients. Each point on the plot represents a single patient:

    • Y-axis: Features ordered by overall impact.
    • X-axis: SHAP value (impact on model output).
    • Color: High feature values (red) vs. low feature values (blue).

    This single chart shows at a glance that high values of thalassemia defects (red dots on the right side) push predictions higher, while having more major vessels (red dots on the left side) pulls predictions down.
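
    One common way to turn per-patient SHAP values into the "overall impact" ordering on the Y-axis is to average the absolute SHAP value of each feature across patients. The sketch below assumes that ordering rule and uses made-up numbers. `shap_matrix` and the feature names are hypothetical and are not results from the app.

```python
def rank_features_by_mean_abs(shap_matrix, feature_names):
    """Sort features by mean absolute SHAP value, largest first."""
    n_rows = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per patient, one column per feature.
feature_names = ["feature_a", "feature_b", "feature_c"]
shap_matrix = [
    [0.30, -0.10, 0.02],
    [-0.25, 0.05, -0.01],
    [0.40, -0.20, 0.03],
]

ranking = rank_features_by_mean_abs(shap_matrix, feature_names)
for name, score in ranking:
    print(f"{name}: mean |SHAP| = {score:.3f}")
```

    Taking the absolute value matters: a feature that pushes some patients up and others down would otherwise average out to near zero and look unimportant.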

    Closing Thoughts: Trust is the New Metric

