Addressing Shortcomings of Explainable Machine Learning Methods
Alkhatib, Amr (KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS). ORCID iD: 0000-0003-2745-6414
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Recently, machine learning algorithms have achieved state-of-the-art performance in real-life applications across various domains, but such algorithms tend to produce non-interpretable models. However, users often require an understanding of the reasoning behind predictions to trust the models and use them in decision-making. Therefore, explainable machine learning has gained attention as a way to achieve transparency while keeping the performance of state-of-the-art algorithms. Various methods have been proposed as a post-hoc remedy to explain black-box models. However, such techniques are constrained in their ability to provide a comprehensive and faithful insight into the prediction process. For instance, many explanation methods based on additive importance scores generate explanations without any assurance that the explanation reflects the model's reasoning. Rule-based methods, in turn, can produce excessively specific explanations that occasionally exhibit poor fidelity, i.e., they lack faithfulness to the underlying black-box model. Furthermore, explanation methods are generally computationally expensive, making their application unrealistic in many real-world situations.

We aim to tackle several key limitations of explainable machine learning methods, with a focus on (i) low fidelity, (ii) the absence of validity guarantees, i.e., explaining without a pre-specified error rate, and (iii) high computational cost. Firstly, we propose a method that summarizes local explanations into a concise set of characteristic rules that can be evaluated with respect to their fidelity. We also investigate using Venn prediction to quantify the uncertainty of rule-based explanations. In addition, we propose to estimate the accuracy of approximate explanations and to establish error bounds for the accuracy estimates using the conformal prediction framework. Secondly, we propose a method to approximate any score-based explanation technique using computationally efficient regression models and to produce error bounds around the approximated importance scores using conformal regression. Moreover, we propose a novel method to approximate Shapley value explanations in real time, achieving high similarity to the ground truth while using a limited amount of data. Thirdly, we propose a method that constrains graph neural networks to generate inherently interpretable models, hence saving the time and resources required for post-hoc explanations while maintaining high fidelity. We also extend the graph neural network approach to process heterogeneous tabular data. Finally, we present a method that learns a function to compute Shapley values, from which the predictions are obtained directly by summation; that is, the method computes the Shapley values before the prediction is formed.

Empirical investigations of the proposed methods suggest that the fidelity of approximated explanations can vary based on the black-box predictor, dataset, and explanation method. The conformal prediction framework can be reliable in controlling the error level when timely explanations are required. Furthermore, constraining graph neural networks to produce inherently explainable models does not necessarily compromise predictive performance and can reduce the time and resources needed for post-hoc explanations.
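
Since Shapley values recur throughout the summary above, the standard definition and the efficiency property exploited by the final method are recalled below (textbook formulas, not material specific to the thesis).

```latex
% Shapley value of feature i for the prediction f(x), with feature set N:
\phi_i(x) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,
  \bigl( v_x(S \cup \{i\}) - v_x(S) \bigr),
\qquad v_x(S) \approx \mathbb{E}\bigl[f(x) \mid x_S\bigr]

% Efficiency: the values sum to the prediction minus the expected output,
% which is what allows a prediction to be read off directly from an explanation:
\sum_{i \in N} \phi_i(x) \;=\; f(x) - \mathbb{E}[f]
```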

Abstract [sv]

Recently, machine learning algorithms have achieved top performance in applications across various domains, but such algorithms tend to generate models that are difficult to interpret. However, users often require an understanding of the reasoning behind a model's predictions in order to trust them and use them for decision-making. Explainable machine learning has therefore gained attention as a way to achieve transparency while retaining the performance of the algorithms. Various methods have been proposed for explaining so-called "black-box" models after the fact. However, these techniques are limited in their ability to provide a thorough and accurate insight into how the predictions are made. For example, many explanation methods based on additive importance generate explanations without ensuring that they reflect the model's actual reasoning. Other, rule-based, explanations can produce overly specific explanations that sometimes exhibit low fidelity, that is, they are not faithful to the underlying "black-box" model. Moreover, explanation methods are generally computationally expensive, which makes their application unrealistic in many real-world situations.

We aim to address several key limitations of explainable machine learning methods, focusing on (i) low fidelity, (ii) the absence of validity guarantees, that is, explanations given without a pre-specified error margin, and (iii) high computational cost. First, we propose a method that summarizes local explanations into a concise set of characteristic rules that can be evaluated with respect to their fidelity. We also investigate the use of Venn prediction to quantify the uncertainty of rule-based explanations. In addition, we propose estimating the accuracy of the approximate explanations and establishing error bounds for these estimates using the conformal prediction framework.

Second, we propose a method for approximating score-based explanation techniques using computationally efficient regression models, and for generating error bounds around the approximated values using conformal regression. Furthermore, we present a new method for approximating Shapley values in real time, achieving high similarity to the true values while using a limited amount of data. Third, we propose a method that constrains graph neural networks to generate interpretable models, which saves the time and resources otherwise required to generate post-hoc explanations while maintaining high fidelity. We also extend the application of graph neural networks to handle heterogeneous tabular data. Finally, we present a method that learns a function for computing Shapley values, from which the predictions are obtained directly by summation, which means that the method can compute the Shapley values in advance.

Empirical investigations of the proposed methods suggest that the fidelity of approximate explanations can vary depending on the "black-box" model, the dataset, and the explanation method. The conformal prediction framework provides reliable control of the error margin when explanations are required within a tight time frame. Constraining graph neural networks to generate explainable models does not necessarily degrade predictive performance, and can moreover reduce the time and resources required to generate post-hoc explanations.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025, p. xii, 73
Series
TRITA-EECS-AVL ; 2025:11
National Category
Computer Sciences
Research subject
Information and Communication Technology
Identifiers
URN: urn:nbn:se:kth:diva-358366
ISBN: 978-91-8106-107-9 (print)
OAI: oai:DiVA.org:kth-358366
DiVA, id: diva2:1927808
Public defence
2025-02-13, https://kth-se.zoom.us/j/66054420196, Ka-Sal B (Peter Weisglass), Kistagången 16, Electrum, KTH Kista, Stockholm, 13:00 (English)
Opponent
Supervisors
Note

QC 20250116

Available from: 2025-01-16 Created: 2025-01-15 Last updated: 2025-01-17. Bibliographically approved.
List of papers
1. Explaining Predictions by Characteristic Rules
2023 (English). In: Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2022, Part I / [ed] Amini, M.R., Canu, S., Fischer, A., Guns, T., Novak, P.K., Tsoumakas, G., Springer Nature, 2023, Vol. 13713, p. 389-403. Conference paper, Published paper (Refereed)
Abstract [en]

Characteristic rules have been advocated for their ability to improve interpretability over discriminative rules within the area of rule learning. However, the former type of rule has not yet been used by techniques for explaining predictions. A novel explanation technique, called CEGA (Characteristic Explanatory General Association rules), is proposed, which employs association rule mining to aggregate multiple explanations generated by any standard local explanation technique into a set of characteristic rules. An empirical investigation is presented, in which CEGA is compared to two state-of-the-art methods, Anchors and GLocalX, for producing local and aggregated explanations in the form of discriminative rules. The results suggest that the proposed approach provides a better trade-off between fidelity and complexity compared to the two state-of-the-art approaches; CEGA and Anchors significantly outperform GLocalX with respect to fidelity, while CEGA and GLocalX significantly outperform Anchors with respect to the number of generated rules. The effects of changing the format of the explanations of CEGA to discriminative rules and of using LIME and SHAP as local explanation techniques instead of Anchors are also investigated. The results show that the characteristic explanatory rules still compete favorably with rules in the standard discriminative format. The results also indicate that using CEGA in combination with either SHAP or Anchors consistently leads to higher fidelity compared to using LIME as the local explanation technique.
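
As a rough illustration of the aggregation step (a sketch under assumed data structures, not the CEGA implementation), local explanations can be treated as transactions of feature conditions labeled with the predicted class, and characteristic rules of the form class => conditions kept when their support and confidence exceed chosen thresholds:

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: one transaction per explained instance, containing the
# predicted class plus the feature conditions highlighted by a local explainer.
transactions = [
    {"class=approved", "income>50k", "debt<10k"},
    {"class=approved", "income>50k", "age>30"},
    {"class=rejected", "debt>30k", "income<20k"},
    {"class=approved", "income>50k", "debt<10k", "age>30"},
]

min_support, min_confidence = 0.25, 0.6
n = len(transactions)

def characteristic_rules(label):
    """Mine rules label => {conditions}; confidence is P(conditions | label)."""
    with_label = [t - {label} for t in transactions if label in t]
    counts = Counter()
    for conds in with_label:
        for r in range(1, len(conds) + 1):
            for subset in combinations(sorted(conds), r):
                counts[subset] += 1
    rules = []
    for subset, c in counts.items():
        support, confidence = c / n, c / len(with_label)
        if support >= min_support and confidence >= min_confidence:
            rules.append((label, subset, support, confidence))
    return rules

for rule in characteristic_rules("class=approved"):
    print(rule)
```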

Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Explainable machine learning, Rule mining
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-329374 (URN)
10.1007/978-3-031-26387-3_24 (DOI)
000999035400024 ()
2-s2.0-85151060120 (Scopus ID)
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), SEP 19-23, 2022, Grenoble, FRANCE
Note

QC 20230620

Available from: 2023-06-20 Created: 2023-06-20 Last updated: 2025-01-15. Bibliographically approved.
2. Assessing Explanation Quality by Venn Prediction
2022 (English). In: Proceedings of the 11th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2022, ML Research Press, 2022, p. 42-54. Conference paper, Published paper (Refereed)
Abstract [en]

Rules output by explainable machine learning techniques naturally come with a degree of uncertainty, as the complex functionality of the underlying black-box model often can be difficult to approximate by a single, interpretable rule. However, the uncertainty of these approximations is not properly quantified by current explanatory techniques. The use of Venn prediction is here proposed and investigated as a means to quantify the uncertainty of the explanations and thereby also allow for competing explanation techniques to be evaluated with respect to their relative uncertainty. A number of metrics of rule explanation quality based on uncertainty are proposed and discussed, including metrics that capture the tendency of the explanations to predict the correct outcome of a black-box model on new instances, how informative (tight) the produced intervals are, and how certain a rule is when predicting one class. An empirical investigation is presented, in which explanations produced by the state-of-the-art technique Anchors are compared to explanatory rules obtained from association rule mining. The results suggest that the association rule mining approach may provide explanations with less uncertainty towards the correct label, as predicted by the black-box model, compared to Anchors. The results also show that the explanatory rules obtained through association rule mining result in tighter intervals and are closer to either one or zero compared to Anchors, i.e., they are more certain towards a specific class label.
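
A minimal sketch of the underlying idea (illustrative only; the taxonomy, data, and numbers are assumptions, not the paper's setup): a Venn predictor places instances into categories, here according to whether the explanatory rule covers them, and reports the category's label frequencies under each hypothetical label, yielding a probability interval rather than a point estimate.

```python
# Hypothetical calibration data: (rule_fires, black_box_label) pairs.
calibration = [(True, 1), (True, 1), (True, 0), (True, 1),
               (False, 0), (False, 0), (False, 1), (False, 0)]

def venn_interval(rule_fires_on_test):
    """Return (lower, upper) probability that the black-box label is 1."""
    probs = []
    for hypothetical_label in (0, 1):
        # Tentatively add the test instance with the hypothetical label,
        # then look at the label frequencies within its category.
        augmented = calibration + [(rule_fires_on_test, hypothetical_label)]
        category = [y for fires, y in augmented if fires == rule_fires_on_test]
        probs.append(sum(category) / len(category))
    return min(probs), max(probs)

lower, upper = venn_interval(True)
print(f"P(label=1 | rule fires) in [{lower:.2f}, {upper:.2f}]")  # here [0.60, 0.80]
```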

Place, publisher, year, edition, pages
ML Research Press, 2022
Keywords
Explainable machine learning, Rule mining, Venn prediction
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-334442 (URN)
2-s2.0-85164705716 (Scopus ID)
Conference
11th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2022, Brighton, United Kingdom of Great Britain and Northern Ireland, Aug 24 2022 - Aug 26 2022
Note

QC 20230821

Available from: 2023-08-21 Created: 2023-08-21 Last updated: 2025-01-15. Bibliographically approved.
3. Approximating Score-based Explanation Techniques Using Conformal Regression
2023 (English). In: Proceedings of the 12th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2023, ML Research Press, 2023, p. 450-469. Conference paper, Published paper (Refereed)
Abstract [en]

Score-based explainable machine-learning techniques are often used to understand the logic behind black-box models. However, such explanation techniques are often computationally expensive, which limits their application in time-critical contexts. Therefore, we propose and investigate the use of computationally less costly regression models for approximating the output of score-based explanation techniques, such as SHAP. Moreover, validity guarantees for the approximated values are provided by the employed inductive conformal prediction framework. We propose several non-conformity measures designed to take the difficulty of approximating the explanations into account while keeping the computational cost low. We present results from a large-scale empirical investigation, in which the approximate explanations generated by our proposed models are evaluated with respect to efficiency (interval size). The results indicate that the proposed method can significantly improve execution time compared to the fast version of SHAP, TreeSHAP. The results also suggest that the proposed method can produce tight intervals while providing validity guarantees. Moreover, the proposed approach allows for comparing explanations of different approximation methods and selecting a method based on how informative (tight) the predicted intervals are.
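
The following is a minimal sketch of the inductive conformal regression idea described above, with a simple distance-based difficulty estimate; the data, the choice of regressor, and the single pooled quantile are illustrative assumptions (the paper uses several non-conformity measures and a proper finite-sample correction).

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical data: X holds instances, S holds their (precomputed, expensive)
# SHAP values; the goal is a cheap regressor with calibrated error bounds.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
S = X * rng.uniform(0.5, 1.5, size=5)  # stand-in "ground truth" importance scores

X_train, X_cal, X_test = X[:600], X[600:900], X[900:]
S_train, S_cal = S[:600], S[600:900]

approximator = KNeighborsRegressor(n_neighbors=10).fit(X_train, S_train)

# Simple difficulty estimate: instances far from their training neighbours
# are assumed harder to approximate, so they get wider intervals.
def difficulty(X_q):
    dist, _ = approximator.kneighbors(X_q)
    return dist.mean(axis=1, keepdims=True) + 1e-8

alpha = 0.1  # target error rate
nonconformity = np.abs(S_cal - approximator.predict(X_cal)) / difficulty(X_cal)
q = np.quantile(nonconformity, 1 - alpha)  # pooled over all importance scores

pred = approximator.predict(X_test)
half_width = q * difficulty(X_test)
lower, upper = pred - half_width, pred + half_width
print("first test explanation, per-feature intervals:\n", np.c_[lower[0], upper[0]])
```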

Place, publisher, year, edition, pages
ML Research Press, 2023
Keywords
Explainable machine learning, Inductive conformal prediction, Multi-target regression
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-340791 (URN)
001221733900031 ()
2-s2.0-85178664754 (Scopus ID)
Conference
12th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2023, Limassol, Cyprus, Sep 13 2023 - Sep 15 2023
Note

QC 20231215

Available from: 2023-12-15 Created: 2023-12-15 Last updated: 2025-01-15. Bibliographically approved.
4. Estimating Quality of Approximated Shapley Values Using Conformal Prediction
2024 (English). In: Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 230:158-174 / [ed] Vantini, Simone and Fontana, Matteo and Solari, Aldo and Boström, Henrik and Carlsson, Lars, 2024, Vol. 230, p. 158-174. Conference paper, Published paper (Refereed)
Abstract [en]

Thanks to their theoretically proven properties, Shapley values have received a lot of attention as a means to explain predictions within the area of explainable machine learning. However, the computation of Shapley values is time-consuming and computationally expensive, in particular for datasets with high dimensionality, often rendering them impractical for generating timely explanations. Methods to approximate Shapley values, e.g., FastSHAP, offer a solution with adequate computational cost. However, such approximations come with a degree of uncertainty. Therefore, we propose a method to measure the fidelity of Shapley value approximations and use the conformal prediction framework to provide validity guarantees for the whole explanation, in contrast to an earlier approach that offered validity guarantees on a per-feature importance basis, disregarding the relative importance of the remaining feature scores within the same explanation. We propose a set of difficulty estimation functions devised to take the difficulty of approximating an explanation into account. We provide a large-scale empirical investigation in which the proposed difficulty estimators are evaluated with respect to their efficiency (interval size) in measuring the similarity to the ground-truth Shapley values. The results suggest that the proposed approach can provide predictions coupled with informative validity guarantees (tight intervals), allowing the user to trust or reject the provided explanations based on their similarity to the ground-truth values.
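
A compact sketch of how a per-explanation validity guarantee might look in practice (illustrative assumptions throughout; the finite-sample correction and the paper's difficulty estimators are omitted): calibrate the residuals of a predicted similarity score and report a lower bound on the similarity to the ground truth.

```python
import numpy as np

# Hypothetical calibration quantities:
#   sim_hat  - predicted similarity between an approximated explanation and the
#              ground-truth Shapley values (from some auxiliary model)
#   sim_true - the actual similarity, available only for calibration data
rng = np.random.default_rng(1)
sim_true = rng.uniform(0.7, 1.0, size=300)
sim_hat = np.clip(sim_true + rng.normal(0, 0.03, 300), -1, 1)

alpha = 0.05
q = np.quantile(sim_true - sim_hat, alpha, method="lower")  # one-sided residual quantile

# For a new explanation: a calibrated lower bound on its similarity to the
# ground truth, which the user can use to accept or reject the explanation.
new_sim_hat = 0.9
print(f"similarity >= {new_sim_hat + q:.3f} with ~{1 - alpha:.0%} confidence")
```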

Series
Proceedings of Machine Learning Research
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358342 (URN)
Conference
The Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications
Note

QC 20250117

Available from: 2025-01-15 Created: 2025-01-15 Last updated: 2025-01-17. Bibliographically approved.
5. Fast Approximation of Shapley Values with Limited Data
2024 (English). In: The Proceedings of the 14th Scandinavian Conference on Artificial Intelligence SCAI 2024, 2024, p. 95-100. Conference paper, Published paper (Refereed)
Abstract [en]

Shapley values have multiple desirable and theoretically proven properties for explaining black-box model predictions. However, the exact computation of Shapley values can be computationally very expensive, precluding their use when timely explanations are required. FastSHAP is an approach for fast approximation of Shapley values using a trained neural network (the explainer). A novel approach, called FF-SHAP, is proposed, which incorporates three modifications to FastSHAP: i) the explainer is trained on ground-truth explanations rather than a weighted least squares characterization of the Shapley values, ii) cosine similarity is used as a loss function instead of mean squared error, and iii) the actual prediction of the underlying model is given as input to the explainer. An empirical investigation is presented showing that FF-SHAP significantly outperforms FastSHAP with respect to fidelity, measured using both cosine similarity and Spearman's rank-order correlation. The investigation further shows that FF-SHAP outperforms FastSHAP even when using substantially smaller amounts of data to train the explainer, and, more importantly, FF-SHAP maintains the performance level of FastSHAP even when trained with as little as 15% of the training data.
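
A minimal sketch of the three modifications, using placeholder data and an arbitrary architecture rather than the FF-SHAP implementation: the explainer receives the black-box prediction as an extra input and is fitted to precomputed ground-truth Shapley values with a cosine similarity loss.

```python
import torch
import torch.nn as nn

# Hypothetical shapes/data: 10 features, precomputed ground-truth Shapley values.
n_features = 10
X = torch.randn(256, n_features)
f_x = torch.sigmoid(X.sum(dim=1, keepdim=True))   # stand-in black-box output
true_shap = torch.randn(256, n_features)          # stand-in ground-truth explanations

# The explainer gets the instance plus the black-box prediction as input (item iii).
explainer = nn.Sequential(
    nn.Linear(n_features + 1, 64), nn.ReLU(), nn.Linear(64, n_features)
)
optimizer = torch.optim.Adam(explainer.parameters(), lr=1e-3)
cos = nn.CosineSimilarity(dim=1)

for _ in range(100):
    optimizer.zero_grad()
    approx = explainer(torch.cat([X, f_x], dim=1))   # trained on ground truth (item i)
    loss = (1 - cos(approx, true_shap)).mean()       # cosine loss (item ii)
    loss.backward()
    optimizer.step()
```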

Series
Linköping Electronic Conference Proceedings 208, ISSN 1650-3740
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358345 (URN)
10.3384/ecp208011 (DOI)
Conference
14th Scandinavian Conference on Artificial Intelligence SCAI 2024, June 10-11, 2024, Jönköping, Sweden
Note

Part of ISBN 978-91-8075-709-6

QC 20250117

Available from: 2025-01-15 Created: 2025-01-15 Last updated: 2025-01-17. Bibliographically approved.
6. Interpretable Graph Neural Networks for Tabular Data
2024 (English). In: ECAI 2024 - 27th European Conference on Artificial Intelligence, Including 13th Conference on Prestigious Applications of Intelligent Systems, PAIS 2024, Proceedings, IOS Press, 2024, p. 1848-1855. Conference paper, Published paper (Refereed)
Abstract [en]

Data in tabular format occurs frequently in real-world applications. Graph Neural Networks (GNNs) have recently been extended to effectively handle such data, allowing feature interactions to be captured through representation learning. However, these approaches essentially produce black-box models, in the form of deep neural networks, precluding users from following the logic behind the model predictions. We propose an approach, called IGNNet (Interpretable Graph Neural Network for tabular data), which constrains the learning algorithm to produce an interpretable model, where the model shows how the predictions are exactly computed from the original input features. A large-scale empirical investigation is presented, showing that IGNNet performs on par with state-of-the-art machine-learning algorithms that target tabular data, including XGBoost, Random Forests, and TabNet. At the same time, the results show that the explanations obtained from IGNNet are aligned with the true Shapley values of the features without incurring any additional computational overhead.
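
The constraint can be pictured with the following toy sketch (an additive, per-feature readout; it omits the graph representation learning that IGNNet performs and is not the paper's architecture): each feature contributes one scalar, the prediction is the sum of the contributions, and the contribution vector doubles as the explanation.

```python
import torch
import torch.nn as nn

class AdditiveReadout(nn.Module):
    """Toy sketch of an interpretable readout: each input feature is mapped to a
    single scalar contribution and the prediction is their sum, so the
    contribution vector can be read directly as the explanation."""

    def __init__(self, n_features, hidden=32):
        super().__init__()
        # One small network per feature keeps contributions attributable.
        self.per_feature = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        contributions = torch.cat(
            [net(x[:, i : i + 1]) for i, net in enumerate(self.per_feature)], dim=1
        )
        logit = contributions.sum(dim=1) + self.bias
        return torch.sigmoid(logit), contributions  # prediction + explanation

model = AdditiveReadout(n_features=5)
pred, contrib = model(torch.randn(8, 5))
```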

Place, publisher, year, edition, pages
IOS Press, 2024
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358264 (URN)
10.3233/FAIA240697 (DOI)
2-s2.0-85213390603 (Scopus ID)
Conference
27th European Conference on Artificial Intelligence, ECAI 2024, Santiago de Compostela, Spain, Oct 19 2024 - Oct 24 2024
Note

Part of ISBN 9781643685489

QC 20250114

Available from: 2025-01-08 Created: 2025-01-08 Last updated: 2025-01-15. Bibliographically approved.
7. Interpretable Graph Neural Networks for Heterogeneous Tabular Data
2024 (English). In: Proceedings of the 27th International Conference on Discovery Science, DS 2024, Pisa, Italy, Springer Nature, 2024, p. 310-324. Conference paper, Published paper (Refereed)
Abstract [en]

Many machine learning algorithms for tabular data produce black-box models, which prevent users from understanding the rationale behind the model predictions. In their unconstrained form, graph neural networks fall into this category, and they are furthermore limited in their ability to handle heterogeneous data. To overcome these limitations, an approach is proposed, called IGNH (Interpretable Graph Neural Network for Heterogeneous tabular data), which handles both categorical and numerical features, while constraining the learning process to generate exact feature attributions together with the predictions. A large-scale empirical investigation is presented, showing that the feature attributions provided by IGNH align with Shapley values that are computed post hoc. Furthermore, the results show that IGNH outperforms two powerful machine learning algorithms for tabular data, Random Forests and TabNet, while reaching a similar level of performance as XGBoost.
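
One way to picture the handling of heterogeneous features (an illustrative sketch, not the IGNH architecture): categorical features are embedded and numerical features projected into a shared space, while one scalar attribution per original feature is preserved so that the attributions still sum to the prediction logit.

```python
import torch
import torch.nn as nn

class HeterogeneousEncoder(nn.Module):
    """Toy sketch: shared representations for numerical and categorical features,
    with one scalar attribution per original feature summing to the prediction."""

    def __init__(self, n_numerical, categorical_cardinalities, dim=16):
        super().__init__()
        self.num_proj = nn.ModuleList(nn.Linear(1, dim) for _ in range(n_numerical))
        self.cat_emb = nn.ModuleList(
            nn.Embedding(card, dim) for card in categorical_cardinalities
        )
        self.score = nn.Linear(dim, 1)  # one scalar attribution per feature

    def forward(self, x_num, x_cat):
        nodes = [proj(x_num[:, i : i + 1]) for i, proj in enumerate(self.num_proj)]
        nodes += [emb(x_cat[:, j]) for j, emb in enumerate(self.cat_emb)]
        attributions = torch.cat([self.score(h) for h in nodes], dim=1)
        return torch.sigmoid(attributions.sum(dim=1)), attributions

model = HeterogeneousEncoder(n_numerical=3, categorical_cardinalities=[4, 7])
pred, attr = model(torch.randn(8, 3), torch.randint(0, 4, (8, 2)))
```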

Place, publisher, year, edition, pages
Springer Nature, 2024
Keywords
Explainable Machine Learning, Graph Neural Networks, Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358349 (URN)
10.1007/978-3-031-78977-9_20 (DOI)
2-s2.0-85218494615 (Scopus ID)
Conference
27th International Conference on Discovery Science, DS 2024, Pisa, Italy, October 14-16, 2024
Note

Part of ISBN 9783031789762

QC 20250308

Available from: 2025-01-15 Created: 2025-01-15 Last updated: 2025-03-08. Bibliographically approved.
8. Prediction Via Shapley Value Regression
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Shapley values have several desirable properties for explaining black-box model predictions, which come with strong theoretical support. Traditionally, Shapley values are computed post hoc, leading to additional computational cost at inference time. To overcome this, we introduce ViaSHAP, a novel approach that learns a function to compute Shapley values, from which the predictions can be derived directly by summation. We explore two learning approaches, based on the universal approximation theorem and the Kolmogorov-Arnold representation theorem. Results from a large-scale empirical investigation are presented, in which the predictive performance of ViaSHAP is compared to state-of-the-art algorithms for tabular data, where the implementation using Kolmogorov-Arnold Networks showed superior performance. It is also demonstrated that the explanations of ViaSHAP are accurate, and that the accuracy is controllable through the hyperparameters.
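
A minimal sketch of the prediction-via-summation idea (placeholder data and architecture; the constraints that push the learned values towards Shapley values are not reproduced here): the network outputs one value per feature and the prediction is obtained by summing them, so the explanation is available before, and independently of, the prediction.

```python
import torch
import torch.nn as nn

# Hypothetical data: 10 features, binary target.
n_features = 10
X, y = torch.randn(512, n_features), torch.randint(0, 2, (512,)).float()

# The network outputs one value per feature; the prediction is their sum.
phi_net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_features))
phi_0 = nn.Parameter(torch.zeros(1))  # base value added to the summed scores

optimizer = torch.optim.Adam(list(phi_net.parameters()) + [phi_0], lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(200):
    optimizer.zero_grad()
    phi = phi_net(X)                   # per-feature values, i.e. the explanation
    logit = phi.sum(dim=1) + phi_0     # prediction obtained directly by summation
    loss_fn(logit, y).backward()
    optimizer.step()
```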

National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358351 (URN)
Note

QC 20250117

Available from: 2025-01-15 Created: 2025-01-15 Last updated: 2025-01-17. Bibliographically approved.

Open Access in DiVA

fulltext (5181 kB)
File information:
File name: FULLTEXT01.pdf
File size: 5181 kB
Checksum (SHA-512): 6d5ecc82aa09bf1c6cef9e8e7ff528696843ee6b60a31d5f6165fe0645f64d70af593183a821b8c315b9b80e38936160a5bdbfa101dc8a30d47a42c1eccb49fe
Type: fulltext
Mimetype: application/pdf

Authority records

Alkhatib, Amr
