dc.contributor.author | Gautam, Srishti | |
dc.contributor.author | Boubekki, Ahcene | |
dc.contributor.author | Höhne, Marina Marie-Claire | |
dc.contributor.author | Kampffmeyer, Michael Christian | |
dc.date.accessioned | 2024-11-18T14:24:44Z | |
dc.date.available | 2024-11-18T14:24:44Z | |
dc.date.issued | 2024 | |
dc.description.abstract | Explainable AI (XAI) has unfolded in two distinct research directions with, on the one hand, post-hoc methods that explain the predictions of a pre-trained black-box model and, on the other hand, self-explainable models (SEMs) which are trained directly to provide explanations alongside their predictions. While the latter is preferred in safety-critical scenarios, post-hoc approaches have received the majority of attention until now, owing to their simplicity and ability to explain base models without retraining. Current SEMs, instead, require complex architectures and heavily regularized loss functions, thus necessitating specific and costly training. To address this shortcoming and facilitate wider use of SEMs, we propose a simple yet efficient universal method called KMEx (K-Means Explainer), which can convert any existing pre-trained model into a prototypical SEM. The motivation behind KMEx is to enhance transparency in deep learning-based decision-making via class-prototype-based explanations that are diverse and trustworthy without retraining the base model. We compare models obtained from KMEx to state-of-the-art SEMs using an extensive qualitative evaluation to highlight the strengths and weaknesses of each model, further paving the way toward a more reliable and objective evaluation of SEMs. | en_US |
dc.description | Source at https://jmlr.org/tmlr/. | en_US |
dc.identifier.citation | Gautam, Boubekki, Höhne, Kampffmeyer. Prototypical Self-Explainable Models Without Re-training. Transactions on Machine Learning Research (TMLR). 2024 | en_US |
dc.identifier.cristinID | FRIDAID 2296090 | |
dc.identifier.issn | 2835-8856 | |
dc.identifier.uri | https://hdl.handle.net/10037/35756 | |
dc.language.iso | eng | en_US |
dc.publisher | TMLR | en_US |
dc.relation.journal | Transactions on Machine Learning Research (TMLR) | |
dc.relation.projectID | Norges forskningsråd: 315029 | en_US |
dc.relation.projectID | Norges forskningsråd: 309439 | en_US |
dc.rights.accessRights | openAccess | en_US |
dc.rights.holder | Copyright 2024 The Author(s) | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0 | en_US |
dc.rights | Attribution 4.0 International (CC BY 4.0) | en_US |
dc.title | Prototypical Self-Explainable Models Without Re-training | en_US |
dc.type.version | publishedVersion | en_US |
dc.type | Journal article | en_US |
dc.type | Tidsskriftartikkel | en_US |
dc.type | Peer reviewed | en_US |