
dc.contributor.author: Gautam, Srishti
dc.contributor.author: Boubekki, Ahcene
dc.contributor.author: Höhne, Marina Marie-Claire
dc.contributor.author: Kampffmeyer, Michael Christian
dc.date.accessioned: 2024-11-18T14:24:44Z
dc.date.available: 2024-11-18T14:24:44Z
dc.date.issued: 2024
dc.description.abstract: Explainable AI (XAI) has unfolded in two distinct research directions with, on the one hand, post-hoc methods that explain the predictions of a pre-trained black-box model and, on the other hand, self-explainable models (SEMs) which are trained directly to provide explanations alongside their predictions. While the latter is preferred in safety-critical scenarios, post-hoc approaches have received the majority of attention until now, owing to their simplicity and ability to explain base models without retraining. Current SEMs, instead, require complex architectures and heavily regularized loss functions, thus necessitating specific and costly training. To address this shortcoming and facilitate wider use of SEMs, we propose a simple yet efficient universal method called KMEx (K-Means Explainer), which can convert any existing pre-trained model into a prototypical SEM. The motivation behind KMEx is to enhance transparency in deep learning-based decision-making via class-prototype-based explanations that are diverse and trustworthy without retraining the base model. We compare models obtained from KMEx to state-of-the-art SEMs using an extensive qualitative evaluation to highlight the strengths and weaknesses of each model, further paving the way toward a more reliable and objective evaluation of SEMs. [en_US]
dc.description: Source at https://jmlr.org/tmlr/. [en_US]
dc.identifier.citation: Gautam, Boubekki, Höhne, Kampffmeyer. Prototypical Self-Explainable Models Without Re-training. Transactions on Machine Learning Research (TMLR). 2024 [en_US]
dc.identifier.cristinID: FRIDAID 2296090
dc.identifier.issn: 2835-8856
dc.identifier.uri: https://hdl.handle.net/10037/35756
dc.language.iso: eng [en_US]
dc.publisher: TMLR [en_US]
dc.relation.journal: Transactions on Machine Learning Research (TMLR)
dc.relation.projectID: Norges forskningsråd: 315029 [en_US]
dc.relation.projectID: Norges forskningsråd: 309439 [en_US]
dc.rights.accessRights: openAccess [en_US]
dc.rights.holder: Copyright 2024 The Author(s) [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0 [en_US]
dc.rights: Attribution 4.0 International (CC BY 4.0) [en_US]
dc.title: Prototypical Self-Explainable Models Without Re-training [en_US]
dc.type.version: publishedVersion [en_US]
dc.type: Journal article [en_US]
dc.type: Tidsskriftartikkel [en_US]
dc.type: Peer reviewed [en_US]



Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)