Show simple item record

dc.contributor.author	Gautam, Srishti
dc.contributor.author	Boubekki, Ahcene
dc.contributor.author	Höhne, Marina Marie-Claire
dc.contributor.author	Kampffmeyer, Michael Christian
dc.date.accessioned	2024-11-18T14:24:44Z
dc.date.available	2024-11-18T14:24:44Z
dc.date.issued	2024
dc.description.abstract	Explainable AI (XAI) has unfolded in two distinct research directions with, on the one hand, post-hoc methods that explain the predictions of a pre-trained black-box model and, on the other hand, self-explainable models (SEMs) which are trained directly to provide explanations alongside their predictions. While the latter is preferred in safety-critical scenarios, post-hoc approaches have received the majority of attention until now, owing to their simplicity and ability to explain base models without retraining. Current SEMs, instead, require complex architectures and heavily regularized loss functions, thus necessitating specific and costly training. To address this shortcoming and facilitate wider use of SEMs, we propose a simple yet efficient universal method called KMEx (K-Means Explainer), which can convert any existing pre-trained model into a prototypical SEM. The motivation behind KMEx is to enhance transparency in deep learning-based decision-making via class-prototype-based explanations that are diverse and trustworthy without retraining the base model. We compare models obtained from KMEx to state-of-the-art SEMs using an extensive qualitative evaluation to highlight the strengths and weaknesses of each model, further paving the way toward a more reliable and objective evaluation of SEMs.	en_US
dc.description	Source at https://jmlr.org/tmlr/.	en_US
dc.identifier.citation	Gautam, Boubekki, Höhne, Kampffmeyer. Prototypical Self-Explainable Models Without Re-training. Transactions on Machine Learning Research (TMLR). 2024	en_US
dc.identifier.cristinID	FRIDAID 2296090
dc.identifier.issn	2835-8856
dc.identifier.uri	https://hdl.handle.net/10037/35756
dc.language.iso	eng	en_US
dc.publisher	TMLR	en_US
dc.relation.journal	Transactions on Machine Learning Research (TMLR)
dc.relation.projectID	Norges forskningsråd: 315029	en_US
dc.relation.projectID	Norges forskningsråd: 309439	en_US
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2024 The Author(s)	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0	en_US
dc.rights	Attribution 4.0 International (CC BY 4.0)	en_US
dc.title	Prototypical Self-Explainable Models Without Re-training	en_US
dc.type.version	publishedVersion	en_US
dc.type	Journal article	en_US
dc.type	Tidsskriftartikkel	en_US
dc.type	Peer reviewed	en_US
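The abstract describes KMEx as turning any pre-trained model into a prototypical self-explainable model via class prototypes, without retraining the base model. A minimal sketch of that idea, assuming prototypes are obtained by running k-means separately on each class's embeddings from a frozen encoder and classifying by nearest prototype; the function names (`kmex_prototypes`, `predict`) and the plain-NumPy `kmeans` helper are illustrative, not the paper's implementation:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Basic Lloyd's k-means: returns k cluster centers for rows of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def kmex_prototypes(embed, X, y, k=2):
    """Per-class k-means over frozen-encoder embeddings (hypothetical helper).

    embed: the pre-trained encoder, used only for inference (never retrained).
    Returns stacked prototypes and the class label of each prototype.
    """
    Z = embed(X)
    protos, proto_cls = [], []
    for c in np.unique(y):
        centers = kmeans(Z[y == c], k)
        protos.append(centers)
        proto_cls.extend([c] * k)
    return np.vstack(protos), np.array(proto_cls)

def predict(embed, protos, proto_cls, X):
    """Classify by the class of the nearest prototype in embedding space."""
    Z = embed(X)
    dists = np.linalg.norm(Z[:, None, :] - protos[None, :, :], axis=2)
    return proto_cls[dists.argmin(axis=1)]
```

Because the prototypes live in the encoder's own embedding space, each prediction can be explained by pointing to its nearest class prototype, while the base model's weights stay untouched.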


File(s) in this item


This item appears in the following collection(s)


Attribution 4.0 International (CC BY 4.0)
Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)