
Title: SIDEs: Separating Idealization from Deceptive Explanations in xAI

Abstract: Explainable AI (xAI) methods are important for establishing trust in black-box models. However, criticism has recently mounted that current xAI methods disagree with one another, are necessarily false, and can be manipulated, which has begun to undermine the deployment of black-box models. Rudin (2019) goes so far as to say that we should stop using black-box models altogether in high-stakes cases because xAI explanations "must be wrong". However, strict fidelity to the truth has historically not been a desideratum in science. Idealizations -- intentional distortions introduced into scientific theories and models -- are commonplace in the natural sciences and are seen as a successful scientific tool. Thus, it is not falsehood qua falsehood that is the issue. In this paper, I outline the need for xAI research to engage in idealization evaluation. Drawing on the use of idealizations in the natural sciences and the philosophy of science, I introduce a novel framework for evaluating whether xAI methods engage in successful idealizations or deceptive explanations (SIDEs). SIDEs evaluates whether the limitations of xAI methods, and the distortions they introduce, can be part of a successful idealization or are instead deceptive distortions, as critics suggest. I discuss the role that existing research can play in idealization evaluation and where innovation is necessary. Through a qualitative analysis, I find that leading feature importance methods and counterfactual explanation methods are subject to idealization failure, and I suggest remedies for ameliorating it.
Comments: 18 pages, 3 figures, 2 tables. Forthcoming in FAccT'24
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
ACM classes: A.0; I.2.0; K.4.0
Cite as: arXiv:2404.16534 [cs.AI]
  (or arXiv:2404.16534v1 [cs.AI] for this version)

Submission history

From: Emily Sullivan
[v1] Thu, 25 Apr 2024 11:47:39 GMT (727kb,D)
