Toward a Theory of Causation for Interpreting Neural Code Models

Palacio, David N.; Velasco, Alejandro; Cooper, Nathan; Rodriguez, Alvaro; Moran, Kevin; Poshyvanyk, Denys

doi:10.1109/TSE.2024.3379943


Computer Science > Software Engineering


(Submitted on 7 Feb 2023 (v1), last revised 28 Mar 2024 (this version, v5))

Abstract: Neural Language Models of Code, or Neural Code Models (NCMs), are rapidly progressing from research prototypes to commercial developer tools. As such, understanding the capabilities and limitations of such models is becoming critical. However, the abilities of these models are typically measured using automated metrics that often only reveal a portion of their real-world performance. While, in general, the performance of NCMs appears promising, currently much is unknown about how such models arrive at decisions. To this end, this paper introduces $do_{code}$, a post hoc interpretability method specific to NCMs that is capable of explaining model predictions. $do_{code}$ is based upon causal inference to enable programming language-oriented explanations. While the theoretical underpinnings of $do_{code}$ are extensible to exploring different model properties, we provide a concrete instantiation that aims to mitigate the impact of spurious correlations by grounding explanations of model behavior in properties of programming languages. To demonstrate the practical benefit of $do_{code}$, we illustrate the insights that our framework can provide by performing a case study on two popular deep learning architectures and ten NCMs. The results of this case study illustrate that our studied NCMs are sensitive to changes in code syntax. All our NCMs, except for the BERT-like model, statistically learn to predict tokens related to blocks of code (e.g., brackets, parentheses, semicolons) with less confounding bias as compared to other programming language constructs. These insights demonstrate the potential of $do_{code}$ as a useful method to detect and facilitate the elimination of confounding bias in NCMs.

Comments:	Accepted to appear in IEEE Transactions on Software Engineering
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Methodology (stat.ME)
DOI:	10.1109/TSE.2024.3379943
Cite as:	arXiv:2302.03788 [cs.SE]
	(or arXiv:2302.03788v5 [cs.SE] for this version)

Submission history

From: David N. Palacio
[v1] Tue, 7 Feb 2023 22:56:58 GMT (3084kb,D)
[v2] Mon, 18 Mar 2024 22:41:04 GMT (7231kb,D)
[v3] Thu, 21 Mar 2024 13:30:22 GMT (7231kb,D)
[v4] Tue, 26 Mar 2024 15:41:28 GMT (7232kb,D)
[v5] Thu, 28 Mar 2024 01:36:14 GMT (7232kb,D)
