Constrained Decoding for Secure Code Generation

Fu, Yanjun; Baker, Ethan; Chen, Yizheng

(Submitted on 30 Apr 2024)

Abstract: Code Large Language Models (Code LLMs) have been increasingly used by developers to boost productivity, but they often generate vulnerable code. Thus, there is an urgent need to ensure that code generated by Code LLMs is correct and secure. Previous research has primarily focused on generating secure code, overlooking the fact that secure code also needs to be correct. This oversight can lead to a false sense of security. Currently, the community lacks a method to measure actual progress in this area, and we need solutions that address both security and correctness of code generation.
This paper introduces a new benchmark, CodeGuard+, along with two new metrics, secure-pass@k and secure@$k_{\text{pass}}$, to measure Code LLMs' ability to generate both secure and correct code. Using our new evaluation methods, we show that the state-of-the-art defense technique, prefix tuning, may not be as strong as previously believed, since it generates secure code but sacrifices functional correctness. We also demonstrate that different decoding methods significantly affect the security of Code LLMs.
Furthermore, we explore a new defense direction: constrained decoding for secure code generation. We propose new constrained decoding techniques to generate code that satisfies security and correctness constraints simultaneously. Our results reveal that constrained decoding is more effective than prefix tuning at improving the security of Code LLMs, without requiring a specialized training dataset. Moreover, constrained decoding can be used together with prefix tuning to further improve the security of Code LLMs.
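The abstract names the two metrics, secure-pass@k and secure@$k_{\text{pass}}$, but does not define them. The sketch below is only an illustration under an assumption: that both follow the familiar unbiased pass@k estimator, with secure-pass@k counting samples that are both secure and correct among all n samples, and secure@$k_{\text{pass}}$ counting secure samples among only those that pass the functional tests. The function names `at_k`, `secure_pass_at_k`, and `secure_at_k_pass` are hypothetical, and the paper's exact definitions may differ.

```python
from math import comb


def at_k(n: int, hits: int, k: int) -> float:
    """Estimate the chance that at least one of k samples, drawn without
    replacement from n generated samples, is among the `hits` good ones.
    This is the standard pass@k-style estimator: 1 - C(n - hits, k) / C(n, k).
    """
    if not (1 <= k <= n) or not (0 <= hits <= n):
        raise ValueError("require 1 <= k <= n and 0 <= hits <= n")
    return 1.0 - comb(n - hits, k) / comb(n, k)


def secure_pass_at_k(n: int, secure_and_pass: int, k: int) -> float:
    # ASSUMPTION: a sample counts only if it is both secure and correct,
    # and the pool is all n generated samples.
    return at_k(n, secure_and_pass, k)


def secure_at_k_pass(num_pass: int, secure_and_pass: int, k: int) -> float:
    # ASSUMPTION: the pool is restricted to samples that pass the unit tests,
    # so this measures security conditional on functional correctness.
    return at_k(num_pass, secure_and_pass, k)


if __name__ == "__main__":
    # Hypothetical counts for one prompt: 10 samples, 4 pass, 2 pass and are secure.
    print(secure_pass_at_k(10, 2, 1))
    print(secure_at_k_pass(4, 2, 1))
```

Under this reading, a model that produces secure but non-functional code scores low on secure-pass@k, which is the failure mode the abstract attributes to prefix tuning.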

Comments:	17 pages, 8 figures
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Software Engineering (cs.SE)
Cite as:	arXiv:2405.00218 [cs.CR]
	(or arXiv:2405.00218v1 [cs.CR] for this version)

Submission history

From: Yanjun Fu
[v1] Tue, 30 Apr 2024 21:52:19 GMT (90kb,D)
