Enhancing Generalization in Audio Deepfake Detection: A Neural Collapse based Sampling and Training Approach

Yousif, Mohammed; Mathew, Jonat John; Pallan, Huzaifa; Padda, Agamjeet Singh; Shah, Syed Daniyal; Adamski, Sara; Reddiboina, Madhu; Pankajakshan, Arjun

(Submitted on 19 Apr 2024)

Abstract: Generalization in audio deepfake detection presents a significant challenge, with models trained on specific datasets often struggling to detect deepfakes generated under varying conditions and unknown algorithms. While collectively training a model using diverse datasets can enhance its generalization ability, it comes with high computational costs. To address this, we propose a neural collapse-based sampling approach applied to pre-trained models trained on distinct datasets to create a new training database. Using the ASVspoof 2019 dataset as a proof-of-concept, we implement pre-trained models with ResNet and ConvNeXt architectures. Our approach demonstrates comparable generalization on unseen data while being computationally efficient, requiring less training data. Evaluation is conducted using the In-the-wild dataset.
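
Illustrative sketch (not taken from the paper): the abstract does not state which sampling criterion is used, so the following only shows one plausible form a neural collapse-based sampler could take. It assumes that penultimate-layer embeddings from a pre-trained model are already available as an array, and that "sampling" means keeping, per class, the fraction of samples whose embeddings lie closest to their class mean. The function name `sample_near_class_means` and the parameter `keep_fraction` are hypothetical.

```python
import numpy as np


def sample_near_class_means(embeddings, labels, keep_fraction=0.5):
    """Return sorted indices of the samples closest to their own class mean.

    Assumption: distance to the class mean is a stand-in selection rule
    motivated by neural collapse (features concentrating around class
    means); the abstract does not state the criterion actually used.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    selected = []
    for cls in np.unique(labels):
        # Indices of all samples belonging to this class.
        idx = np.flatnonzero(labels == cls)
        class_mean = embeddings[idx].mean(axis=0)
        # Euclidean distance of each class member to the class mean.
        dist = np.linalg.norm(embeddings[idx] - class_mean, axis=1)
        # Keep at least one sample per class.
        n_keep = max(1, int(np.ceil(keep_fraction * len(idx))))
        selected.extend(idx[np.argsort(dist)[:n_keep]].tolist())
    return sorted(selected)
```

Under these assumptions, the returned indices would define the reduced training database; in a real setting the embeddings would come from the pre-trained ResNet or ConvNeXt models mentioned in the abstract, and the selection rule may well differ from this one.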

Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2404.13008 [cs.SD]
	(or arXiv:2404.13008v1 [cs.SD] for this version)

Submission history

From: Arjun Pankajakshan
[v1] Fri, 19 Apr 2024 17:13:21 GMT (7167kb,D)
