Electrical Engineering and Systems Science > Audio and Speech Processing

Title: Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

Abstract: Self-Supervised Learning (SSL) has made it possible to leverage large amounts of unlabeled speech data to improve the performance of speech recognition models, even with small annotated datasets. Nevertheless, speech SSL representations may fail when facing an acoustic mismatch between the pretraining and target datasets. To address this issue, we propose a novel supervised domain adaptation method designed for cases exhibiting such a mismatch in acoustic domains. It consists of applying properly calibrated data augmentations to a large clean dataset, bringing it closer to the target domain, and using it as part of an initial fine-tuning stage. Augmentations are selected automatically by minimizing a conditional-dependence estimator computed on the target dataset. The approach is validated in an oracle experiment with controlled distortions and on two amateur-collected low-resource domains, achieving better performance than the baselines in both cases.
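The selection loop described in the abstract (score each candidate augmentation by how dependent its output remains on the domain gap, keep the minimizer) can be sketched as follows. This is a minimal illustration, not the paper's method: the helper names are hypothetical, and a squared maximum mean discrepancy (MMD) with an RBF kernel stands in for the conditional-dependence estimator used in the paper.

```python
import numpy as np

def rbf_mmd2(x, y, gamma=1.0):
    # Squared MMD with an RBF kernel: a simple discrepancy between the
    # augmented-clean features and the target-domain features.
    # (Illustrative stand-in for the paper's conditional-dependence estimator.)
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def select_augmentation(clean, target, candidates):
    """Score every candidate augmentation on clean data against the target
    domain and return the name of the minimizer plus all scores."""
    scores = {name: rbf_mmd2(aug(clean), target)
              for name, aug in candidates.items()}
    return min(scores, key=scores.get), scores

# Toy example: the "target domain" is the clean data shifted by +2,
# so the shift augmentation should bring the clean set closest to it.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(200, 4))
target = rng.normal(2.0, 1.0, size=(200, 4))

candidates = {
    "identity": lambda x: x,
    "shift+2": lambda x: x + 2.0,   # hypothetical calibrated augmentation
    "scale*3": lambda x: x * 3.0,
}
best, scores = select_augmentation(clean, target, candidates)
print(best)
```

In the paper's setting the candidates would be parameterized speech augmentations (noise, reverberation, etc.) applied to waveforms, and the estimator would operate on learned representations conditioned on the target dataset; the structure of the search, however, is the same argmin over candidate distortions.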
Comments: 6 pages, INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
Cite as: arXiv:2306.00481 [eess.AS]
  (or arXiv:2306.00481v1 [eess.AS] for this version)

Submission history

From: Salah Zaiem [view email]
[v1] Thu, 1 Jun 2023 09:30:49 GMT (2506kb,D)
