Enhancing Audio Augmentation Methods with Consistency Learning

Iqbal, Turab; Helwani, Karim; Krishnaswamy, Arvindh; Wang, Wenwu

(Submitted on 9 Feb 2021 (v1), last revised 19 Apr 2021 (this version, v3))

Abstract: Data augmentation is an inexpensive way to increase training data diversity and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss and that incorporating such measures into the loss function can improve the performance of audio classification systems. Put another way, we demonstrate how existing augmentation methods can further improve learning by enforcing consistency.
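The idea of adding a consistency measure to the classification loss can be illustrated with a minimal sketch. The abstract does not say which consistency measure the paper uses, so the Jensen-Shannon divergence below is an assumption chosen for illustration. The function names and the `weight` parameter are likewise hypothetical. The sketch takes the logits a classifier produces for an audio clip and for an augmented version of the same clip, then adds a penalty when the two predicted distributions disagree.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class; clamp to avoid log(0).
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(p, q):
    # KL(p || q), skipping terms where p is zero (they contribute nothing).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon divergence: a symmetric, bounded consistency measure.
    # Assumed here for illustration; other measures could be substituted.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def consistency_loss(logits_clean, logits_aug, label, weight=1.0):
    # Average cross-entropy over both views, plus a weighted penalty
    # for disagreement between the two predicted distributions.
    p = softmax(logits_clean)
    q = softmax(logits_aug)
    ce = 0.5 * (cross_entropy(p, label) + cross_entropy(q, label))
    return ce + weight * js_divergence(p, q)

# Hypothetical logits for a clip and its augmented counterpart.
clean = [2.0, 0.5, -1.0]
augmented = [0.1, 1.5, 0.3]
loss = consistency_loss(clean, augmented, label=0, weight=1.0)
```

With `weight=0.0` the sketch reduces to ordinary cross-entropy training on both views, which is the baseline the abstract describes as not explicitly enforcing invariance.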

Comments:	Accepted to 46th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2102.05151 [cs.SD]
	(or arXiv:2102.05151v3 [cs.SD] for this version)

Submission history

From: Turab Iqbal
[v1] Tue, 9 Feb 2021 22:01:58 GMT (33kb,D)
[v2] Tue, 23 Mar 2021 18:09:47 GMT (36kb,D)
[v3] Mon, 19 Apr 2021 15:04:13 GMT (34kb,D)
