LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Zeng, Zhen; Wang, Jianzong; Cheng, Ning; Xiao, Jing

Electrical Engineering and Systems Science > Audio and Speech Processing

Title: LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Authors: Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

(Submitted on 22 Feb 2021)

Abstract: In this paper, we propose a novel conditional convolution network, named location-variable convolution, to model the dependencies of the waveform sequence. Unlike WaveNet, which uses unified convolution kernels to capture the dependencies of arbitrary waveforms, the location-variable convolution uses convolution kernels with different coefficients to perform convolution operations on different waveform intervals, where the kernel coefficients are predicted from conditioning acoustic features, such as Mel-spectrograms. Based on location-variable convolutions, we design LVCNet for waveform generation and apply it in Parallel WaveGAN to design a more efficient vocoder. Experiments on the LJSpeech dataset show that our proposed model achieves a four-fold increase in synthesis speed compared to the original Parallel WaveGAN without any degradation in sound quality, which verifies the effectiveness of location-variable convolutions.
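
The idea of a location-variable convolution described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `location_variable_conv`, the interval length `hop`, and the random linear map `W` standing in for the kernel predictor are all assumptions made for illustration. The sketch only shows the core mechanism, in which each waveform interval is filtered with its own kernel and the kernels are derived from per-frame conditioning features.

```python
import numpy as np

def location_variable_conv(x, kernels, hop):
    """Filter each interval of x with its own kernel (illustrative sketch).

    x:       (T,) input sequence, with T == n_frames * hop
    kernels: (n_frames, K) one kernel per interval, K odd
    hop:     number of samples covered by each interval (assumed name)
    """
    n_frames, k = kernels.shape
    assert k % 2 == 1, "odd kernel size keeps the output length equal to T"
    assert x.shape[0] == n_frames * hop
    pad = k // 2
    xp = np.pad(x, (pad, pad))  # zero padding at both ends
    y = np.empty_like(x, dtype=float)
    for i in range(n_frames):
        # Interval i plus the context samples the kernel needs on each side.
        seg = xp[i * hop : i * hop + hop + 2 * pad]
        # Cross-correlate this interval with the kernel assigned to it.
        y[i * hop : (i + 1) * hop] = np.correlate(seg, kernels[i], mode="valid")
    return y

# Usage: kernels come from conditioning features. A random linear map W is a
# hypothetical stand-in for a learned kernel predictor.
rng = np.random.default_rng(0)
n_frames, n_mels, hop, k = 4, 80, 256, 3
mel = rng.standard_normal((n_frames, n_mels))   # stand-in Mel-spectrogram frames
W = rng.standard_normal((n_mels, k)) * 0.1      # hypothetical predictor weights
kernels = mel @ W                               # (n_frames, k): one kernel per interval
x = rng.standard_normal(n_frames * hop)         # stand-in input sequence
y = location_variable_conv(x, kernels, hop)
```

When every row of `kernels` is the same, the sketch reduces to an ordinary convolution layer with a single shared kernel, which is the contrast with WaveNet-style unified kernels that the abstract draws.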

Comments:	Accepted to ICASSP 2021. arXiv admin note: text overlap with arXiv:2012.01684
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2102.10815 [eess.AS]
	(or arXiv:2102.10815v1 [eess.AS] for this version)

Submission history

From: Jianzong Wang [view email]
[v1] Mon, 22 Feb 2021 07:55:34 GMT (387kb,D)
