The DKU-Duke-Lenovo System Description for the Third DIHARD Speech Diarization Challenge

Wang, Weiqing; Lin, Qingjian; Cai, Danwei; Yang, Lin; Li, Ming

(Submitted on 6 Feb 2021)

Abstract: This paper presents the system submitted by the DKU-Duke-Lenovo team to the third DIHARD Speech Diarization Challenge. The system consists of several modules: voice activity detection (VAD), segmentation, speaker embedding extraction, attentive similarity scoring, and agglomerative hierarchical clustering. In addition, target-speaker VAD (TSVAD) is applied to the phone call data to further improve performance. Our final submitted system achieves a diarization error rate (DER) of 15.43% on the core evaluation set and 13.39% on the full evaluation set for task 1, and a DER of 21.63% on the core evaluation set and 18.90% on the full evaluation set for task 2.
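
The abstract reports results as DER but does not define the metric. As a minimal sketch, assuming the standard definition (missed speech, false alarm speech, and speaker confusion time summed and divided by total reference speech time), the calculation looks like the following. The function name and the example durations are hypothetical and are not taken from the paper or from the challenge scoring tool.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a percentage.

    All arguments are durations in seconds. This follows the commonly used
    definition of DER; it is an illustration, not the challenge's scorer.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return 100.0 * (missed + false_alarm + confusion) / total_speech


# Hypothetical durations, chosen only to illustrate the arithmetic.
example = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=50.0, total_speech=1000.0
)
print(f"DER = {example:.2f}%")  # prints "DER = 10.00%"
```

Because false alarm time is counted in the numerator but not in the denominator, DER can exceed 100% on recordings with little reference speech.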

Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2102.03649 [eess.AS]
	(or arXiv:2102.03649v1 [eess.AS] for this version)

Submission history

From: Weiqing Wang
[v1] Sat, 6 Feb 2021 19:41:42 GMT (462kb,D)