Audio and Speech Processing

Authors and titles for eess.AS in Feb 2021

[ total of 208 entries: 1-10 | 11-20 | 21-30 | 31-40 | ... | 201-208 ]

[1] arXiv:2102.00154 [pdf, ps, other]: Title: Semi-supervised Sound Event Detection using Random Augmentation and Consistency Regularization

Authors: Xiaofei Li

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[2] arXiv:2102.00184 [pdf, other]: Title: Adversarially learning disentangled speech representations for robust multi-factor voice conversion

Authors: Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Shiyin Kang, Helen Meng

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[3] arXiv:2102.00196 [pdf, ps, other]: Title: Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures

Authors: Karn Watcharasupat, Anh H. T. Nguyen, Ching-Hui Ooi, Andy W. H. Khong

Comments: (c) 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal-ref: Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021, pp. 4485-4489

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[4] arXiv:2102.00270 [pdf, other]: Title: Enhancing the Intelligibility of Cleft Lip and Palate Speech using Cycle-consistent Adversarial Networks

Authors: Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha, S R Mahadeva Prasanna

Comments: 8 pages, 4 figures, IEEE Spoken Language Technology Workshop

Subjects: Audio and Speech Processing (eess.AS)
[5] arXiv:2102.00306 [pdf, other]: Title: End-to-End Language Identification using Multi-Head Self-Attention and 1D Convolutional Neural Networks

Authors: Krishna D N, Ankita Patil

Comments: 5 pages, 1 figure

Subjects: Audio and Speech Processing (eess.AS)
[6] arXiv:2102.00804 [pdf, other]: Title: Phoneme-BERT: Joint Language Modelling of Phoneme Sequence and ASR Transcript

Authors: Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa

Comments: Accepted to Interspeech 2021 conference

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[7] arXiv:2102.00850 [pdf, other]: Title: On Scaling Contrastive Representations for Low-Resource Speech Recognition

Authors: Lasse Borgholt, Tycho Max Sylvester Tax, Jakob Drachmann Havtorn, Lars Maaløe, Christian Igel

Comments: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[8] arXiv:2102.01106 [pdf, other]: Title: Universal Neural Vocoding with Parallel WaveNet

Authors: Yunlong Jiao, Adam Gabrys, Georgi Tinchev, Bartosz Putrycz, Daniel Korzekwa, Viacheslav Klimkov

Comments: 5 pages, 2 figures. Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[9] arXiv:2102.01326 [pdf, other]: Title: Multimodal Attention Fusion for Target Speaker Extraction

Authors: Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

Comments: 7 pages, 5 figures

Journal-ref: in IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 778-784

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[10] arXiv:2102.01363 [pdf, other]: Title: The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

Authors: Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
