Audio and Speech Processing

Authors and titles for eess.AS in Feb 2021, skipping first 50

[ total of 208 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | 126-150 | ... | 201-208 ]

[51] arXiv:2102.07961 [pdf, other]: Title: Semi-Supervised Singing Voice Separation with Noisy Self-Training

Authors: Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy

Comments: Accepted at 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)

Subjects: Audio and Speech Processing (eess.AS)
[52] arXiv:2102.08075 [pdf, other]: Title: Axial Residual Networks for CycleGAN-based Voice Conversion

Authors: Jaeseong You, Gyuhyeon Nam, Dalhyun Kim, Gyeongsu Chae

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[53] arXiv:2102.08328 [pdf, other]: Title: Context-Aware Prosody Correction for Text-Based Speech Editing

Authors: Max Morrison, Lucas Rencker, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Comments: To appear in proceedings of ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[54] arXiv:2102.08706 [pdf, other]: Title: Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder

Authors: Huajian Fang, Guillaume Carbajal, Stefan Wermter, Timo Gerkmann

Comments: ICASSP 2021. (c) 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal-ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[55] arXiv:2102.09106 [pdf, other]: Title: Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition

Authors: Gary Yeung, Ruchao Fan, Abeer Alwan

Comments: To be published in IEEE ICASSP

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[56] arXiv:2102.09168 [pdf, other]: Title: Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition

Authors: Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe

Comments: Accepted to ICASSP2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[57] arXiv:2102.09660 [pdf, other]: Title: Generative Speech Coding with Predictive Variance Regularization

Authors: W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[58] arXiv:2102.09666 [pdf, other]: Title: Dynamic curriculum learning via data parameters for noise robust keyword spotting

Authors: Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir

Comments: Accepted at ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD)
[59] arXiv:2102.09838 [pdf, other]: Title: A Robust Maximum Likelihood Distortionless Response Beamformer based on a Complex Generalized Gaussian Distribution

Authors: Weixin Meng, Chengshi Zheng, Xiaodong Li

Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[60] arXiv:2102.09853 [pdf, ps, other]: Title: Direction of Arrival Estimation of Noisy Speech Using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

Authors: Nils Poschadel, Robert Hupke, Stephan Preihs, Jürgen Peissig

Comments: 5 pages, 6 figures. Accepted to EUSIPCO 2021

Subjects: Audio and Speech Processing (eess.AS)
[61] arXiv:2102.09918 [pdf, other]: Title: End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study

Authors: Prashanth Gurunath Shivakumar, Shrikanth Narayanan

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[62] arXiv:2102.09928 [pdf, other]: Title: Do End-to-End Speech Recognition Models Care About Context?

Authors: Lasse Borgholt, Jakob Drachmann Havtorn, Željko Agić, Anders Søgaard, Lars Maaløe, Christian Igel

Comments: Published in the proceedings of INTERSPEECH 2020, pp. 4352-4356

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[63] arXiv:2102.09939 [pdf, ps, other]: Title: ABSP System for The Third DIHARD Challenge

Authors: A Kishore Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[64] arXiv:2102.09959 [pdf, other]: Title: Artificially Synthesising Data for Audio Classification and Segmentation to Improve Speech and Music Detection in Radio Broadcast

Authors: Satvik Venkatesh, David Moffat, Alexis Kirke, Gözel Shakeri, Stephen Brewster, Jörg Fachner, Helen Odell-Miller, Alex Street, Nicolas Farina, Sube Banerjee, Eduardo Reck Miranda

Comments: 5 pages, 3 figures, Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[65] arXiv:2102.10345 [pdf, other]: Title: Model architectures to extrapolate emotional expressions in DNN-based text-to-speech

Authors: Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

Comments: This is the author's final draft. Accepted by Speech Communication. Please refer to the journal if you want

Subjects: Audio and Speech Processing (eess.AS)
[66] arXiv:2102.10376 [pdf, other]: Title: The Use of Voice Source Features for Sung Speech Recognition

Authors: Gerardo Roa Dabike, Jon Barker

Comments: Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)
[67] arXiv:2102.10449 [pdf, other]: Title: WARP-Q: Quality Prediction For Generative Neural Speech Codecs

Authors: Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines

Comments: Accepted for presentation at IEEE ICASSP 2021. Source code and data can be found on this https URL

Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[68] arXiv:2102.10815 [pdf, other]: Title: LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

Authors: Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

Comments: Accepted to ICASSP 2021. arXiv admin note: text overlap with arXiv:2012.01684

Subjects: Audio and Speech Processing (eess.AS)
[69] arXiv:2102.11265 [pdf, other]: Title: Automated Evaluation Of Psychotherapy Skills Using Speech And Language Technologies

Authors: Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Karan Singla, Victor Ardulov, Raghuveer Peri, Derek D. Caperton, James Gibson, Michael J. Tanana, Panayiotis Georgiou, Jake Van Epps, Sarah P. Lord, Tad Hirsch, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

Comments: new version has an updated title

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[70] arXiv:2102.11480 [pdf, ps, other]: Title: Evolutionary optimization of contexts for phonetic correction in speech recognition systems

Authors: Rafael Viana-Cámara, Diego Campos-Sobrino, Mario Campos-Soberanis

Comments: 13 pages, 4 figures, This article is a translation of the paper "Optimización evolutiva de contextos para la corrección fonética en sistemas de reconocimiento del habla" presented in COMIA 2019

Journal-ref: Research in Computing Science Issue 148(8), 2019, pp. 293-306. ISSN 1870-4069

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[71] arXiv:2102.11525 [pdf, other]: Title: End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend

Authors: Wangyou Zhang, Christoph Boeddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian

Comments: 5 pages, 1 figure, accepted by ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[72] arXiv:2102.11594 [pdf, other]: Title: Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition

Authors: Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao

Comments: Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[73] arXiv:2102.11634 [pdf, other]: Title: Dual-Path Modeling for Long Recording Speech Separation in Meetings

Authors: Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian

Comments: Accepted by ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[74] arXiv:2102.11906 [pdf, other]: Title: Handling Background Noise in Neural Speech Generation

Authors: Tom Denton, Alejandro Luebs, Felicia S. C. Lim, Andrew Storus, Hengchin Yeh, W. Bastiaan Kleijn, Jan Skoglund

Comments: 5 pages, 3 figures, presented at the Asilomar Conference on Signals, Systems, and Computers 2020

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[75] arXiv:2102.12078 [pdf, other]: Title: Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks

Authors: Ju Lin, Adriaan J. van Wijngaarden, Kuang-Ching Wang, Melissa C. Smith

Comments: Preprint

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
