Audio and Speech Processing

Authors and titles for eess.AS in Feb 2021, skipping first 25

[ total of 208 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-125 | ... | 201-208 ]

[26] arXiv:2102.04029 [pdf, ps, other]: Title: Non-linear frequency warping using constant-Q transformation for speech emotion recognition

Authors: Premjeet Singh, Goutam Saha, Md Sahidullah

Comments: Accepted for publication in 2021 IEEE International Conference on Computer Communication and Informatics (IEEE ICCCI 2021)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[27] arXiv:2102.04144 [pdf, ps, other]: Title: Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement

Authors: Mostafa Sadeghi, Xavier Alameda-Pineda

Comments: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[28] arXiv:2102.04629 [pdf, other]: Title: Real-time Monaural Speech Enhancement With Short-time Discrete Cosine Transform

Authors: Qinglong Li, Fei Gao, Haixin Guan, Kaichi Ma

Comments: 5 pages, 2 figures, Journal submitted

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[29] arXiv:2102.04696 [pdf, other]: Title: Independent Vector Extraction for Fast Joint Blind Source Separation and Dereverberation

Authors: Rintaro Ikeshita, Tomohiro Nakatani

Comments: Accepted to IEEE Signal Processing Letters

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[30] arXiv:2102.04697 [pdf, other]: Title: Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers

Authors: Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals

Comments: Accepted by ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
[31] arXiv:2102.05109 [pdf, other]: Title: CDPAM: Contrastive learning for perceptual audio similarity

Authors: Pranay Manocha, Zeyu Jin, Richard Zhang, Adam Finkelstein

Comments: Dataset, code and sound examples can be found at this https URL

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[32] arXiv:2102.05245 [pdf, other]: Title: Low-Complexity, Real-Time Joint Neural Echo Control and Speech Enhancement Based On PercepNet

Authors: Jean-Marc Valin, Srikanth Tenneti, Karim Helwani, Umut Isik, Arvindh Krishnaswamy

Comments: Accepted for ICASSP 2021, 5 pages

Subjects: Audio and Speech Processing (eess.AS)
[33] arXiv:2102.05259 [pdf, other]: Title: VACE-WPE: Virtual Acoustic Channel Expansion Based On Neural Networks for Weighted Prediction Error-Based Speech Dereverberation

Authors: Joon-Young Yang, Joon-Hyuk Chang

Comments: 13 pages, 12 figures, 10 tables

Subjects: Audio and Speech Processing (eess.AS)
[34] arXiv:2102.05889 [pdf, other]: Title: ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech

Authors: Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee

Journal-ref: IEEE Transactions on Biometrics, Behavior, and Identity Science 2021

Subjects: Audio and Speech Processing (eess.AS); Cryptography and Security (cs.CR); Sound (cs.SD)
[35] arXiv:2102.06200 [pdf, other]: Title: Efficient neural networks for real-time modeling of analog dynamic range compression

Authors: Christian J. Steinmetz, Joshua D. Reiss

Comments: Updated and will appear at 152nd AES Convention (note title change)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[36] arXiv:2102.06237 [pdf, other]: Title: An Investigation of End-to-End Models for Robust Speech Recognition

Authors: Archiki Prasad, Preethi Jyothi, Rajbabu Velmurugan

Comments: Accepted to appear at ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[37] arXiv:2102.06306 [pdf, other]: Title: DEEPF0: End-To-End Fundamental Frequency Estimation for Music and Speech Signals

Authors: Satwinder Singh, Ruili Wang, Yuanhang Qiu

Comments: Accepted in ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD)
[38] arXiv:2102.06322 [pdf, other]: Title: Joint Dereverberation and Separation with Iterative Source Steering

Authors: Taishi Nakashima, Robin Scheibler, Masahito Togami, Nobutaka Ono

Comments: 5 pages, 2 figures, accepted at ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)
[39] arXiv:2102.06332 [pdf, ps, other]: Title: Data Augmentation with Signal Companding for Detection of Logical Access Attacks

Authors: Rohan Kumar Das, Jichen Yang, Haizhou Li

Comments: 5 pages, Accepted for publication in International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2021

Subjects: Audio and Speech Processing (eess.AS)
[40] arXiv:2102.06454 [pdf, other]: Title: Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier

Authors: Guillaume Carbajal, Julius Richter, Timo Gerkmann

Journal-ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[41] arXiv:2102.06610 [pdf, other]: Title: Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

Authors: Jonah Casebeer, Vinjai Vale, Umut Isik, Jean-Marc Valin, Ritwik Giri, Arvindh Krishnaswamy

Comments: 5 pages, 2 figures, ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[42] arXiv:2102.06744 [pdf, ps, other]: Title: Hybrid phonetic-neural model for correction in speech recognition systems

Authors: Rafael Viana-Cámara, Mario Campos-Soberanis, Diego Campos-Sobrino

Comments: 13 pages, 3 figures, presented in COMIA 2020 (this http URL)

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[43] arXiv:2102.06816 [pdf, other]: Title: Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR

Authors: Ruchao Fan, Amber Afshan, Abeer Alwan

Comments: Accepted to ICASSP2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[44] arXiv:2102.07047 [pdf, other]: Title: Adversarial defense for automatic speaker verification by cascaded self-supervised learning models

Authors: Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee

Comments: Accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI)
[45] arXiv:2102.07054 [pdf, other]: Title: Inverted Vocal Tract Variables and Facial Action Units to Quantify Neuromotor Coordination in Schizophrenia

Authors: Yashish Maduwantha H.P.E.R.S, Chris Kitchen, Deanna L. Kelly, Carol Espy-Wilson

Comments: Conference

Subjects: Audio and Speech Processing (eess.AS)
[46] arXiv:2102.07330 [pdf, other]: Title: A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement

Authors: Tyler Vuong, Yangyang Xia, Richard M. Stern

Comments: Accepted IEEE ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS)
[47] arXiv:2102.07390 [pdf, other]: Title: Representation Learning For Speech Recognition Using Feedback Based Relevance Weighting

Authors: Purvi Agrawal, Sriram Ganapathy

Comments: arXiv admin note: substantial text overlap with arXiv:2011.00721, arXiv:2011.02136, arXiv:2001.07067

Journal-ref: IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2021

Subjects: Audio and Speech Processing (eess.AS)
[48] arXiv:2102.07445 [pdf, other]: Title: On training targets for noise-robust voice activity detection

Authors: Sebastian Braun, Ivan Tashev

Journal-ref: 29th European Signal Processing Conference (EUSIPCO), 2021, Dublin, Ireland

Subjects: Audio and Speech Processing (eess.AS)
[49] arXiv:2102.07786 [pdf, other]: Title: PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

Authors: Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Comments: 5 pages, accepted to ICASSP 2021

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Signal Processing (eess.SP)
[50] arXiv:2102.07955 [pdf, other]: Title: Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

Authors: Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

Comments: Submitted to Computer Speech & Language

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

