Audio and Speech Processing

Authors and titles for eess.AS in Jun 2023

Total of 377 entries; showing entries 1-25 (25 per page).

[1] arXiv:2306.00160: Title: Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model

Authors: Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann

Comments: Accepted by Interspeech 2023

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[2] arXiv:2306.00203: Title: Speaker-independent Speech Inversion for Estimation of Nasalance

Authors: Yashish M. Siriwardena, Carol Espy-Wilson, Suzanne Boyce, Mark K. Tiede, Liran Oren

Comments: Interspeech 2023

Subjects: Audio and Speech Processing (eess.AS)
[3] arXiv:2306.00331: Title: A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

Authors: Pin-Jui Ku, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Comments: Accepted to Interspeech 2023. Code will be released at this https URL

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD); Signal Processing (eess.SP); Systems and Control (eess.SY)
[4] arXiv:2306.00426: Title: Speaker verification using attentive multi-scale convolutional recurrent network

Authors: Yanxiong Li, Zhongjie Jiang, Wenchang Cao, Qisheng Huang

Comments: 21 pages, 6 figures, 8 tables. Accepted for publication in Applied Soft Computing

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5] arXiv:2306.00452: Title: Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

Authors: Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Comments: 6 pages

Journal-ref: INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[6] arXiv:2306.00481: Title: Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

Authors: Salah Zaiem, Titouan Parcollet, Slim Essid

Comments: 6 pages, INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[7] arXiv:2306.00625: Title: Frame-wise and overlap-robust speaker embeddings for meeting diarization

Authors: Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach

Comments: ICASSP 2023

Subjects: Audio and Speech Processing (eess.AS)
[8] arXiv:2306.00634: Title: A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures

Authors: Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach

Comments: Proceedings of INTERSPEECH

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9] arXiv:2306.00736: Title: Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

Authors: Shashi Kant Gupta, Sushant Hiray, Prashant Kukde

Comments: Accepted by Interspeech 2023, 5 pages, 1 figure, 4 tables

Journal-ref: Proc. INTERSPEECH 2023, 4114-4118

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10] arXiv:2306.00812: Title: Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model

Authors: Xiaohuai Le, Tong Lei, Li Chen, Yiqing Guo, Chao He, Cheng Chen, Xianjun Xia, Hua Gao, Yijian Xiao, Piao Ding, Shenyi Song, Jing Lu

Comments: accepted by Interspeech 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11] arXiv:2306.00952: Title: Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition

Authors: Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[12] arXiv:2306.00996: Title: Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

Authors: Theodoros Kouzelis, Georgios Paraskevopoulos, Athanasios Katsamanis, Vassilis Katsouros

Comments: Interspeech 2023

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[13] arXiv:2306.00998: Title: Towards Selection of Text-to-speech Data to Augment ASR Training

Authors: Shuo Liu, Leda Sarı, Chunyang Wu, Gil Keren, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[14] arXiv:2306.01002: Title: Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform

Authors: Yuan Xie, Jiawei Ren, Ji Xu

Journal-ref: Ocean Engineering 265 (2022): 112626

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[15] arXiv:2306.01100: Title: ALO-VC: Any-to-any Low-latency One-shot Voice Conversion

Authors: Bohan Wang, Damien Ronssin, Milos Cernak

Comments: Accepted to Interspeech 2023. Some audio samples are available at this https URL

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[16] arXiv:2306.01208: Title: Adapting an Unadaptable ASR System

Authors: Rao Ma, Mengjie Qian, Mark J. F. Gales, Kate M. Knill

Comments: Proceedings of INTERSPEECH

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[17] arXiv:2306.01247: Title: Tensor decomposition for minimization of E2E SLU model toward on-device processing

Authors: Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe

Comments: Accepted by INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS)
[18] arXiv:2306.01296: Title: Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation

Authors: Hanbyul Kim, Seunghyun Seo, Lukas Lee, Seolki Baek

Comments: Accepted at INTERSPEECH 2023

Journal-ref: Proc. INTERSPEECH 2023, 1653-1657

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[19] arXiv:2306.01332: Title: Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing

Authors: Alistair Carson, Cassia Valentini-Botinhao, Simon King, Stefan Bilbao

Comments: Accepted for publication in Proc. DAFx23, Copenhagen, Denmark, September 2023

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[20] arXiv:2306.01385: Title: Task-Agnostic Structured Pruning of Speech Representation Models

Authors: Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan

Comments: Accepted by INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[21] arXiv:2306.01411: Title: HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders

Authors: Doyeon Kim, Soo-Whan Chung, Hyewon Han, Youna Ji, Hong-Goo Kang

Comments: Accepted by INTERSPEECH 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22] arXiv:2306.01425: Title: Active Noise Control in The New Century: The Role and Prospect of Signal Processing

Authors: Dongyuan Shi, Bhan Lam, Woon-Seng Gan, Jordan Cheer, Stephen J. Elliott

Comments: Submitted to inter.noise 2023, Chiba, Japan

Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP); Systems and Control (eess.SY)
[23] arXiv:2306.01432: Title: Audio-Visual Speech Enhancement with Score-Based Generative Models

Authors: Julius Richter, Simone Frintrop, Timo Gerkmann

Comments: Submitted to ITG Conference on Speech Communication

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[24] arXiv:2306.01433: Title: Blind Audio Bandwidth Extension: A Diffusion-Based Zero-Shot Approach

Authors: Eloi Moliner, Filip Elvander, Vesa Välimäki

Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[25] arXiv:2306.01522: Title: Auditory Representation Effective for Estimating Vocal Tract Information

Authors: Toshio Irino, Shintaro Doan

Comments: This manuscript is a revised version after acceptance for publication in Proc. APSIPA ASC 2023 on August 25, 2023

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
