Audio and Speech Processing

Authors and titles for eess.AS in Jun 2023

Showing entries 1-25 of 377.
[1]  arXiv:2306.00160 [pdf, other]
Title: Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model
Comments: Accepted by Interspeech 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[2]  arXiv:2306.00203 [pdf, ps, other]
Title: Speaker-independent Speech Inversion for Estimation of Nasalance
Comments: Interspeech 2023
Subjects: Audio and Speech Processing (eess.AS)
[3]  arXiv:2306.00331 [pdf, other]
Title: A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models
Comments: Accepted to Interspeech 2023. Code will be released at this https URL
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD); Signal Processing (eess.SP); Systems and Control (eess.SY)
[4]  arXiv:2306.00426 [pdf, ps, other]
Title: Speaker verification using attentive multi-scale convolutional recurrent network
Comments: 21 pages, 6 figures, 8 tables. Accepted for publication in Applied Soft Computing
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[5]  arXiv:2306.00452 [pdf, ps, other]
Title: Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Comments: 6 pages
Journal-ref: INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[6]  arXiv:2306.00481 [pdf, other]
Title: Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
Comments: 6 pages, INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[7]  arXiv:2306.00625 [pdf, other]
Title: Frame-wise and overlap-robust speaker embeddings for meeting diarization
Comments: ICASSP 2023
Subjects: Audio and Speech Processing (eess.AS)
[8]  arXiv:2306.00634 [pdf, other]
Title: A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures
Comments: Proceedings of INTERSPEECH
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[9]  arXiv:2306.00736 [pdf, other]
Title: Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech
Comments: Accepted by Interspeech 2023, 5 pages, 1 figure, 4 tables
Journal-ref: Proc. INTERSPEECH 2023, 4114-4118
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10]  arXiv:2306.00812 [pdf, other]
Title: Harmonic enhancement using learnable comb filter for light-weight full-band speech enhancement model
Comments: accepted by Interspeech 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11]  arXiv:2306.00952 [pdf, other]
Title: Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[12]  arXiv:2306.00996 [pdf, other]
Title: Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling
Comments: Interspeech 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[13]  arXiv:2306.00998 [pdf, other]
Title: Towards Selection of Text-to-speech Data to Augment ASR Training
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[14]  arXiv:2306.01002 [pdf, other]
Title: Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform
Journal-ref: Ocean Engineering 265 (2022): 112626
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[15]  arXiv:2306.01100 [pdf, other]
Title: ALO-VC: Any-to-any Low-latency One-shot Voice Conversion
Comments: Accepted to Interspeech 2023. Some audio samples are available at this https URL
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[16]  arXiv:2306.01208 [pdf, other]
Title: Adapting an Unadaptable ASR System
Comments: Proceedings of INTERSPEECH
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[17]  arXiv:2306.01247 [pdf, other]
Title: Tensor decomposition for minimization of E2E SLU model toward on-device processing
Comments: Accepted by INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS)
[18]  arXiv:2306.01296 [pdf, other]
Title: Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation
Comments: Accepted at INTERSPEECH 2023
Journal-ref: Proc. INTERSPEECH 2023, 1653-1657
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[19]  arXiv:2306.01332 [pdf, other]
Title: Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing
Comments: Accepted for publication in Proc. DAFx23, Copenhagen, Denmark, September 2023
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[20]  arXiv:2306.01385 [pdf, ps, other]
Title: Task-Agnostic Structured Pruning of Speech Representation Models
Comments: Accepted by INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[21]  arXiv:2306.01411 [pdf, other]
Title: HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders
Comments: Accepted by INTERSPEECH 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22]  arXiv:2306.01425 [pdf, other]
Title: Active Noise Control in The New Century: The Role and Prospect of Signal Processing
Comments: Submitted to inter.noise 2023, Chiba, Japan
Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP); Systems and Control (eess.SY)
[23]  arXiv:2306.01432 [pdf, other]
Title: Audio-Visual Speech Enhancement with Score-Based Generative Models
Comments: Submitted to ITG Conference on Speech Communication
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[24]  arXiv:2306.01433 [pdf, other]
Title: Blind Audio Bandwidth Extension: A Diffusion-Based Zero-Shot Approach
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[25]  arXiv:2306.01522 [pdf, ps, other]
Title: Auditory Representation Effective for Estimating Vocal Tract Information
Comments: This manuscript is a revised version after acceptance for publication in Proc. APSIPA ASC 2023 on August 25, 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)