Audio and Speech Processing

Authors and titles for eess.AS in Dec 2023

[ total of 233 entries: 1-25 | 26-50 | 51-75 | 76-100 | ... | 226-233 ]
[1]  arXiv:2312.00174 [pdf, other]
Title: Compression of end-to-end non-autoregressive image-to-speech system for low-resourced devices
Comments: 5 pages, 2 figures, 2 tables, presented at the 15th ITG Conference on Speech Communications, September 2023, Aachen
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[2]  arXiv:2312.00231 [pdf, other]
Title: Learning domain-invariant classifiers for infant cry sounds
Subjects: Audio and Speech Processing (eess.AS)
[3]  arXiv:2312.00249 [pdf, other]
Title: Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities
Subjects: Audio and Speech Processing (eess.AS)
[4]  arXiv:2312.00698 [pdf, other]
Title: SPIRE-SIES: A Spontaneous Indian English Speech Corpus
Comments: 6 pages, 7 plots, 3 tables, Accepted at O-COCOSDA 2023
Subjects: Audio and Speech Processing (eess.AS)
[5]  arXiv:2312.01744 [pdf, other]
Title: SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement
Comments: Preprint. Accepted to IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2023
Subjects: Audio and Speech Processing (eess.AS)
[6]  arXiv:2312.01808 [pdf, ps, other]
Title: Head Orientation Estimation with Distributed Microphones Using Speech Radiation Patterns
Comments: 6 pages, submitted to 57th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 2023
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[7]  arXiv:2312.02581 [pdf, ps, other]
Title: Auralization based on multi-perspective ambisonic room impulse responses
Comments: 18 pages, published in Acta Acustica (Open Access), datasets are available via this https URL and this https URL
Journal-ref: Acta Acustica, Volume 4, Number 6, Article Number 25, 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[8]  arXiv:2312.02683 [pdf, other]
Title: Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based Sampler
Comments: Accepted to ICASSP 2024
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[9]  arXiv:2312.03034 [pdf, other]
Title: Distributed Speech Dereverberation Using Weighted Prediction Error
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[10]  arXiv:2312.03129 [pdf, other]
Title: Leveraging Laryngograph Data for Robust Voicing Detection in Speech
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[11]  arXiv:2312.03324 [pdf, ps, other]
Title: Lightweight Speaker Verification Using Transformation Module with Feature Partition and Fusion
Comments: 12 pages, 5 figures, 6 tables; accepted for publication in IEEE-ACM TASLP
Subjects: Audio and Speech Processing (eess.AS)
[12]  arXiv:2312.03620 [pdf, other]
Title: Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification
Comments: Accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing. Open Access: this https URL
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[13]  arXiv:2312.03668 [pdf, other]
Title: An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition
Comments: 6 pages, 2 figures, 3 tables, The model is available at this https URL
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[14]  arXiv:2312.03694 [pdf, other]
Title: Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers
Comments: The code is available at: this https URL
Subjects: Audio and Speech Processing (eess.AS)
[15]  arXiv:2312.04131 [pdf, other]
Title: Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16]  arXiv:2312.04324 [pdf, other]
Title: DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[17]  arXiv:2312.04370 [pdf, other]
Title: Investigating the Design Space of Diffusion Models for Speech Enhancement
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[18]  arXiv:2312.05173 [pdf, other]
Title: Binaural multichannel blind speaker separation with a causal low-latency and low-complexity approach
Comments: Accepted for publication at IEEE ICASSP 2024 OJSP track
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[19]  arXiv:2312.06065 [pdf, other]
Title: EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings
Comments: Submitted to IEEE Signal Processing Letters
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[20]  arXiv:2312.06270 [pdf, other]
Title: Testing Speech Emotion Recognition Machine Learning Models
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[21]  arXiv:2312.06907 [pdf, other]
Title: w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
Comments: 17 pages, 5 figures
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[22]  arXiv:2312.07513 [pdf, other]
Title: NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[23]  arXiv:2312.08089 [pdf, other]
Title: Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier
Comments: Accepted to ICASSP 2024. 5 pages, 1 figure
Subjects: Audio and Speech Processing (eess.AS)
[24]  arXiv:2312.08132 [pdf, ps, other]
Title: Ultra Low Complexity Deep Learning Based Noise Suppression
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Signal Processing (eess.SP)
[25]  arXiv:2312.08496 [pdf, ps, other]
Title: Metrological support of acoustic measuring installations mid-frequency devices
Comments: 9 pages, 1 figure
Journal-ref: Environmental control systems. 2023. Issue. 2 (40). pp. 117-126
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)