Audio and Speech Processing

Authors and titles for eess.AS in Mar 2024, skipping first 200

[ total of 213 entries: 1-25 | ... | 126-150 | 151-175 | 176-200 | 201-213 ]

[201] arXiv:2403.18811 (cross-list from cs.CV): Title: Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment

Authors: Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change Loy

Comments: ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[202] arXiv:2403.18821 (cross-list from cs.SD): Title: Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

Authors: Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard

Comments: Accepted to CVPR 2024. Project site: this https URL

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[203] arXiv:2403.18843 (cross-list from cs.CV): Title: JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition

Authors: Chang Sun, Hong Yang, Bo Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[204] arXiv:2403.19002 (cross-list from cs.MM): Title: Robust Active Speaker Detection in Noisy Environments

Authors: Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

Comments: 15 pages, 5 figures

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[205] arXiv:2403.19224 (cross-list from cs.SD): Title: Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition

Authors: Siyuan Shen, Yu Gao, Feng Liu, Hanyang Wang, Aimin Zhou

Comments: Accepted by 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[206] arXiv:2403.19441 (cross-list from cs.SD): Title: A Novel Stochastic Transformer-based Approach for Post-Traumatic Stress Disorder Detection using Audio Recording of Clinical Interviews

Authors: Mamadou Dia, Ghazaleh Khodabandelou, Alice Othmani

Journal-ref: 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (2023) 700-705

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[207] arXiv:2403.19509 (cross-list from cs.CL): Title: Phonetic Segmentation of the UCLA Phonetics Lab Archive

Authors: Eleanor Chodroff, Blaž Pažon, Annie Baker, Steven Moran

Comments: Accepted at LREC-COLING 2024

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[208] arXiv:2403.19634 (cross-list from cs.SD): Title: Asymmetric and trial-dependent modeling: the contribution of LIA to SdSV Challenge Task 2

Authors: Pierre-Michel Bousquet, Mickael Rouvier

Comments: LIA system description for the Short Duration Speaker Verification (SdSv) challenge 2020 Task 2

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[209] arXiv:2403.19638 (cross-list from cs.CV): Title: Siamese Vision Transformers are Scalable Audio-visual Learners

Authors: Yan-Bo Lin, Gedas Bertasius

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[210] arXiv:2403.19763 (cross-list from cs.SD): Title: Creating Aesthetic Sonifications on the Web with SIREN

Authors: Tristan Peng, Hongchan Choi, Jonathan Berger

Comments: 7 pages, 1 figure, 5 listings, submitted to the Web Audio Conference 2024

Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[211] arXiv:2403.20130 (cross-list from cs.SD): Title: Sound event localization and classification using WASN in Outdoor Environment

Authors: Dongzhe Zhang, Jianfeng Chen, Jisheng Bai, Mou Wang

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[212] arXiv:2403.20202 (cross-list from cs.SD): Title: Voice Signal Processing for Machine Learning. The Case of Speaker Isolation

Authors: Radan Ganchev

Comments: MSc thesis. For associated source code, see this https URL

Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[213] arXiv:2403.20289 (cross-list from cs.CL): Title: Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation

Authors: Fangxu Yu, Junjie Guo, Zhen Wu, Xinyu Dai

Comments: Accepted by Findings of NAACL 2024

Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
