Audio and Speech Processing

Authors and titles for recent submissions, skipping first 30

[ total of 40 entries: 1-5 | ... | 16-20 | 21-25 | 26-30 | 31-35 | 36-40 ]

Fri, 17 May 2024 (continued, showing last 2 of 13 entries)

[31]  arXiv:2405.09589 (cross-list from cs.LG)
Title: Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[32]  arXiv:2405.09570 (cross-list from eess.SP)
Title: FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time
Comments: 8-page main paper and 4-page supplementary material
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Thu, 16 May 2024 (showing first 3 of 8 entries)

[33]  arXiv:2405.09142
Title: Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
Comments: Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[34]  arXiv:2405.09470 (cross-list from cs.SD)
Title: Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer
Comments: Accepted to SecTL (AsiaCCS Workshop) 2024
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[35]  arXiv:2405.09266 (cross-list from cs.CV)
Title: Dance Any Beat: Blending Beats with Visuals in Dance Video Generation
Comments: 11 pages, 6 figures, demo page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)