Audio and Speech Processing

Authors and titles for recent submissions, skipping first 30

Total of 108 entries; showing entries 31-40.

Thu, 6 Jun 2024 (continued, showing 10 of 41 entries)

[31] arXiv:2406.02649 [pdf, other]: Title: Keyword-Guided Adaptation of Automatic Speech Recognition

Authors: Aviv Shamsian, Aviv Navon, Neta Glazer, Gill Hetz, Joseph Keshet

Comments: Accepted to InterSpeech 2024

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[32] arXiv:2406.02608 [pdf, other]: Title: PPINtonus: Early Detection of Parkinson's Disease Using Deep-Learning Tonal Analysis

Authors: Varun Reddy

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[33] arXiv:2406.02572 [pdf, other]: Title: Selfsupervised learning for pathological speech detection

Authors: Shakeel Ahmad Sheikh

Comments: in Intersection of Book Chapter in Machine Learning and Computational Social Sciences CRC (in progress) 2024

Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[34] arXiv:2406.02569 [pdf, other]: Title: Cluster-to-Predict Affect Contours from Speech

Authors: Gökhan Kuşçu, Engin Erzin

Comments: 8 pages, 3 figures

Subjects: Audio and Speech Processing (eess.AS); Human-Computer Interaction (cs.HC)
[35] arXiv:2406.02566 [pdf, other]: Title: Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition

Authors: Ognjen Kundacina, Vladimir Vincan, Dragisa Miskovic

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[36] arXiv:2406.02563 [pdf, other]: Title: A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system

Authors: Sunil Kumar Kopparapu, Ashish Panda

Comments: 5 pages, 4 figures

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[37] arXiv:2406.02562 [pdf, other]: Title: Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices

Authors: Gwantae Kim, Bokyeung Lee, Donghyeon Kim, Hanseok Ko

Comments: Table 2 is revised

Journal-ref: ICASSP 2024 Workshop (HSCMA 2024) paper

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[38] arXiv:2406.02561 [pdf, ps, other]: Title: Breaking Walls: Pioneering Automatic Speech Recognition for Central Kurdish: End-to-End Transformer Paradigm

Authors: Abdulhady Abas Abdullah, Hadi Veisi, Tarik Rashid

Subjects: Audio and Speech Processing (eess.AS)
[39] arXiv:2406.02560 [pdf, other]: Title: Less Peaky and More Accurate CTC Forced Alignment by Label Priors

Authors: Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur

Comments: Accepted by ICASSP 2024. Github repo: this https URL

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[40] arXiv:2406.02555 [pdf, ps, other]: Title: PhoWhisper: Automatic Speech Recognition for Vietnamese

Authors: Thanh-Thien Le, Linh The Nguyen, Dat Quoc Nguyen

Comments: Accepted to ICLR 2024 Tiny Papers Track

Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
