Deep functional multiple index models with an application to SER

Matthieu Saumard, Abir El Haj, Thibault Napoleon

(Submitted on 26 Mar 2024)

Abstract: Speech Emotion Recognition (SER) plays a crucial role in advancing human-computer interaction and speech processing capabilities. We introduce a novel deep learning architecture designed specifically for the functional data model known as the multiple-index functional model. Our key innovation lies in integrating adaptive basis layers and an automated data transformation search within the deep learning framework. In simulations, the new model performs well. The architecture allows us to extract features tailored for chunk-level SER from Mel Frequency Cepstral Coefficients (MFCCs). We demonstrate the effectiveness of our approach on the benchmark IEMOCAP database, where it performs well compared to existing methods.
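The abstract does not write out the multiple-index functional model. In its usual form in the functional data literature, the response depends on a curve X only through a few indices <X, beta_j>, each the integral of X(t) * beta_j(t) over t, which are then combined by a link function g. The sketch below illustrates that general form only and is not the authors' architecture: the index functions are fixed by hand rather than learned by adaptive basis layers, and the names trapezoid, functional_indices, multiple_index_model, and the link g are hypothetical.

```python
import math

def trapezoid(values, grid):
    """Approximate the integral of sampled values over grid (trapezoidal rule)."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )

def functional_indices(x, betas, grid):
    """Return the indices <x, beta_j> = integral of x(t) * beta_j(t) dt."""
    return [
        trapezoid([xi * bi for xi, bi in zip(x, beta)], grid)
        for beta in betas
    ]

def multiple_index_model(x, betas, grid, link):
    """Evaluate g(<x, beta_1>, ..., <x, beta_d>) for a sampled curve x."""
    return link(functional_indices(x, betas, grid))

# Hypothetical example: one curve on [0, 1] and two hand-picked index functions.
grid = [i / 100 for i in range(101)]
x = [math.sin(2 * math.pi * t) for t in grid]
betas = [
    [math.sin(2 * math.pi * t) for t in grid],  # same shape as x
    [1.0 for _ in grid],                        # constant function
]

def link(u):
    """Illustrative link g; an assumption, not taken from the paper."""
    return math.tanh(u[0]) + u[1] ** 2

indices = functional_indices(x, betas, grid)
y = multiple_index_model(x, betas, grid, link)
```

In an SER setting, x would be one MFCC coefficient sampled over the frames of a speech chunk, and the index functions and link would be estimated from data rather than chosen by hand.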

Comments:	5 pages, 1 figure
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS); Applications (stat.AP)
Cite as:	arXiv:2403.17562 [cs.SD]
	(or arXiv:2403.17562v1 [cs.SD] for this version)

Submission history

From: Matthieu Saumard
[v1] Tue, 26 Mar 2024 10:10:56 GMT (176kb,D)
