We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CV

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computer Vision and Pattern Recognition

Title: Distilling Privileged Multimodal Information for Expression Recognition using Optimal Transport

Abstract: Deep learning models for multimodal expression recognition have reached remarkable performance in controlled laboratory environments because of their ability to learn complementary and redundant semantic information. However, these models struggle in the wild, mainly because of the unavailability and quality of modalities used for training. In practice, only a subset of the training-time modalities may be available at test time. Learning with privileged information enables models to exploit data from additional modalities that are only available during training. State-of-the-art knowledge distillation (KD) methods have been proposed to distill information from multiple teacher models (each trained on a modality) to a common student model. These privileged KD methods typically utilize point-to-point matching, yet have no explicit mechanism to capture the structural information in the teacher representation space formed by introducing the privileged modality. Experiments were performed on two challenging problems - pain estimation on the Biovid dataset (ordinal classification) and arousal-valance prediction on the Affwild2 dataset (regression). Results show that our proposed method can outperform state-of-the-art privileged KD methods on these problems. The diversity among modalities and fusion architectures indicates that PKDOT is modality- and model-agnostic.
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2401.15489 [cs.CV]
  (or arXiv:2401.15489v3 [cs.CV] for this version)

Submission history

From: Muhammad Haseeb Aslam [view email]
[v1] Sat, 27 Jan 2024 19:44:15 GMT (20902kb,D)
[v2] Thu, 25 Apr 2024 20:17:08 GMT (23155kb,D)
[v3] Mon, 29 Apr 2024 01:01:35 GMT (23155kb,D)

Link back to: arXiv, form interface, contact.