Multimedia

Authors and titles for cs.MM in Mar 2024, skipping first 100

[ total of 114 entries: 1-25 | 26-50 | 51-75 | 76-100 | 101-114 ]
[101]  arXiv:2403.18714 (cross-list from cs.CV) [pdf, other]
Title: Bringing Textual Prompt to AI-Generated Image Quality Assessment
Comments: 6 pages, 3 figures, accepted by ICME2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[102]  arXiv:2403.18715 (cross-list from cs.CV) [pdf, other]
Title: Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
[103]  arXiv:2403.18821 (cross-list from cs.SD) [pdf, other]
Title: Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Comments: Accepted to CVPR 2024. Project site: this https URL
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[104]  arXiv:2403.19456 (cross-list from cs.CV) [pdf, other]
Title: Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)
[105]  arXiv:2403.19651 (cross-list from cs.CV) [pdf, other]
Title: MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Comments: Work in progress
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Multimedia (cs.MM)
[106]  arXiv:2403.19723 (cross-list from cs.CL) [pdf, other]
Title: HGT: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB); Multimedia (cs.MM)
[107]  arXiv:2403.19763 (cross-list from cs.SD) [pdf, other]
Title: Creating Aesthetic Sonifications on the Web with SIREN
Comments: 7 pages, 1 figure, 5 listings, submitted to the Web Audio Conference 2024
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[108]  arXiv:2403.04804 (cross-list from eess.AS) [pdf, other]
Title: AttentionStitch: How Attention Solves the Speech Editing Problem
Comments: Accepted at the Machine Learning for Audio workshop at NeurIPS 2023
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
[109]  arXiv:2403.08505 (cross-list from eess.IV) [pdf, other]
Title: Content-aware Masked Image Modeling Transformer for Stereo Image Compression
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[110]  arXiv:2403.08551 (cross-list from eess.IV) [pdf, other]
Title: GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[111]  arXiv:2403.10936 (cross-list from eess.IV) [pdf, ps, other]
Title: Channel-wise Feature Decorrelation for Enhanced Learned Image Compression
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[112]  arXiv:2403.11155 (cross-list from eess.IV) [pdf, other]
Title: Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[113]  arXiv:2403.15336 (cross-list from eess.AS) [pdf, other]
Title: Dialogue Understandability: Why are we streaming movies with subtitles?
Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM)
[114]  arXiv:2403.16143 (cross-list from eess.IV) [pdf, other]
Title: CFAT: Unleashing Triangular Windows for Image Super-resolution
Comments: Accepted to CVPR 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)