Multimedia

Authors and titles for cs.MM in Mar 2024

[ total of 114 entries; showing entries 1-25 ]

[1] arXiv:2403.00752 [pdf, other]: Title: An Experimental Study of Low-Latency Video Streaming over 5G

Authors: Imran Khan, Tuyen X. Tran, Matti Hiltunen, Theodore Karagioules, Dimitrios Koutsonikolas

Comments: 6 Pages

Subjects: Multimedia (cs.MM); Performance (cs.PF)
[2] arXiv:2403.01087 [pdf, other]: Title: Towards Accurate Lip-to-Speech Synthesis in-the-Wild

Authors: Sindhu Hegde, Rudrabha Mukhopadhyay, C.V. Jawahar, Vinay Namboodiri

Comments: 8 pages of content, 1 page of references and 4 figures

Journal-ref: In Proceedings of the 31st ACM International Conference on Multimedia, 2023

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[3] arXiv:2403.02693 [pdf, other]: Title: Optimizing Mobile-Friendly Viewport Prediction for Live 360-Degree Video Streaming

Authors: Lei Zhang, Tao Long, Weizhen Xu, Laizhong Cui, Jiangchuan Liu

Comments: 14 pages

Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
[4] arXiv:2403.02905 [pdf, other]: Title: MMoFusion: Multi-modal Co-Speech Motion Generation with Diffusion Model

Authors: Sen Wang, Jiangning Zhang, Weijian Cao, Xiaobin Hu, Moran Li, Xiaozhong Ji, Xin Tan, Mengtian Li, Zhifeng Xie, Chengjie Wang, Lizhuang Ma

Subjects: Multimedia (cs.MM)
[5] arXiv:2403.03170 [pdf, other]: Title: SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection

Authors: Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee

Comments: To appear in CVPR 2024

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[6] arXiv:2403.05060 [pdf, other]: Title: Multimodal Infusion Tuning for Large Models

Authors: Hao Sun, Yu Song, Jihong Hu, Xinyao Yu, Jiaqing Liu, Yen-Wei Chen, Lanfen Lin

Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC)
[7] arXiv:2403.05427 [pdf, other]: Title: Reply with Sticker: New Dataset and Model for Sticker Retrieval

Authors: Bin Liang, Bingbing Wang, Zhixin Bai, Qiwei Lang, Mingwei Sun, Kaiheng Hou, Kam-Fai Wong, Ruifeng Xu

Subjects: Multimedia (cs.MM)
[8] arXiv:2403.05428 [pdf, other]: Title: Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition

Authors: Bingbing Wang, Bin Liang, Chun-Mei Feng, Wangmeng Zuo, Zhixin Bai, Shijue Huang, Kam-Fai Wong, Ruifeng Xu

Subjects: Multimedia (cs.MM)
[9] arXiv:2403.05628 [pdf, other]: Title: AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking

Authors: Saeed Ranjbar Alvar, Mohammad Akbari, David (Ming Xuan) Yue, Lingyang Chu, Yong Zhang

Subjects: Multimedia (cs.MM); Cryptography and Security (cs.CR)
[10] arXiv:2403.05834 [pdf, other]: Title: Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information

Authors: Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng

Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11] arXiv:2403.05851 [pdf, other]: Title: Interest-Aware Joint Caching, Computing, and Communication Optimization for Mobile VR Delivery in MEC Networks

Authors: Baojie Fu, Tong Tang, Dapeng Wu, Ruyan Wang

Subjects: Multimedia (cs.MM); Emerging Technologies (cs.ET)
[12] arXiv:2403.06660 [pdf, other]: Title: FashionReGen: LLM-Empowered Fashion Report Generation

Authors: Yujuan Ding, Yunshan Ma, Wenqi Fan, Yige Yao, Tat-Seng Chua, Qing Li

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
[13] arXiv:2403.10406 [pdf, other]: Title: Deep Bi-directional Attention Network for Image Super-Resolution Quality Assessment

Authors: Yixiao Li, Xiaoyuan Yang, Jun Fu, Guanghui Yue, Wei Zhou

Comments: 7 pages, 3 figures, published to 2024 IEEE International Conference on Multimedia and Expo (ICME)

Subjects: Multimedia (cs.MM)
[14] arXiv:2403.10943 [pdf, other]: Title: MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations

Authors: Hanlei Zhang, Xin Wang, Hua Xu, Qianrui Zhou, Kai Gao, Jianhua Su, Jinyue Zhao, Wenrui Li, Yanting Chen

Comments: Published in ICLR 2024; The abstract is slightly modified due to the length limitation

Subjects: Multimedia (cs.MM); Computation and Language (cs.CL)
[15] arXiv:2403.10976 [pdf, other]: Title: Quality-Aware Dynamic Resolution Adaptation Framework for Adaptive Video Streaming

Authors: Amritha Premkumar, Prajit T Rajendran, Vignesh V Menon, Adam Wieckowski, Benjamin Bross, Detlev Marpe

Comments: ACM MMSys '24 | Open-Source Software and Dataset. arXiv admin note: substantial text overlap with arXiv:2401.15346

Subjects: Multimedia (cs.MM)
[16] arXiv:2403.11241 [pdf, other]: Title: Fidelity-preserving Learning-Based Image Compression: Loss Function and Subjective Evaluation Methodology

Authors: Shima Mohammadi, Yaojun Wu, João Ascenso

Comments: 5 pages, 6 figures. In 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP)

Subjects: Multimedia (cs.MM)
[17] arXiv:2403.11700 [pdf, other]: Title: Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing

Authors: Juan Zhang, Jiahao Chen, Cheng Wang, Zhiwang Yu, Tangquan Qi, Can Liu, Di Wu

Subjects: Multimedia (cs.MM)
[18] arXiv:2403.11757 [pdf, other]: Title: Efficient Feature Extraction and Late Fusion Strategy for Audiovisual Emotional Mimicry Intensity Estimation

Authors: Jun Yu, Wangyuan Zhu, Jichao Zhu

Subjects: Multimedia (cs.MM); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19] arXiv:2403.12053 [src]: Title: PiGW: A Plug-in Generative Watermarking Framework

Authors: Rui Ma, Mengxi Guo, Li Yuming, Hengyuan Zhang, Cong Ma, Yuan Li, Xiaodong Xie, Shanghang Zhang

Comments: Improve experimental content

Subjects: Multimedia (cs.MM)
[20] arXiv:2403.12618 [pdf, other]: Title: NewsCaption: Named-Entity aware Captioning for Out-of-Context Media

Authors: Anurag Singh, Shivangi Aneja

Subjects: Multimedia (cs.MM); Social and Information Networks (cs.SI)
[21] arXiv:2403.12667 [pdf, other]: Title: ICE: Interactive 3D Game Character Editing via Dialogue

Authors: Haoqian Wu, Yunjie Wu, Zhipeng Hu, Lincheng Li, Weijie Chen, Rui Zhao, Changjie Fan, Xin Yu

Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC)
[22] arXiv:2403.15226 [pdf, other]: Title: Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models

Authors: Qiong Wu, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji

Subjects: Multimedia (cs.MM); Computation and Language (cs.CL)
[23] arXiv:2403.15256 [pdf, other]: Title: Experimental Studies of Metaverse Streaming

Authors: Haopeng Wang, Roberto Martinez-Velazquez, Haiwei Dong, Abdulmotaleb El Saddik

Comments: Accepted by IEEE Consumer Electronics Magazine

Subjects: Multimedia (cs.MM); Networking and Internet Architecture (cs.NI)
[24] arXiv:2403.16951 [pdf, other]: Title: Network-Assisted Delivery of Adaptive Video Streaming Services through CDN, SDN, and MEC

Authors: Reza Farahani

Comments: PhD thesis defended on 22.08.2023

Subjects: Multimedia (cs.MM); Networking and Internet Architecture (cs.NI)
[25] arXiv:2403.16985 [pdf, other]: Title: Towards Low-Latency and Energy-Efficient Hybrid P2P-CDN Live Video Streaming

Authors: Reza Farahani, Christian Timmerer, Hermann Hellwagner

Comments: 6 pages, 3 figures, Special Issue on Sustainable Multimedia Communications and Services, IEEE MMTC Communications

Subjects: Multimedia (cs.MM)
