Multimodal Representation and Retrieval [MRR 2024]

Xinliang Zhu, Arnab Dhua, Douglas Gray, I. Zeki Yalniz, Tan Yu, Mohamed Elhoseiny, Bryan Plummer

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Multimodal data is available in many applications like e-commerce production listings, social media posts and short videos. However, existing algorithms dealing with those types of data still focus on uni-modal representation learning by vision-language alignment and cross-modal retrieval. In this workshop, we target to bring a new retrieval problem where both queries and documents are multimodal. With the popularity of vision language modeling, large language models (LLMs), retrieval augmented generation (RAG), and multimodal LLM, we see a lot of new opportunities for multimodal representation and retrieval tasks. This event will be a comprehensive half-day workshop focusing on the subject of multimodal representation and retrieval. The agenda includes keynote speeches, oral presentations, and an interactive panel discussion.

Original languageEnglish (US)
Title of host publicationSIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
PublisherAssociation for Computing Machinery, Inc
Pages3047-3050
Number of pages4
ISBN (Electronic)9798400704314
DOIs
StatePublished - Jul 10 2024
Event47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 - Washington, United States
Duration: Jul 14 2024Jul 18 2024

Publication series

NameSIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
Country/TerritoryUnited States
CityWashington
Period07/14/2407/18/24

Keywords

  • large language model
  • multimodal large language model
  • multimodal representation
  • multimodal retrieval
  • vision language modeling

ASJC Scopus subject areas

  • Information Systems
  • Software

Fingerprint

Dive into the research topics of 'Multimodal Representation and Retrieval [MRR 2024]'. Together they form a unique fingerprint.

Cite this