Automatic annotation of structured facts in images

Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations

Abstract

Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions. Example structured facts include attributed objects (e.g., <flower, red>), actions (e.g., <baby, smile>), interactions (e.g., <man, walking, dog>), and positional information (e.g., <vase, on, table>). The collected annotations are in the form of fact-image pairs (e.g., <man, walking, dog> and an image region containing this fact). With a language approach, the proposed method is able to collect hundreds of thousands of visual fact annotations with accuracy of 83% according to human judgment. Our method automatically collected more than 380,000 visual fact annotations and more than 110,000 unique visual facts from images with captions and localized them in images in less than one day of processing time on standard CPU platforms. We will make the data publicly available.
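
To make the fact-image pair format concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. The class names `Fact` and `FactAnnotation`, the field names, the image identifiers, and the `(x, y, width, height)` region encoding are all assumptions introduced here; the abstract specifies only that a fact has two or three elements and is paired with an image region.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """Hypothetical container for a structured fact.

    Two-element facts such as <flower, red> leave `obj` unset;
    three-element facts such as <man, walking, dog> fill all fields.
    """
    subject: str
    predicate: str
    obj: Optional[str] = None

    def __str__(self) -> str:
        parts = [self.subject, self.predicate] + ([self.obj] if self.obj else [])
        return "<" + ", ".join(parts) + ">"


@dataclass(frozen=True)
class FactAnnotation:
    """Hypothetical fact-image pair: a fact plus the region that shows it."""
    fact: Fact
    image_id: str                      # assumed identifier scheme
    region: Tuple[int, int, int, int]  # assumed (x, y, width, height) box


# Made-up example annotations, for illustration only.
annotations = [
    FactAnnotation(Fact("man", "walking", "dog"), "img_001", (10, 20, 200, 150)),
    FactAnnotation(Fact("man", "walking", "dog"), "img_002", (5, 5, 120, 90)),
    FactAnnotation(Fact("flower", "red"), "img_003", (40, 60, 80, 80)),
]

# The same fact can be annotated in many images, so the number of unique
# facts is at most the number of annotations.
unique_facts = {a.fact for a in annotations}
print(len(annotations), "annotations,", len(unique_facts), "unique facts")
```

Under this reading, the gap between annotation count and unique-fact count reported in the abstract reflects facts that recur across several images.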

Original language: English (US)
Title of host publication: Proceedings of the 5th Workshop on Vision and Language, VL 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Editors: Anya Belz, Erkut Erdem, Krystian Mikolajczyk, Katerina Pastra
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-9
Number of pages: 9
ISBN (Electronic): 9781945626111
State: Published - 2016
Event: 5th Workshop on Vision and Language, VL 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: Aug 12 2016 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 5th Workshop on Vision and Language, VL 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 08/12/16 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
