TY - GEN
T1 - Automatic annotation of structured facts in images
AU - Elhoseiny, Mohamed
AU - Cohen, Scott
AU - Chang, Walter
AU - Price, Brian
AU - Elgammal, Ahmed
N1 - Publisher Copyright:
© ACL 2016. All rights reserved.
PY - 2016
Y1 - 2016
N2 - Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions. Example structured facts include attributed objects, actions, interactions, and positional information. The collected annotations are in the form of fact-image pairs (i.e., a structured fact and an image region containing this fact). With a language approach, the proposed method is able to collect hundreds of thousands of visual fact annotations with an accuracy of 83% according to human judgment. Our method automatically collected more than 380,000 visual fact annotations and more than 110,000 unique visual facts from images with captions and localized them in images in less than one day of processing time on standard CPU platforms. We will make the data publicly available.
AB - Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions. Example structured facts include attributed objects, actions, interactions, and positional information. The collected annotations are in the form of fact-image pairs (i.e., a structured fact and an image region containing this fact). With a language approach, the proposed method is able to collect hundreds of thousands of visual fact annotations with an accuracy of 83% according to human judgment. Our method automatically collected more than 380,000 visual fact annotations and more than 110,000 unique visual facts from images with captions and localized them in images in less than one day of processing time on standard CPU platforms. We will make the data publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85090248209&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85090248209
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 1
EP - 9
BT - Proceedings of the 5th Workshop on Vision and Language, VL 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
A2 - Belz, Anya
A2 - Erdem, Erkut
A2 - Mikolajczyk, Krystian
A2 - Pastra, Katerina
PB - Association for Computational Linguistics (ACL)
T2 - 5th Workshop on Vision and Language, VL 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 12 August 2016
ER -