Social image parsing by cross-modal data refinement

Zhiwu Lu, Xin Gao, Songfang Huang, Liwei Wang, Ji Rong Wen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations


This paper presents a cross-modal data refinement algorithm for social image parsing, i.e., segmenting all the objects within a social image and then identifying their categories. Unlike traditional fully supervised image parsing, which takes pixel-level labels as strong supervisory information, our social image parsing is initially provided with the noisy tags of images (i.e., image-level labels), which are shared by social users. By oversegmenting each image into multiple regions, we formulate social image parsing as a cross-modal data refinement problem over a large set of regions, where the initial labels of each region are inferred from image-level labels. Furthermore, we develop an efficient algorithm to solve this cross-modal data refinement problem. Experimental results on several benchmark datasets show the effectiveness of our algorithm. More notably, our algorithm offers an alternative and natural way to address the challenging problem of image parsing, since image-level labels are much easier to obtain than pixel-level labels.
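The abstract does not say how initial region labels are inferred from image-level tags. As a minimal sketch of one plausible initialization, the Python snippet below assumes each oversegmented region simply inherits every tag of its parent image. The function name `init_region_labels`, its arguments, and the inheritance rule itself are illustrative assumptions, not the authors' method, and the refinement step is not shown.

```python
from typing import Dict, List, Tuple


def init_region_labels(
    image_tags: Dict[str, List[str]],
    regions_per_image: Dict[str, int],
    vocabulary: List[str],
) -> Tuple[List[str], List[List[float]]]:
    """Build a noisy region-by-category label matrix from image-level tags.

    Assumption (not from the paper): every region inherits all tags of the
    image it was segmented from. Returns (owners, Y), where owners[r] is the
    image id of region r and Y[r][c] is 1.0 if that image carries category c.
    """
    column = {tag: j for j, tag in enumerate(vocabulary)}
    owners: List[str] = []
    labels: List[List[float]] = []
    for image_id, n_regions in regions_per_image.items():
        # One indicator row per image, shared by all of its regions.
        row = [0.0] * len(vocabulary)
        for tag in image_tags.get(image_id, []):
            if tag in column:  # ignore tags outside the category vocabulary
                row[column[tag]] = 1.0
        for _ in range(n_regions):
            owners.append(image_id)
            labels.append(list(row))  # copy so rows can be refined independently
    return owners, labels


# Hypothetical toy input: two tagged images with 2 and 1 regions.
owners, Y = init_region_labels(
    {"img1": ["sky", "cow"], "img2": ["grass"]},
    {"img1": 2, "img2": 1},
    ["cow", "grass", "sky"],
)
```

Under this assumption the resulting matrix is deliberately noisy (a "sky" region is also marked "cow"), which is the kind of label noise a refinement step would then need to correct.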

Original language: English (US)
Title of host publication: IJCAI 2015 - Proceedings of the 24th International Joint Conference on Artificial Intelligence
Editors: Michael Wooldridge, Qiang Yang
Publisher: International Joint Conferences on Artificial Intelligence
Number of pages: 7
ISBN (Electronic): 9781577357384
State: Published - 2015
Event: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015 - Buenos Aires, Argentina
Duration: Jul 25 2015 → Jul 31 2015

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823


Other: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015
City: Buenos Aires

ASJC Scopus subject areas

  • Artificial Intelligence

