TY - GEN
T1 - A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
AU - Zhu, Yizhe
AU - Elhoseiny, Mohamed
AU - Liu, Bingchen
AU - Peng, Xi
AU - Elgammal, Ahmed
N1 - Generated from Scopus record by KAUST IRTS on 2019-11-20
PY - 2018/12/14
Y1 - 2018/12/14
N2 - Most existing zero-shot learning methods consider the problem as a visual-semantic embedding one. Given the demonstrated capability of Generative Adversarial Networks (GANs) to generate images, we instead leverage GANs to imagine unseen categories from text descriptions and hence recognize novel classes with no examples being seen. Specifically, we propose a simple yet effective generative model that takes as input noisy text descriptions about an unseen class (e.g. Wikipedia articles) and generates synthesized visual features for this class. With added pseudo data, zero-shot learning is naturally converted to a traditional classification problem. Additionally, to preserve the inter-class discrimination of the generated features, a visual pivot regularization is proposed as an explicit supervision. Unlike previous methods using complex engineered regularizers, our approach can suppress the noise well without additional regularization. Empirically, we show that our method consistently outperforms the state of the art on the largest available benchmarks on text-based zero-shot learning.
AB - Most existing zero-shot learning methods consider the problem as a visual-semantic embedding one. Given the demonstrated capability of Generative Adversarial Networks (GANs) to generate images, we instead leverage GANs to imagine unseen categories from text descriptions and hence recognize novel classes with no examples being seen. Specifically, we propose a simple yet effective generative model that takes as input noisy text descriptions about an unseen class (e.g. Wikipedia articles) and generates synthesized visual features for this class. With added pseudo data, zero-shot learning is naturally converted to a traditional classification problem. Additionally, to preserve the inter-class discrimination of the generated features, a visual pivot regularization is proposed as an explicit supervision. Unlike previous methods using complex engineered regularizers, our approach can suppress the noise well without additional regularization. Empirically, we show that our method consistently outperforms the state of the art on the largest available benchmarks on text-based zero-shot learning.
UR - https://ieeexplore.ieee.org/document/8578209/
UR - http://www.scopus.com/inward/record.url?scp=85055663093&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2018.00111
DO - 10.1109/CVPR.2018.00111
M3 - Conference contribution
SN - 9781538664209
BT - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
PB - IEEE Computer Society
ER -