Joint object recognition and pose estimation using a nonlinear view-invariant latent generative model

Amr Bakry, Tarek Elgaaly, Mohamed Elhoseiny, Ahmed Elgammal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Object recognition and pose estimation are two fundamental problems in computer vision. Recognizing objects and their poses (viewpoints) is a critical component of many vision and robotic systems. Multiple viewpoints of an object lie on an intrinsic low-dimensional manifold in the input space (i.e., the descriptor space). Different objects captured from the same set of viewpoints have manifolds with a common topology. In this paper, we exploit this common topology between object manifolds by learning a low-dimensional latent space that non-linearly maps between a common unified manifold and the object manifold in the input space. The latent space is computed using a supervised embedding approach and is used to jointly infer the category and pose of objects. We empirically validate our model using multiple inference approaches and testing on multiple challenging datasets. We compare our results with the state of the art and report improved category recognition and pose estimation accuracy.
Original language: English (US)
Title of host publication: 2016 IEEE Winter Conference on Applications of Computer Vision, WACV 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Print): 9781509006410
State: Published - May 23 2016
Externally published: Yes
