TY - GEN
T1 - Exploring the Challenges Towards Lifelong Fact Learning
AU - Elhoseiny, Mohamed
AU - Babiloni, Francesca
AU - Aljundi, Rahaf
AU - Rohrbach, Marcus
AU - Paluri, Manohar
AU - Tuytelaars, Tinne
N1 - Generated from Scopus record by KAUST IRTS on 2019-11-20
PY - 2019/1/1
Y1 - 2019/1/1
AB - So far, life-long learning (LLL) has been studied in relatively small-scale and relatively artificial setups. Here, we introduce a new large-scale alternative. What makes the proposed setup more natural and closer to human-like visual systems is threefold: First, we focus on concepts (or facts, as we call them) of varying complexity, ranging from single objects to more complex structures such as objects performing actions and objects interacting with other objects. Second, as in real-world settings, our setup has a long-tail distribution, an aspect which has mostly been ignored in the LLL context. Third, facts across tasks may share structure (e.g., ⟨person, riding, wave⟩ and ⟨dog, riding, wave⟩). Facts can also be semantically related (e.g., “liger” relates to seen categories like “tiger” and “lion”). Given the large number of possible facts, an LLL setup seems a natural choice. To avoid the model size growing over time and to optimally exploit the semantic relations and structure, we combine the LLL setup with a visual semantic embedding instead of discrete class labels. We adapt existing datasets with the properties mentioned above into new benchmarks by dividing them semantically or randomly into disjoint tasks. This leads to two large-scale benchmarks with 906,232 images and 165,150 unique facts, on which we evaluate and analyze state-of-the-art LLL methods.
UR - http://link.springer.com/10.1007/978-3-030-20876-9_5
UR - http://www.scopus.com/inward/record.url?scp=85066922471&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-20876-9_5
DO - 10.1007/978-3-030-20876-9_5
M3 - Conference contribution
SN - 9783030208752
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer
ER -