TY - GEN
T1 - Improving differentiable neural computers through memory masking, de-allocation, and link distribution sharpness control
AU - Csordás, Róbert
AU - Schmidhuber, Jürgen
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 2019/1/1
AB - The Differentiable Neural Computer (DNC) can learn algorithmic and question-answering tasks. An analysis of its internal activation patterns reveals three problems. Most importantly, the lack of key-value separation makes the address distribution resulting from content-based look-up noisy and flat, since the value influences the score calculation, although only the key should. Second, DNC's de-allocation of memory results in aliasing, which is a problem for content-based look-up. Third, chaining memory reads with the temporal linkage matrix exponentially degrades the quality of the address distribution. Our proposed fixes for these problems yield improved performance on arithmetic tasks and also improve the mean error rate on the bAbI question-answering dataset by 43%.
UR - http://www.scopus.com/inward/record.url?scp=85083953722&partnerID=8YFLogxK
M3 - Conference contribution
BT - 7th International Conference on Learning Representations, ICLR 2019
PB - International Conference on Learning Representations, ICLR
ER -