TY - GEN
T1 - Memory in Motion
T2 - 2024 IEEE Biomedical Circuits and Systems Conference, BioCAS 2024
AU - Boretti, Chiara
AU - Bich, Philippe
AU - Prono, Luciano
AU - Pareschi, Fabio
AU - Rovatti, Riccardo
AU - Setti, Gianluca
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Augmented and Virtual Reality (AR/VR) technologies are gaining popularity for training healthcare professionals, with precise eye tracking playing a crucial role in enhancing performance. However, these systems need to be both low-latency and low-power to operate in real-time scenarios on resource-constrained devices. Event-based cameras can be employed to address these requirements, as they offer energy-efficient, high temporal resolution data with minimal battery drain. However, their sparse data format necessitates specialized processing algorithms. In this work, we propose a data preprocessing technique that improves the performance of non-recurrent Deep Neural Networks (DNNs) for pupil position estimation. With this approach, we integrate multiple time surfaces of events over time, with a leakage factor, so that the input data is enriched with information from past events. Additionally, in order to better distinguish between recent and old information, we generate multiple memory channels characterized by different leakage/forgetting rates. These memory channels are fed to well-known non-recurrent neural estimators to predict the position of the pupil. As an example, when using time surfaces alone as input to a MobileNet-V3L model that tracks the pupil in DVS recordings, we achieve a P10 accuracy (Euclidean error lower than ten pixels) of 85.40%, whereas by using memory channels we achieve a P10 accuracy of 94.37% with negligible time overhead.
AB - Augmented and Virtual Reality (AR/VR) technologies are gaining popularity for training healthcare professionals, with precise eye tracking playing a crucial role in enhancing performance. However, these systems need to be both low-latency and low-power to operate in real-time scenarios on resource-constrained devices. Event-based cameras can be employed to address these requirements, as they offer energy-efficient, high temporal resolution data with minimal battery drain. However, their sparse data format necessitates specialized processing algorithms. In this work, we propose a data preprocessing technique that improves the performance of non-recurrent Deep Neural Networks (DNNs) for pupil position estimation. With this approach, we integrate multiple time surfaces of events over time, with a leakage factor, so that the input data is enriched with information from past events. Additionally, in order to better distinguish between recent and old information, we generate multiple memory channels characterized by different leakage/forgetting rates. These memory channels are fed to well-known non-recurrent neural estimators to predict the position of the pupil. As an example, when using time surfaces alone as input to a MobileNet-V3L model that tracks the pupil in DVS recordings, we achieve a P10 accuracy (Euclidean error lower than ten pixels) of 85.40%, whereas by using memory channels we achieve a P10 accuracy of 94.37% with negligible time overhead.
UR - http://www.scopus.com/inward/record.url?scp=85216251923&partnerID=8YFLogxK
U2 - 10.1109/BioCAS61083.2024.10798345
DO - 10.1109/BioCAS61083.2024.10798345
M3 - Conference contribution
AN - SCOPUS:85216251923
T3 - 2024 IEEE Biomedical Circuits and Systems Conference, BioCAS 2024
BT - 2024 IEEE Biomedical Circuits and Systems Conference, BioCAS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 October 2024 through 26 October 2024
ER -