TY - JOUR
T1 - Towards Efficient In-memory Computing Hardware for Quantized Neural Networks: State-of-the-art, Open Challenges and Perspectives
AU - Krestinskaya, Olga
AU - Zhang, Li
AU - Salama, Khaled N.
N1 - KAUST Repository Item: Exported on 2023-07-10
Acknowledged KAUST grant number(s): URF/1/4704-01-01
Acknowledgements: This work was supported by KAUST CRG grant URF/1/4704-01-01.
PY - 2023/7/6
Y1 - 2023/7/6
N2 - The amount of data processed in the cloud, the development of Internet-of-Things (IoT) applications, and growing data privacy concerns force the transition from cloud-based to edge-based processing. Limited energy and computational resources at the edge push the transition from traditional von Neumann architectures to In-memory Computing (IMC), especially for machine learning and neural network applications. Network compression techniques are applied to implement a neural network on limited hardware resources. Quantization is one of the most efficient network compression techniques, reducing the memory footprint, latency, and energy consumption. This paper provides a comprehensive review of IMC-based Quantized Neural Networks (QNN) and links software-based quantization approaches to IMC hardware implementation. Moreover, open challenges, QNN design requirements, recommendations, and perspectives, along with an IMC-based QNN hardware roadmap, are provided.
AB - The amount of data processed in the cloud, the development of Internet-of-Things (IoT) applications, and growing data privacy concerns force the transition from cloud-based to edge-based processing. Limited energy and computational resources at the edge push the transition from traditional von Neumann architectures to In-memory Computing (IMC), especially for machine learning and neural network applications. Network compression techniques are applied to implement a neural network on limited hardware resources. Quantization is one of the most efficient network compression techniques, reducing the memory footprint, latency, and energy consumption. This paper provides a comprehensive review of IMC-based Quantized Neural Networks (QNN) and links software-based quantization approaches to IMC hardware implementation. Moreover, open challenges, QNN design requirements, recommendations, and perspectives, along with an IMC-based QNN hardware roadmap, are provided.
UR - http://hdl.handle.net/10754/692854
UR - https://ieeexplore.ieee.org/document/10174681/
U2 - 10.1109/tnano.2023.3293026
DO - 10.1109/tnano.2023.3293026
M3 - Article
SN - 1536-125X
SP - 1
EP - 10
JO - IEEE Transactions on Nanotechnology
JF - IEEE Transactions on Nanotechnology
ER -