TY - GEN
T1 - Hardware-Friendly Lightweight Convolutional Neural Network Derivation at the Edge
AU - Zhang, Li
AU - Eltawil, Ahmed M.
AU - Salama, Khaled N.
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Convolutional neural networks (CNNs) have demonstrated remarkable capability and scalability in a variety of vision-related tasks. Due to privacy and latency constraints, CNNs are in some scenarios deployed on-site, where power supply, computation power, and memory capacity are limited. These constraints hinder traditional training or modification of CNN models, which typically involves network-scale backpropagation of gradients. In this work, we propose a framework that enables the derivation of lightweight models from the original model at the edge using only hardware-friendly operations. In the proposed framework, all models are binary quantized and the gradients are obtained by layer-wise decision boundary matching. Hence, the whole flow can be executed with bit-wise and fixed-point arithmetic operations, without network-scale gradient backpropagation. The derived model serves as a viable alternative to the original in scenarios where the accuracy requirements are less stringent, delivering improved power and memory efficiency. We validate the framework on digit recognition tasks, achieving higher accuracy than naively deploying the same lightweight model. Furthermore, an FPGA demonstration of our framework achieved a throughput of 2.2 TOPS, underscoring its practical applicability.
AB - Convolutional neural networks (CNNs) have demonstrated remarkable capability and scalability in a variety of vision-related tasks. Due to privacy and latency constraints, CNNs are in some scenarios deployed on-site, where power supply, computation power, and memory capacity are limited. These constraints hinder traditional training or modification of CNN models, which typically involves network-scale backpropagation of gradients. In this work, we propose a framework that enables the derivation of lightweight models from the original model at the edge using only hardware-friendly operations. In the proposed framework, all models are binary quantized and the gradients are obtained by layer-wise decision boundary matching. Hence, the whole flow can be executed with bit-wise and fixed-point arithmetic operations, without network-scale gradient backpropagation. The derived model serves as a viable alternative to the original in scenarios where the accuracy requirements are less stringent, delivering improved power and memory efficiency. We validate the framework on digit recognition tasks, achieving higher accuracy than naively deploying the same lightweight model. Furthermore, an FPGA demonstration of our framework achieved a throughput of 2.2 TOPS, underscoring its practical applicability.
KW - binary neural network (BNN)
KW - edge computing
KW - FPGA accelerator
UR - http://www.scopus.com/inward/record.url?scp=85199902510&partnerID=8YFLogxK
U2 - 10.1109/AICAS59952.2024.10595961
DO - 10.1109/AICAS59952.2024.10595961
M3 - Conference contribution
AN - SCOPUS:85199902510
T3 - 2024 IEEE 6th International Conference on AI Circuits and Systems, AICAS 2024 - Proceedings
SP - 11
EP - 15
BT - 2024 IEEE 6th International Conference on AI Circuits and Systems, AICAS 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE International Conference on AI Circuits and Systems, AICAS 2024
Y2 - 22 April 2024 through 25 April 2024
ER -