TY - GEN
T1 - Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign
AU - Nowatzki, Tony
AU - Ardalani, Newsha
AU - Sankaralingam, Karthikeyan
AU - Weng, Jian
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/11/1
Y1 - 2018/11/1
N2 - Recent programmable accelerators are faster and more energy efficient than general purpose processors, but expose complex hardware/software abstractions for compilers. A key problem is instruction scheduling, which requires sophisticated algorithms for mapping instructions to distributed processing elements, routing of operand dependences, and timing the arrival of operands to enable high throughput. The complex dependences between mapping, communication and timing make prior scheduling techniques insufficient. Optimization-based approaches are too slow, and heuristic-based approaches cannot achieve high quality. Our first insight is that the algorithm can be solved in a series of phases with overlapping responsibilities to reduce complexity. Second, it is possible to combine optimization-based and stochastic-heuristic based search strategies, to exploit the best features of both. This leads to the two primary techniques we explore, phase overlapping and hybridization. In this work we explore the codesign of scheduling algorithms with a challenging-to-schedule programmable accelerator. We show we can improve its area by 35% by trimming its scheduling-friendly structures, using a scheduling algorithm that is 5× faster than the state-of-the-art optimizationbased scheduler, with up to 2× better throughput.
AB - Recent programmable accelerators are faster and more energy efficient than general purpose processors, but expose complex hardware/software abstractions for compilers. A key problem is instruction scheduling, which requires sophisticated algorithms for mapping instructions to distributed processing elements, routing of operand dependences, and timing the arrival of operands to enable high throughput. The complex dependences between mapping, communication and timing make prior scheduling techniques insufficient. Optimization-based approaches are too slow, and heuristic-based approaches cannot achieve high quality. Our first insight is that the algorithm can be solved in a series of phases with overlapping responsibilities to reduce complexity. Second, it is possible to combine optimization-based and stochastic-heuristic based search strategies, to exploit the best features of both. This leads to the two primary techniques we explore, phase overlapping and hybridization. In this work we explore the codesign of scheduling algorithms with a challenging-to-schedule programmable accelerator. We show we can improve its area by 35% by trimming its scheduling-friendly structures, using a scheduling algorithm that is 5× faster than the state-of-the-art optimizationbased scheduler, with up to 2× better throughput.
UR - http://www.scopus.com/inward/record.url?scp=85061532363&partnerID=8YFLogxK
U2 - 10.1145/3243176.3243212
DO - 10.1145/3243176.3243212
M3 - Conference contribution
AN - SCOPUS:85061532363
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
BT - Proceedings - 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018
Y2 - 1 November 2018 through 4 November 2018
ER -