Abstract
Optimizing the performance of scientific applications usually requires in-depth knowledge of both the hardware and the software. A performance tuning mechanism is proposed that automatically tunes OpenACC parameters to adapt to the execution environment on a given system. A historic-learning-based methodology is proposed to prune the parameter search space, making the auto-tuning process more efficient. This approach is applied to tune the OpenACC gang and vector clauses for a better mapping of the compute kernels onto the underlying architecture. Our experiments show a significant performance improvement over the default compiler parameters and a drastic reduction in tuning time compared to a brute-force search.
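The idea of pruning a gang/vector search space with past results can be illustrated with a small sketch. It is not the method from the paper, whose details are not given in this abstract. It assumes that "historic learning" means reusing the best settings found for previously tuned kernels, and it replaces real kernel timings with a hypothetical cost function. The candidate grids, the `radius` parameter, and all function names are illustrative assumptions.

```python
from itertools import product

# Hypothetical candidate values for the OpenACC num_gangs and
# vector_length settings (assumed grids, not taken from the paper).
GANGS = [32, 64, 128, 256, 512, 1024]
VECTORS = [32, 64, 128, 256, 512, 1024]


def brute_force(measure):
    """Evaluate every (gang, vector) pair; return (best_config, evaluations)."""
    space = list(product(GANGS, VECTORS))
    return min(space, key=measure), len(space)


def pruned_space(history, radius=1):
    """Keep only configurations within `radius` grid steps of a past winner."""
    keep = set()
    for g, v in history:
        gi, vi = GANGS.index(g), VECTORS.index(v)
        for a in range(max(0, gi - radius), min(len(GANGS), gi + radius + 1)):
            for b in range(max(0, vi - radius), min(len(VECTORS), vi + radius + 1)):
                keep.add((GANGS[a], VECTORS[b]))
    return sorted(keep)


def history_guided(measure, history, radius=1):
    """Search only the pruned space; return (best_config, evaluations)."""
    space = pruned_space(history, radius)
    return min(space, key=measure), len(space)


def synthetic_cost(config):
    """Stand-in for timing a kernel run; its minimum is at (256, 128)."""
    g, v = config
    return 1.0 + abs(g - 256) / 256 + abs(v - 128) / 128


# Assumed best settings recorded for two earlier kernels.
history = [(128, 128), (256, 256)]
print(brute_force(synthetic_cost))
print(history_guided(synthetic_cost, history))
```

With this synthetic cost, both searches select the same configuration, but the history-guided search evaluates 14 candidates instead of 36. In a real tuner, `measure` would compile and time the kernel with the chosen clause values, and the pruned search could miss the optimum when past winners are a poor guide for the new kernel.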
Original language | English (US) |
---|---|
Title of host publication | High Performance Computing for Computational Science -- VECPAR 2014 |
Publisher | Springer Nature |
Pages | 224-235 |
Number of pages | 12 |
ISBN (Print) | 9783319173528 |
State | Published - Apr 18 2015 |