TY - JOUR
T1 - A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels
AU - Rosen, Paul
N1 - Acknowledgements: We thank Kristi Potter for her feedback. This work was supported by DOE NETL and KAUST award KUS-C1-016-04.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2013/7/1
Y1 - 2013/7/1
N2 - We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. © 2013 The Author(s) Computer Graphics Forum © 2013 The Eurographics Association and Blackwell Publishing Ltd.
UR - http://hdl.handle.net/10754/597436
UR - http://doi.wiley.com/10.1111/cgf.12103
UR - http://www.scopus.com/inward/record.url?scp=84879744833&partnerID=8YFLogxK
U2 - 10.1111/cgf.12103
DO - 10.1111/cgf.12103
M3 - Article
SN - 0167-7055
VL - 32
SP - 161
EP - 170
JO - Computer Graphics Forum
JF - Computer Graphics Forum
IS - 3pt2
ER -