TY - GEN
T1 - Composing Algorithmic Skeletons to Express High-Performance Scientific Applications
AU - Zandifar, Mani
AU - AbdulJabbar, Mustafa Abdulmajeed
AU - Majidi, Alireza
AU - Keyes, David E.
AU - Amato, Nancy M.
AU - Rauchwerger, Lawrence
N1 - KAUST Repository Item: Exported on 2021-08-12
PY - 2015
Y1 - 2015
AB - Algorithmic skeletons are high-level representations of parallel programs that hide the details of the underlying parallelism from the program specification. Skeletons are defined as higher-order functions that can be composed to build larger programs. Many skeleton frameworks provide efficient implementations of stand-alone skeletons such as map, reduce, and zip for both shared-memory systems and small clusters. In these frameworks, however, expressing complex skeletons built by composing fundamental skeletons either requires complete reimplementation or suffers from limited scalability because of the global synchronization it requires. In the STAPL Skeleton Framework, we represent skeletons as parametric data flow graphs and describe skeleton composition as point-to-point dependencies between their data flow graph representations. As a result, composed skeletons need neither reimplementation nor global synchronization. In this work, we describe the process of translating skeleton-based programs to data flow graphs and define rules for skeleton composition. To demonstrate the expressivity and ease of use of our framework, we present skeleton-based representations of the NAS EP, IS, and FT benchmarks. To demonstrate its reusability and applicability to real-world applications, we present an N-body application that uses the hierarchical Fast Multipole Method (FMM). Our results show that expressivity can be achieved without loss of performance, even in complex real-world applications.
UR - http://hdl.handle.net/10754/670576
UR - https://dl.acm.org/doi/10.1145/2751205.2751241
UR - http://www.scopus.com/inward/record.url?scp=84957608189&partnerID=8YFLogxK
U2 - 10.1145/2751205.2751241
DO - 10.1145/2751205.2751241
M3 - Conference contribution
SN - 9781450335591
SP - 415
EP - 424
BT - Proceedings of the 29th ACM on International Conference on Supercomputing
PB - ACM
ER -