Abstract
Analyzing and understanding the performance behavior of parallel applications on parallel computing platforms is a long-standing concern in the High Performance Computing community. When the targeted platforms are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In the context of applications implemented with the Message Passing Interface, two simulation methods have been proposed, on-line simulation and off-line simulation, both with their own drawbacks and advantages. In this work, we present an off-line simulation framework, that is, one that simulates the execution of an application based on event traces obtained from an actual execution. The main novelty of this work, when compared to previously proposed off-line simulators, is that traces that drive the simulation can be acquired on large, distributed, heterogeneous, and non-dedicated platforms. As a result, the scalability of trace acquisition is increased, which is achieved by enforcing that traces contain no time-related information. Moreover, our framework is based on a state-of-the-art scalable, fast, and validated simulation kernel. We introduce the notion of performing off-line simulation from time-independent traces, propose and evaluate several trace acquisition strategies, describe our simulation framework, and assess its quality in terms of trace acquisition scalability, simulation accuracy, and simulation time.
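To make the idea of a time-independent trace concrete, the following is a minimal sketch and not the framework's actual trace format or simulation kernel. It assumes a hypothetical trace in which each rank records only volumes (floating-point operations to compute, bytes to send) and no timestamps, and it assumes a deliberately simple platform model (one compute speed per rank, a single latency and bandwidth for every link, sends that never block the sender). The names `replay`, `flops_per_sec`, `latency`, and `bandwidth` are illustrative only.

```python
from collections import defaultdict, deque


def replay(traces, flops_per_sec, latency, bandwidth):
    """Replay per-rank time-independent traces on a hypothetical platform.

    traces[r] is a list of actions for rank r (hypothetical format):
      ("compute", flops), ("send", dst, nbytes), ("recv", src)
    Returns the simulated completion time of each rank, in seconds.
    """
    clock = [0.0] * len(traces)        # simulated time per rank
    pc = [0] * len(traces)             # index of the next action per rank
    in_flight = defaultdict(deque)     # (src, dst) -> message arrival times

    progressed = True
    while progressed:
        progressed = False
        for rank, trace in enumerate(traces):
            while pc[rank] < len(trace):
                action = trace[pc[rank]]
                kind = action[0]
                if kind == "compute":
                    # Time is derived from the volume and the target speed.
                    clock[rank] += action[1] / flops_per_sec[rank]
                elif kind == "send":
                    _, dst, nbytes = action
                    arrival = clock[rank] + latency + nbytes / bandwidth
                    in_flight[(rank, dst)].append(arrival)
                elif kind == "recv":
                    queue = in_flight[(action[1], rank)]
                    if not queue:
                        break  # matching send not replayed yet; retry later
                    clock[rank] = max(clock[rank], queue.popleft())
                else:
                    raise ValueError(f"unknown action: {kind!r}")
                pc[rank] += 1
                progressed = True

    if any(pc[r] < len(t) for r, t in enumerate(traces)):
        raise RuntimeError("replay stalled: unmatched receive in traces")
    return clock


# One trace, two hypothetical target platforms.
traces = [
    [("compute", 1e9), ("send", 1, 1_000_000)],
    [("recv", 0), ("compute", 2e9)],
]
slow_first = replay(traces, flops_per_sec=[1e9, 2e9], latency=0.5, bandwidth=1e6)
uniform = replay(traces, flops_per_sec=[2e9, 2e9], latency=0.5, bandwidth=1e6)
print(slow_first)  # [1.0, 3.5]
print(uniform)     # [0.5, 3.0]
```

Because the trace holds no timings, the same trace yields different simulated times on each target platform, and it does not matter how fast or how loaded the machines were when the trace was acquired.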
Original language | English (US) |
---|---|
Pages (from-to) | 1145-1168 |
Number of pages | 24 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 27 |
Issue number | 5 |
State | Published - Apr 10 2015 |
Keywords
- MPI
- performance prediction
- simulation
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Computer Networks and Communications
- Computer Science Applications
- Computational Theory and Mathematics