TY - GEN
T1 - Toward better simulation of MPI applications on ethernet/TCP networks
AU - Bédaride, Paul
AU - Degomme, Augustin
AU - Genaud, Stéphane
AU - Legrand, Arnaud
AU - Markomanolis, George S.
AU - Quinson, Martin
AU - Stillwell, Mark
AU - Suter, Frédéric
AU - Videau, Brice
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2014.
PY - 2014
Y1 - 2014
N2 - Simulation and modeling for performance prediction and profiling is essential for developing and maintaining HPC code that is expected to scale for next-generation exascale systems, and correctly modeling network behavior is essential for creating realistic simulations. In this article we describe an implementation of a flow-based hybrid network model that accounts for factors such as network topology and contention, which are commonly ignored by other approaches. We focus on large-scale, Ethernet-connected systems, as these currently compose 37.8% of the TOP500 index, and this share is expected to increase as higher-speed 10 and 100GbE become more available. The European Mont-Blanc project, which studies exascale computing by developing prototype systems with low-power embedded devices, uses Ethernet-based interconnect. Our model is implemented within SMPI, an open-source MPI implementation that connects real applications to the SimGrid simulation framework. SMPI provides implementations of collective communications based on current versions of both OpenMPI and MPICH. SMPI and SimGrid also provide methods for easing the simulation of large-scale systems, including shadow execution, memory folding, and support for both online and offline (i.e., post-mortem) simulation. We validate our proposed model by comparing traces produced by SMPI with those from real world experiments, as well as with those obtained using other established network models. Our study shows that SMPI has a consistently better predictive power than classical LogP-based models for a wide range of scenarios including both established HPC benchmarks and real applications.
AB - Simulation and modeling for performance prediction and profiling is essential for developing and maintaining HPC code that is expected to scale for next-generation exascale systems, and correctly modeling network behavior is essential for creating realistic simulations. In this article we describe an implementation of a flow-based hybrid network model that accounts for factors such as network topology and contention, which are commonly ignored by other approaches. We focus on large-scale, Ethernet-connected systems, as these currently compose 37.8% of the TOP500 index, and this share is expected to increase as higher-speed 10 and 100GbE become more available. The European Mont-Blanc project, which studies exascale computing by developing prototype systems with low-power embedded devices, uses Ethernet-based interconnect. Our model is implemented within SMPI, an open-source MPI implementation that connects real applications to the SimGrid simulation framework. SMPI provides implementations of collective communications based on current versions of both OpenMPI and MPICH. SMPI and SimGrid also provide methods for easing the simulation of large-scale systems, including shadow execution, memory folding, and support for both online and offline (i.e., post-mortem) simulation. We validate our proposed model by comparing traces produced by SMPI with those from real world experiments, as well as with those obtained using other established network models. Our study shows that SMPI has a consistently better predictive power than classical LogP-based models for a wide range of scenarios including both established HPC benchmarks and real applications.
UR - http://www.scopus.com/inward/record.url?scp=84908686748&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-10214-6_8
DO - 10.1007/978-3-319-10214-6_8
M3 - Conference contribution
AN - SCOPUS:84908686748
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 158
EP - 181
BT - High Performance Computing Systems
A2 - Jarvis, Stephen A.
A2 - Wright, Steven A.
A2 - Hammond, Simon D.
PB - Springer Verlag
T2 - 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computing Systems, PMBS 2013
Y2 - 18 November 2013
ER -