TY - JOUR
T1 - Scaling big data neuroscience: From interactive analytics to HPC platforms
AU - Petruzza, Steve
AU - Venkat, Aniketh
AU - Gyulassy, Attila
AU - Scorzelli, Giorgio
AU - Federer, Frederick
AU - Angelucci, Alessandra
AU - Pascucci, Valerio
AU - Bremer, Peer Timo
N1 - KAUST Repository Item: Exported on 2022-06-10
Acknowledgements: This work is supported in part by NSF CGV Award 1314896, NSF IIP Award 1602127, NSF Office of Advanced Cyberinfrastructure (OAC) Award 1649923, DOE/SciDAC DE-SC0007446, CCMSC DE-NA0002375, and PIPER ER26142 DE-SC0010498. This material is based upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number(s) DE-NA0002375. For computer time, this research used the resources of the Supercomputing Laboratory at King Abdullah University of Science and Technology (KAUST) in Thuwal, Saudi Arabia.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2018/1/1
Y1 - 2018/1/1
AB - High-throughput microscopy techniques generate an ever-growing amount of data that is fundamental to gathering scientifically, biologically, and medically relevant insights. This growing amount of data dramatically affects the scientific workflow at every step. Visualization and analysis tasks are performed with limited interactivity, and the implementations often require HPC skills and lack portability, usability, and maintainability. In this work we explore a software infrastructure that simplifies end-to-end visualization and analysis of massive data. Data management and movement are performed using a hierarchical streaming data access layer, which enables interactive exploration of remote data. The analysis tasks are expressed and performed using a library for rapid prototyping of algorithms based on an Embedded Domain Specific Language, which enables portable deployment in both desktop and HPC environments. Finally, we use a scalable runtime system (Charm++) to automate the mapping of the analysis algorithm to the available computational resources, reducing the complexity of developing scalable algorithms. We present large-scale experiments on tera-scale microscopy data that execute some of the most common neuroscience use cases: data filtering, visualization using two different image compositing algorithms, and image registration.
UR - http://hdl.handle.net/10754/678858
UR - https://www.medra.org/servlet/aliasResolver?alias=iospressISBN&isbn=978-1-61499-881-5&spage=53&doi=10.3233/978-1-61499-882-2-53
UR - http://www.scopus.com/inward/record.url?scp=85066909595&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-882-2-53
DO - 10.3233/978-1-61499-882-2-53
M3 - Article
SN - 1879-808X
VL - 33
SP - 53
EP - 68
JO - Advances in Parallel Computing
JF - Advances in Parallel Computing
ER -