A Parallel Butterfly Algorithm

Jack Poulson, Laurent Demanet, Nicholas Maxwell, Lexing Ying

Research output: Contribution to journalArticlepeer-review

19 Scopus citations

Abstract

The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
Original languageEnglish (US)
Pages (from-to)C49-C65
Number of pages1
JournalSIAM Journal on Scientific Computing
Volume36
Issue number1
DOIs
StatePublished - Feb 4 2014
Externally publishedYes

Fingerprint

Dive into the research topics of 'A Parallel Butterfly Algorithm'. Together they form a unique fingerprint.

Cite this