Algorithm 967
Algorithm 967: a distributed-memory fast multipole method for volume potentials. The solution of a constant-coefficient elliptic partial differential equation (PDE) can be computed using an integral transform: A convolution with the fundamental solution of the PDE, also known as a volume potential. We present a fast multipole method (FMM) for computing volume potentials and use them to construct spatially adaptive solvers for the Poisson, Stokes, and low-frequency Helmholtz problems. Conventional $N$-body methods apply to discrete particle interactions. With volume potentials, one replaces the sums with volume integrals. Particle $N$-body methods can be used to accelerate such integrals. but it is more efficient to develop a special FMM. In this article, we discuss the efficient implementation of such an FMM. We use high-order piecewise Chebyshev polynomials and an octree data structure to represent the input and output fields and enable spectrally accurate approximation of the near-field and the Kernel Independent FMM (KIFMM) for the far-field approximation. For distributed-memory parallelism, we use space-filling curves, locally essential trees, and a hypercube-like communication scheme developed previously in our group. We present new near and far interaction traversals that optimize cache usage. Also, unlike particle $N$-body codes, we need a 2:1 balanced tree to allow for precomputations. We present a fast scheme for 2:1 balancing. Finally, we use vectorization, including the AVX instruction set on the Intel Sandy Bridge architecture to get better than 50% of peak floating-point performance. We use task parallelism to employ the Xeon Phi on the Stampede platform at the Texas Advanced Computing Center (TACC). We achieve about 600{sc gflop}/s of double-precision performance on a single node. Our largest run on Stampede took 3.5s on 16K cores for a problem with 18{sc e}+9 unknowns for a highly nonuniform particle distribution (corresponding to an effective resolution exceeding 3{sc e}+23 unknowns since we used 23 levels in our octree).
Keywords for this software
References in zbMATH (referenced in 6 articles , 1 standard article )
Showing results 1 to 6 of 6.
Sorted by year (- Wang, Lei; Krasny, Robert; Tlupova, Svetlana: A kernel-independent treecode based on barycentric Lagrange interpolation (2020)
- Abduljabbar, Mustafa; Al Farhan, Mohammed; Al-Harthi, Noha; Chen, Rui; Yokota, Rio; Bagci, Hakan; Keyes, David: Extreme scale FMM-accelerated boundary integral equation solver for wave scattering (2019)
- Wang, Lei; Tlupova, Svetlana; Krasny, Robert: A treecode algorithm for 3D stokeslets and stresslets (2019)
- Wang, Jun; Greengard, Leslie: An adaptive fast Gauss transform in two dimensions (2018)
- Yan, Wen; Shelley, Michael: Flexibly imposing periodicity in kernel independent FMM: a multipole-to-local operator approach (2018)
- Malhotra, Dhairya; Biros, George: Algorithm 967: A distributed-memory fast multipole method for volume potentials (2016)