
LOCA
 differential equations, and to run on distributed memory parallel machines. The approach in LOCA...

PFFT
 fast Fourier transforms (FFTs) on massively parallel, distributed memory architectures based on the message passing ... established transpose FFT algorithms, we propose a parallel FFT framework that is based ... calculate pruned FFTs more efficiently on distributed memory architectures. For example, we provide performance measurements...

ALPS
 porting a serial code onto a parallel, distributed memory machine. Major changes in release...

XGBoost
 XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible ... Gradient Boosting framework. XGBoost provides a parallel tree boosting (also known as GBDT, GBM) that ... same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond...

Pluto
 Chombo library which provides a distributed infrastructure for parallel calculations over block-structured, adaptively refined...

ParaSCIP
 SCIP, which realizes a parallelization on a distributed memory computing environment. ParaSCIP uses SCIP solvers ... branching tree) locally. This makes the parallelization development independent of the SCIP development. Thus, ParaSCIP...

FLICAOVAP
 research purpose. The architecture also enables distributed parallel calculations, multidisciplinary couplings (with the neutronics codes...

SCALEA
 SCALEA: A performance analysis tool for distributed and parallel programs. In this paper we present ... measurement, analysis, and visualization tool for parallel and distributed programs that supports postmortem...

PSP
 preconditioner for heterogeneous 3D Helmholtz equations. A parallelization of a sweeping preconditioner for three-dimensional ... counts are reported for high-frequency problems distributed over thousands of cores. Two open-source ... with this paper: parallel sweeping preconditioner (PSP) and the underlying distributed multifrontal solver, clique...

clique
 preconditioner for heterogeneous 3D Helmholtz equations. A parallelization of a sweeping preconditioner for three-dimensional ... counts are reported for high-frequency problems distributed over thousands of cores. Two open-source ... with this paper: parallel sweeping preconditioner (PSP) and the underlying distributed multifrontal solver, clique...

MiBench
 SPEC benchmarks including instruction distribution, memory behavior, and available parallelism. The embedded benchmarks, called MiBench...

Elemental
 Framework for Distributed Memory Dense Matrix Computations. Parallelizing dense matrix computations to distributed memory architectures ... among the best understood domains of parallel computing. Two packages, developed in the mid 1990s ... very well take the shape of distributed memory architectures within a single processor, these packages...

PBGL
 Generic C++ Library for High-Performance Parallel and Distributed Graph Computation. The Parallel BGL builds ... data structures, algorithms, and syntax for distributed, parallel computation that the BGL offers for sequential...

PARDISO
 systems of equations on shared-memory and distributed-memory multiprocessors. The solver has been ... indefinite, Hermitian. LU with complete pivoting. Parallel on SMPs and clusters of SMPs. Automatic combination...

P_ARPACK
 portable large scale eigenvalue package for distributed memory parallel architectures. P_ARPACK is a parallel ... parallel implementation of ARPACK is presented which is portable across a wide range of distributed...

pARMS
 scientific and engineering applications. The most common parallel preconditioners used for sparse linear systems adapt ... more general framework of “distributed sparse linear systems”. The parallel Algebraic Recursive Multilevel Solver (pARMS...

AmgX
 large that they require large scale distributed parallel computing to obtain the solution of interest ... which provides drop-in GPU acceleration of distributed algebraic multigrid (AMG) and preconditioned iterative methods ... available multigrid methods or simpler preconditioners. The parallelism in the aggregation scheme exploits parallel graph...

IBAMR
 IBAMR: An adaptive and distributed-memory parallel implementation of the immersed boundary method. IBAMR: IBAMR ... distributed-memory parallel implementation of the immersed boundary (IB) method with support for Cartesian grid ... adaptive mesh refinement (AMR). Support for distributed-memory parallelism is via MPI, the Message Passing...

TAU
 pace with the growing complexity of parallel and distributed systems depends on robust performance frameworks ... presents the TAU (Tuning and Analysis Utilities) parallel performance system and describes how it addresses...

Dryad
 Dryad: distributed data-parallel programs from sequential building blocks. Dryad is a general-purpose distributed ... execution engine for coarse-grain data-parallel applications. A Dryad application combines computational “vertices” with ... difficult problems of creating a large distributed, concurrent application: scheduling the use of computers...