Elemental: A New Framework for Distributed Memory Dense Matrix Computations

Parallelizing dense matrix computations to distributed memory architectures is a well-studied subject and generally considered to be among the best understood domains of parallel computing. Two packages, developed in the mid 1990s, still enjoy regular use: ScaLAPACK and PLAPACK. With the advent of many-core architectures, which may very well take the shape of distributed memory architectures within a single processor, these packages must be revisited since the traditional MPI-based approaches will likely need to be extended. Thus, this is a good time to review lessons learned since the introduction of these two packages and to propose a simple yet effective alternative. Preliminary performance results show the new solution achieves competitive, if not superior, performance on large clusters.

This software has also been peer reviewed by the journal TOMS.

References in zbMATH (referenced in 13 articles)

  1. Lu, Jianfeng; Yang, Haizhao: Preconditioning orbital minimization method for planewave discretization (2017)
  2. Martinsson, Per-Gunnar; Quintana Ortí, Gregorio; Heavner, Nathan; van de Geijn, Robert: Householder QR factorization with randomization for column pivoting (HQRRP) (2017)
  3. Beliakov, Gleb; Matiyasevich, Yuri: A parallel algorithm for calculation of determinants and minors using arbitrary precision arithmetic (2016)
  4. Nourgaliev, R.; Luo, H.; Weston, B.; Anderson, A.; Schofield, S.; Dunn, T.; Delplanque, J.-P.: Fully-implicit orthogonal reconstructed discontinuous Galerkin method for fluid dynamics with phase change (2016)
  5. Schatz, Martin D.; van de Geijn, Robert A.; Poulson, Jack: Parallel matrix multiplication: a systematic journey (2016)
  6. Banerjee, Amartya S.; Elliott, Ryan S.; James, Richard D.: A spectral scheme for Kohn-Sham density functional theory of clusters (2015)
  7. Vecharynski, Eugene; Yang, Chao; Pask, John E.: A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix (2015)
  8. Fabregat-Traver, Diego; Aulchenko, Yurii S.; Bientinesi, Paolo: Solving sequences of generalized least-squares problems on multi-threaded architectures (2014)
  9. Martinsson, P.G.: A direct solver for variable coefficient elliptic PDEs discretized via a composite spectral collocation method (2013)
  10. Petschow, M.; Peise, E.; Bientinesi, P.: High-performance solvers for dense Hermitian eigenproblems (2013)
  11. Poulson, Jack; Marker, Bryan; van de Geijn, Robert A.; Hammond, Jeff R.; Romero, Nichols A.: Elemental, a new framework for distributed memory dense matrix computations (2013)
  12. Petra, Cosmin G.; Anitescu, Mihai: A preconditioning technique for Schur complement systems arising in stochastic optimization (2012)
  13. Träff, Jesper Larsson: Alternative, uniformly expressive and more scalable interfaces for collective communication in MPI (2012)