HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers.

HPL is a software package that solves a (random) dense linear system in double-precision (64-bit) arithmetic on distributed-memory computers. It can thus be regarded as a portable, freely available implementation of the High Performance Computing Linpack Benchmark.

The algorithm used by HPL can be summarized by the following keywords:
- Two-dimensional block-cyclic data distribution
- Right-looking variant of the LU factorization with row partial pivoting featuring multiple look-ahead depths
- Recursive panel factorization with pivot search and column broadcast combined
- Various virtual panel broadcast topologies
- Bandwidth-reducing swap-broadcast algorithm
- Backward substitution with look-ahead of depth 1

The HPL package provides a testing and timing program to quantify the accuracy of the obtained solution as well as the time it took to compute it. The best performance achievable by this software on your system depends on a large variety of factors. Nonetheless, with some restrictive assumptions on the interconnection network, the algorithm described here and its attached implementation are scalable in the sense that their parallel efficiency remains constant with respect to the per-processor memory usage.

The HPL software package requires that an implementation of the Message Passing Interface (MPI, 1.1 compliant) be available on your system. An implementation of either the Basic Linear Algebra Subprograms (BLAS) or the Vector Signal Image Processing Library (VSIPL) is also needed. Machine-specific as well as generic implementations of MPI, the BLAS and VSIPL are available for a large variety of systems.
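
As an illustration of the two-dimensional block-cyclic data distribution named among the keywords, the following Python sketch maps global matrix indices to a P x Q process grid. It is not taken from HPL (which is written in C); the function names `block_cyclic_owner`, `local_index` and `entries_per_process` are hypothetical, and the sketch assumes zero-based indices, a square block size `nb`, and a distribution that starts at process (0, 0).

```python
from collections import Counter


def block_cyclic_owner(i, j, nb, P, Q):
    """Return the (process row, process column) that owns global entry (i, j).

    Assumption: blocks of size nb x nb are dealt out cyclically over a
    P x Q process grid, starting at process (0, 0).
    """
    return (i // nb) % P, (j // nb) % Q


def local_index(g, nb, nprocs):
    """Map a global index along one dimension to the local index on its owner."""
    block = g // nb                  # global block that contains g
    local_block = block // nprocs    # position of that block on the owning process
    return local_block * nb + g % nb


def entries_per_process(n, nb, P, Q):
    """Count how many entries of an n x n matrix each process stores."""
    return Counter(
        block_cyclic_owner(i, j, nb, P, Q)
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    # 8 x 8 matrix, 2 x 2 blocks, 2 x 2 process grid (illustrative values only)
    for proc, count in sorted(entries_per_process(8, 2, 2, 2).items()):
        print(proc, count)
```

Because consecutive blocks go to consecutive processes in both dimensions, every process keeps a share of the trailing submatrix as the factorization proceeds, which is the usual motivation for this layout in dense LU factorization.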
References in zbMATH (referenced in 10 articles)
- Fasi, Massimiliano; Higham, Nicholas J.: Generating extreme-scale matrices with specified singular values or condition number (2021)
- Fasi, Massimiliano; Higham, Nicholas J.: Matrices with tunable infinity-norm condition number and no need for pivoting in LU factorization (2021)
- Hokpunna, Arpiruk; Misaka, Takashi; Obayashi, Shigeru; Wongwises, Somchai; Manhart, Michael: Finite surface discretization for incompressible Navier-Stokes equations and coupled conservation laws (2020)
- Iakymchuk, Roman; Barreda, Maria; Wiesenberger, Matthias; Aliaga, José I.; Quintana-Ortí, Enrique S.: Reproducibility strategies for parallel preconditioned conjugate gradient (2020)
- Sukkari, Dalal; Ltaief, Hatem; Esposito, Aniello; Keyes, David: A QDWH-based SVD software framework on distributed-memory manycore systems (2019)
- Chen, Cheng; Fang, Jianbin; Tang, Tao; Yang, Canqun: LU factorization on heterogeneous systems: an energy-efficient approach towards high performance (2017)
- Smelyanskiy, Mikhail; Sawaya, Nicolas P. D.; Aspuru-Guzik, Alan: qHiPSTER: The Quantum High Performance Software Testing Environment (2016) arXiv
- Liao, Xiang-Ke; Yung, Can-Qun; Tang, Tao; Yi, Hui-Zhan; Wang, Feng; Wu, Qiang; Xue, Jingling: OpenMC: towards simplifying programming for TianHe supercomputers (2014) ioport
- Sun, Ning-Hui; Xing, Jing; Huo, Zhi-Gang; Tan, Guang-Ming; Xiong, Jin; Li, Bo; Ma, Can: Dawning Nebulae: a PetaFLOPS supercomputer with a heterogeneous structure (2011) ioport
- Wang, Feng; Yang, Can-Qun; Du, Yun-Fei; Chen, Juan; Yi, Hui-Zhan; Xu, Wei-Xia: Optimizing Linpack benchmark on GPU-accelerated petascale supercomputer (2011) ioport