Accelerating scientific computations with mixed precision algorithms. On modern architectures, 32-bit operations are often at least twice as fast as 64-bit operations. By combining 32-bit and 64-bit floating-point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here applies not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
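One widely used way to combine the two precisions is iterative refinement: factor and solve in 32-bit arithmetic, then compute residuals and apply corrections in 64-bit arithmetic. The sketch below illustrates that idea for a small dense system and is not taken from the paper. It assumes a well-conditioned matrix, and the helper names (`f32`, `lu_factor32`, `lu_solve32`, `refine`) are hypothetical. 32-bit arithmetic is emulated in pure Python by rounding every result through `struct`, so the code shows the numerical behaviour only, not the speedup.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor32(A):
    """LU factorization with partial pivoting, each operation rounded to 32-bit."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    piv = list(range(n))  # piv[i] = original index of the row now at position i
    for k in range(n):
        # Choose the entry of largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular in 32-bit arithmetic")
        LU[k], LU[p] = LU[p], LU[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, piv


def lu_solve32(LU, piv, b):
    """Solve with the stored factors, each operation rounded to 32-bit."""
    n = len(LU)
    x = [f32(b[p]) for p in piv]  # apply the row permutation to b
    for i in range(n):  # forward substitution (L has a unit diagonal)
        for j in range(i):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
        x[i] = f32(x[i] / LU[i][i])
    return x


def refine(A, b, tol=1e-14, max_iter=30):
    """Mixed precision iterative refinement (illustrative sketch)."""
    n = len(A)
    LU, piv = lu_factor32(A)    # the expensive O(n^3) step, done in 32-bit
    x = lu_solve32(LU, piv, b)  # initial solution with 32-bit accuracy
    b_norm = max(abs(v) for v in b)
    for _ in range(max_iter):
        # Residual r = b - A x, computed in 64-bit arithmetic.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * b_norm:
            break
        d = lu_solve32(LU, piv, r)           # correction from the 32-bit factors
        x = [x[i] + d[i] for i in range(n)]  # update accumulated in 64-bit
    return x
```

A call such as `refine(A, b)` reuses the single 32-bit factorization for every correction step, so only the cheaper O(n^2) residual and update work runs in 64-bit arithmetic. For ill-conditioned matrices this simple loop may fail to converge, in which case a full 64-bit solve would be the fallback.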
References in zbMATH (referenced in 9 articles, 1 standard article)
- Banaś, Krzysztof; Płaszewski, Przemysław; Macioł, Paweł: Numerical integration on GPUs for higher order finite elements (2014)
- Glimberg, S.L.; Engsig-Karup, A.P.; Madsen, M.G.: A fast GPU-accelerated mixed-precision strategy for fully nonlinear water wave computations (2013)
- Tsuchida, Eiji; Choe, Yoong-Kee: Iterative diagonalization of symmetric matrices in mixed precision and its application to electronic structure calculations (2012)
- Anzt, Hartwig; Heuveline, Vincent; Rocker, Björn: An error correction solver for linear systems: evaluation of mixed precision implementations (2011)
- Knibbe, H.; Oosterlee, C.W.; Vuik, C.: GPU implementation of a Helmholtz Krylov solver preconditioned by a shifted Laplace multigrid method (2011)
- Rocker, Björn; Kolberg, Mariana; Heuveline, Vincent: The impact of data distribution in accuracy and performance of parallel linear algebra subroutines (2011)
- Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire: Accelerating scientific computations with mixed precision algorithms (2009)
- Iushchenko, R.A.: Measuring the performance of parallel computers with distributed memory (2009)
- Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire: Accelerating scientific computations with mixed precision algorithms (2008)