Towards distributed heterogenous high-performance computing with ViennaCL.
One of the major drawbacks of computing with graphics adapters is that the available memory is too small for relevant problem sizes. To overcome this limitation for the ViennaCL library, we investigate a partitioning approach for one of the standard benchmark problems in High-Performance Computing (HPC), namely the dense matrix-matrix product. We apply this partitioning approach to problems exceeding the available memory on graphics adapters. Moreover, we investigate its applicability on distributed memory systems by using the Message Passing Interface (MPI). Our approach is presented in detail and benchmark results are given.
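The abstract does not spell out the partitioning scheme, so the following is only a minimal host-side sketch of the general idea, under the assumption that the dense product C = A * B is split into square tiles small enough that three of them fit into device memory at once. It uses plain C++ with no ViennaCL or MPI calls; the names `load_tile` and `partitioned_matmul` and the `tile` parameter are hypothetical and do not come from the paper or the library. The copy in `load_tile` stands in for a host-to-device transfer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a (rows x cols) tile starting at (r0, c0) out of a
// row-major matrix with leading dimension ld. In a GPU setting this copy would
// correspond to transferring one block to device memory.
std::vector<double> load_tile(const std::vector<double>& M, std::size_t ld,
                              std::size_t r0, std::size_t c0,
                              std::size_t rows, std::size_t cols) {
    std::vector<double> tile(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            tile[i * cols + j] = M[(r0 + i) * ld + (c0 + j)];
    return tile;
}

// Hypothetical sketch: C = A * B with A (m x k) and B (k x n), all row-major.
// Only one tile of A, one of B, and one of C are held at any time, each at
// most tile x tile entries, which is the assumed memory budget.
std::vector<double> partitioned_matmul(const std::vector<double>& A,
                                       const std::vector<double>& B,
                                       std::size_t m, std::size_t k,
                                       std::size_t n, std::size_t tile) {
    assert(tile > 0 && A.size() == m * k && B.size() == k * n);
    std::vector<double> C(m * n, 0.0);
    for (std::size_t i0 = 0; i0 < m; i0 += tile) {
        const std::size_t mi = std::min(tile, m - i0);      // rows in this C tile
        for (std::size_t j0 = 0; j0 < n; j0 += tile) {
            const std::size_t nj = std::min(tile, n - j0);  // cols in this C tile
            std::vector<double> c_tile(mi * nj, 0.0);       // accumulator block
            for (std::size_t p0 = 0; p0 < k; p0 += tile) {
                const std::size_t kp = std::min(tile, k - p0);
                const auto a_tile = load_tile(A, k, i0, p0, mi, kp);
                const auto b_tile = load_tile(B, n, p0, j0, kp, nj);
                // Small dense product on the resident tiles.
                for (std::size_t i = 0; i < mi; ++i)
                    for (std::size_t p = 0; p < kp; ++p)
                        for (std::size_t j = 0; j < nj; ++j)
                            c_tile[i * nj + j] += a_tile[i * kp + p] * b_tile[p * nj + j];
            }
            // Write the finished tile back into the full result.
            for (std::size_t i = 0; i < mi; ++i)
                for (std::size_t j = 0; j < nj; ++j)
                    C[(i0 + i) * n + (j0 + j)] = c_tile[i * nj + j];
        }
    }
    return C;
}
```

For example, multiplying the 2 x 3 matrix {1, 2, 3, 4, 5, 6} by the 3 x 2 matrix {7, 8, 9, 10, 11, 12} with `tile = 2` yields {58, 64, 139, 154}, the same as the unpartitioned product. In a distributed setting, the independent C tiles could be assigned to different MPI ranks, but how the paper distributes the work is not stated in the abstract.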
References in zbMATH (referenced in 7 articles, 1 standard article)
- Gremse, Felix; Küpper, Kerstin; Naumann, Uwe: Memory-efficient sparse matrix-matrix multiplication by row merging on many-core architectures (2018)
- Li, Ang; Serban, Radu; Negrut, Dan: Analysis of a splitting approach for the parallel solution of linear systems on GPU cards (2017)
- Rupp, Karl; Tillet, Philippe; Rudolf, Florian; Weinbub, Josef; Morhammer, Andreas; Grasser, Tibor; Jüngel, Ansgar; Selberherr, Siegfried: ViennaCL-linear algebra library for multi- and many-core architectures (2016)
- Rupp, Karl; Weinbub, Josef; Jüngel, Ansgar; Grasser, Tibor: Pipelined iterative solvers with kernel fusion for graphics processing units (2016)
- Lani, Andrea; Yalim, Mehmet Sarp; Poedts, Stefaan: A GPU-enabled finite volume solver for global magnetospheric simulations on unstructured grids (2014)
- Demidov, Denis; Ahnert, Karsten; Rupp, Karl; Gottschling, Peter: Programming CUDA and OpenCL: a case study using modern C++ libraries (2013)
- Georgescu, Serban; Chow, Peter; Okuda, Hiroshi: GPU acceleration for FEM-based structural analysis (2013)
Further publications can be found at: http://viennacl.sourceforge.net/viennacl-publications.html