Towards distributed heterogenous high-performance computing with ViennaCL

One of the major drawbacks of computing on graphics adapters is that the available memory is too small for relevant problem sizes. To overcome this limitation in the ViennaCL library, we investigate a partitioning approach for one of the standard benchmark problems in high-performance computing (HPC), namely the dense matrix-matrix product. We apply this partitioning approach to problems that exceed the available memory on graphics adapters. Moreover, we investigate its applicability to distributed-memory systems using the Message Passing Interface (MPI). Our approach is presented in detail, and benchmark results are given.

References in zbMATH (referenced in 21 articles, 1 standard article)

Showing results 1 to 20 of 21.
Sorted by year (citations)

  1. Jæger, Karoline Horgmo; Hustad, Kristian Gregorius; Cai, Xing; Tveito, Aslak: Operator splitting and finite difference schemes for solving the EMI model (2021)
  2. Rizzi, Francesco; Parish, Eric J.; Blonigan, Patrick J.; Tencer, John: A compute-bound formulation of Galerkin model reduction for linear time-invariant dynamical systems (2021)
  3. Anil, Robin; Capan, Gokhan; Drost-Fromm, Isabel; Dunning, Ted; Friedman, Ellen; Grant, Trevor; Quinn, Shannon; Ranjan, Paritosh; Schelter, Sebastian; Yılmazel, Özgür: Apache Mahout: machine learning on distributed dataflow systems (2020)
  4. Abduljabbar, Mustafa; Al Farhan, Mohammed; Al-Harthi, Noha; Chen, Rui; Yokota, Rio; Bagci, Hakan; Keyes, David: Extreme scale FMM-accelerated boundary integral equation solver for wave scattering (2019)
  5. Chow, Alex D.; Rogers, Benedict D.; Lind, Steven J.; Stansby, Peter K.: Numerical wave basin using incompressible smoothed particle hydrodynamics (ISPH) on a single GPU with vertical cylinder test cases (2019)
  6. Demidov, D.: AMGCL: an efficient, flexible, and extensible algebraic multigrid implementation (2019)
  7. Erofeev, K. Yu.; Khramchenkov, E. M.; Biryal’tsev, E. V.: High-performance processing of covariance matrices using GPU computations (2019)
  8. Gremse, Felix; Küpper, Kerstin; Naumann, Uwe: Memory-efficient sparse matrix-matrix multiplication by row merging on many-core architectures (2018)
  9. Jeon, Kiwan; Lee, Chang-Ock; Woo, Eung Je: A harmonic \(B_z\)-based conductivity reconstruction method in MREIT with influence of non-transversal current density (2018)
  10. Li, Ang; Serban, Radu; Negrut, Dan: Analysis of a splitting approach for the parallel solution of linear systems on GPU cards (2017)
  11. Rupp, Karl; Tillet, Philippe; Rudolf, Florian; Weinbub, Josef; Morhammer, Andreas; Grasser, Tibor; Jüngel, Ansgar; Selberherr, Siegfried: ViennaCL - linear algebra library for multi- and many-core architectures (2016)
  12. Rupp, Karl; Weinbub, Josef; Jüngel, Ansgar; Grasser, Tibor: Pipelined iterative solvers with kernel fusion for graphics processing units (2016)
  13. Markopoulos, Alexandros; Hapla, Vaclav; Cermak, Martin; Fusek, Martin: Massively parallel solution of elastoplasticity problems with tens of millions of unknowns using permoncube and FLLOP packages (2015)
  14. Medina, David S.; St-Cyr, Amik; Warburton, T.: OCCA: A unified approach to multi-threading languages (2014) arXiv
  15. Lani, Andrea; Yalim, Mehmet Sarp; Poedts, Stefaan: A GPU-enabled finite volume solver for global magnetospheric simulations on unstructured grids (2014)
  16. Weinbub, Josef; Rupp, Karl; Selberherr, Siegfried: Highly flexible and reusable finite element simulations with ViennaX (2014)
  17. Demidov, Denis; Ahnert, Karsten; Rupp, Karl; Gottschling, Peter: Programming CUDA and OpenCL: a case study using modern C++ libraries (2013)
  18. Georgescu, Serban; Chow, Peter; Okuda, Hiroshi: GPU acceleration for FEM-based structural analysis (2013)
  19. Rossi, R.; Mossaiby, F.; Idelsohn, S. R.: A portable OpenCL-based unstructured edge-based finite element Navier-Stokes solver on graphics hardware (2013)
  20. Viñas, Moisés; Bozkus, Zeki; Fraguela, Basilio B.: Exploiting heterogeneous parallelism with the heterogeneous programming library (2013) ioport

Further publications can be found at: