The CUBLAS library is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA® CUDA™ runtime. It gives the user access to the computational resources of NVIDIA Graphics Processing Units (GPUs), but it does not auto-parallelize across multiple GPUs. To use the CUBLAS library, the application must allocate the required matrices and vectors in the GPU memory space, fill them with data, call the desired sequence of CUBLAS functions, and then copy the results from the GPU memory space back to the host. The CUBLAS library also provides helper functions for writing data to and retrieving data from the GPU.

References in zbMATH (referenced in 75 articles)

  1. Ahrens, Peter; Demmel, James; Nguyen, Hong Diep: Algorithms for efficient reproducible floating point summation (2020)
  2. Huang, Jianyu; Yu, Chenhan D.; Geijn, Robert A. van de: Strassen’s algorithm reloaded on GPUs (2020)
  3. Seyoon Ko, Hua Zhou, Jin Zhou, Joong-Ho Won: DistStat.jl: Towards Unified Programming for High-Performance Statistical Computing Environments in Julia (2020) arXiv
  4. Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li: LightSeq: A High Performance Inference Library for Sequence Processing and Generation (2020) arXiv
  5. Bernaschi, Massimo; Carrozzo, Mauro; Franceschini, Andrea; Janna, Carlo: A dynamic pattern factored sparse approximate inverse preconditioner on graphics processing units (2019)
  6. Berrone, S.; D’Auria, A.; Vicini, F.: Fast and robust flow simulations in discrete fracture networks with GPGPUs (2019)
  7. Cheng, Xuan; Zeng, Ming; Lin, Jinpeng; Wu, Zizhao; Liu, Xinguo: Efficient \(L_0\) resampling of point sets (2019)
  8. Chopp, D. L.: Introduction to high performance scientific computing (2019)
  9. Defez, Emilio; Ibáñez, Javier; Peinado, Jesús; Sastre, Jorge; Alonso-Jordá, Pedro: An efficient and accurate algorithm for computing the matrix cosine based on new Hermite approximations (2019)
  10. Du, Cheng-Han; Chiou, Yih-Peng; Wang, Weichung: Compressed hierarchical Schur algorithm for frequency-domain analysis of photonic structures (2019)
  11. Flegar, Goran; Scheidegger, Florian; Novaković, Vedran; Mariani, Giovani; Tomás, Andrés E.; Malossi, A. Cristiano I.; Quintana-Ortí, Enrique S.: FloatX: A C++ library for customized floating-point arithmetic (2019)
  12. Jaber J. Hasbestan, Inanc Senocak: PittPack: An Open-Source Poisson’s Equation Solver for Extreme-Scale Computing with Accelerators (2019) arXiv
  13. Li, Ruipeng; Xi, Yuanzhe; Erlandson, Lucas; Saad, Yousef: The eigenvalues slicing library (EVSL): algorithms, implementation, and software (2019)
  14. Sastre, Jorge; Ibáñez, Javier; Alonso-Jordá, Pedro; Peinado, Jesús; Defez, Emilio: Fast Taylor polynomial evaluation for the computation of the matrix cosine (2019)
  15. Tim Besard, Valentin Churavy, Alan Edelman, Bjorn De Sutter: Rapid software prototyping for heterogeneous and distributed platforms (2019) not zbMATH
  16. van den Berg, E.: The Ocean Tensor Package (2019) not zbMATH
  17. Wu, Rongteng; Xie, Xiaohong: A heterogeneous parallel LU factorization algorithm based on a basic column block uniform allocation strategy (2019)
  18. Bosner, Nela; Bujanović, Zvonimir; Drmač, Zlatko: Parallel solver for shifted systems in a hybrid CPU-GPU framework (2018)
  19. Defez, Emilio; Ibáñez, Javier; Sastre, Jorge; Peinado, Jesús; Alonso, Pedro: A new efficient and accurate spline algorithm for the matrix exponential computation (2018)
  20. Kůs, Pavel; Lederer, Hermann; Marek, Andreas: GPU optimization of large-scale eigenvalue solver (2018)