CUBLAS

The CUBLAS library is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA® CUDA™ runtime. It gives the user access to the computational resources of an NVIDIA Graphics Processing Unit (GPU), but does not auto-parallelize across multiple GPUs. To use the CUBLAS library, the application must allocate the required matrices and vectors in the GPU memory space, fill them with data, call the desired sequence of CUBLAS functions, and then copy the results from the GPU memory space back to the host. The CUBLAS library also provides helper functions for writing data to and retrieving data from the GPU.


References in zbMATH (referenced in 65 articles)

Showing results 1 to 20 of 65.

  1. Bernaschi, Massimo; Carrozzo, Mauro; Franceschini, Andrea; Janna, Carlo: A dynamic pattern factored sparse approximate inverse preconditioner on graphics processing units (2019)
  2. Berrone, S.; D’Auria, A.; Vicini, F.: Fast and robust flow simulations in discrete fracture networks with GPGPUs (2019)
  3. Cheng, Xuan; Zeng, Ming; Lin, Jinpeng; Wu, Zizhao; Liu, Xinguo: Efficient $L_0$ resampling of point sets (2019)
  4. Chopp, D. L.: Introduction to high performance scientific computing (2019)
  5. Defez, Emilio; Ibáñez, Javier; Peinado, Jesús; Sastre, Jorge; Alonso-Jordá, Pedro: An efficient and accurate algorithm for computing the matrix cosine based on new Hermite approximations (2019)
  6. Du, Cheng-Han; Chiou, Yih-Peng; Wang, Weichung: Compressed hierarchical Schur algorithm for frequency-domain analysis of photonic structures (2019)
  7. Li, Ruipeng; Xi, Yuanzhe; Erlandson, Lucas; Saad, Yousef: The eigenvalues slicing library (EVSL): algorithms, implementation, and software (2019)
  8. Sastre, Jorge; Ibáñez, Javier; Alonso-Jordá, Pedro; Peinado, Jesús; Defez, Emilio: Fast Taylor polynomial evaluation for the computation of the matrix cosine (2019)
  9. Besard, Tim; Churavy, Valentin; Edelman, Alan; De Sutter, Bjorn: Rapid software prototyping for heterogeneous and distributed platforms (2019) not zbMATH
  10. van den Berg, E.: The Ocean Tensor Package (2019) not zbMATH
  11. Bosner, Nela; Bujanović, Zvonimir; Drmač, Zlatko: Parallel solver for shifted systems in a hybrid CPU-GPU framework (2018)
  12. Defez, Emilio; Ibáñez, Javier; Sastre, Jorge; Peinado, Jesús; Alonso, Pedro: A new efficient and accurate spline algorithm for the matrix exponential computation (2018)
  13. Pikle, Nileshchandra K.; Sathe, Shailesh R.; Vyavhare, Arvind Y.: GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review (2018)
  14. Wen, Zeyi; Shi, Jiashuai; Li, Qinbin; He, Bingsheng; Chen, Jian: ThunderSVM: a fast SVM library on GPUs and CPUs (2018)
  15. Yang, Wangdong; Li, Kenli; Li, Keqin: A parallel computing method using blocked format with optimal partitioning for SpMV on GPU (2018)
  16. Alonso, Pedro; Ibáñez, Javier; Sastre, Jorge; Peinado, Jesús; Defez, Emilio: Efficient and accurate algorithms for computing matrix trigonometric functions (2017)
  17. Al-Refaie, Ahmed F.; Yurchenko, Sergei N.; Tennyson, Jonathan: GPU accelerated intensities MPI (GAIN-MPI): a new method of computing Einstein-$A$ coefficients (2017)
  18. Aurentz, Jared L.; Kalantzis, Vassilis; Saad, Yousef: Cucheb: a GPU implementation of the filtered Lanczos procedure (2017)
  19. Bosner, Nela; Karlsson, Lars: Parallel and heterogeneous $m$-Hessenberg-triangular-triangular reduction (2017)
  20. Nugteren, Cedric: CLBlast: A Tuned OpenCL BLAS Library (2017) arXiv
