CUDA

The NVIDIA® CUDA® Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications. It includes a compiler for NVIDIA GPUs, math libraries, and tools for debugging and optimizing application performance. It also ships with programming guides, user manuals, API references, and other documentation to help you quickly start accelerating your applications with GPUs.


References in zbMATH (referenced in 677 articles, 2 standard articles)

Showing results 1 to 20 of 677, sorted by year.

  1. Alonso, Pedro; Ibáñez, Javier; Sastre, Jorge; Peinado, Jesús; Defez, Emilio: Efficient and accurate algorithms for computing matrix trigonometric functions (2017)
  2. Hynninen, Antti-Pekka; Lyakh, Dmitry I.: cuTT: A High-Performance Tensor Transpose Library for CUDA Compatible GPUs (2017) arXiv
  3. Nugteren, Cedric: CLBlast: A Tuned OpenCL BLAS Library (2017) arXiv
  4. Chen, Tianran; Lee, Tsung-Lin; Li, Tien-Yien: Mixed cell computation in HOM4ps (2017)
  5. Chen, Tianran; Mehta, Dhagash: Parallel degree computation for binomial systems (2017)
  6. Conte, Dajana; Paternoster, Beatrice: Parallel methods for weakly singular Volterra integral equations on GPUs (2017)
  7. Giannini, Francesco; Laveglia, Vincenzo; Rossi, Alessandro; Zanca, Dario; Zugarini, Andrea: Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow (2017) arXiv
  8. Steinwart, Ingo; Thomann, Philipp: liquidSVM: A Fast and Versatile SVM package (2017) arXiv
  9. Lefticaru, Raluca; Macías-Ramos, Luis F.; Niculescu, Ionuţ Mihai; Mierlă, Laurenţiu: Agent-based simulation of kernel P systems with division rules using FLAME (2017)
  10. Steinbach, Peter; Werner, Matthias: gearshifft - The FFT Benchmark Suite for Heterogeneous Platforms (2017) arXiv
  11. Phipps, E.; D’Elia, M.; Edwards, H.C.; Hoemmen, M.; Hu, J.; Rajamanickam, S.: Embedded ensemble propagation for improving performance, portability, and scalability of uncertainty quantification on emerging computational architectures (2017)
  12. Frigori, Rafael B.: PHAST: Protein-like heteropolymer analysis by statistical thermodynamics (2017) arXiv
  13. Tranquilli, Paul; Glandon, S. Ross; Sarshar, Arash; Sandu, Adrian: Analytical Jacobian-vector products for the matrix-free time integration of partial differential equations (2017)
  14. Andersson, Fredrik; Carlsson, Marcus; Nikitin, Viktor V.: Fast algorithms and efficient GPU implementations for the Radon transform and the back-projection operator represented as convolution operators (2016)
  15. Anzt, Hartwig; Chow, Edmond; Saak, Jens; Dongarra, Jack: Updating incomplete factorization preconditioners for model order reduction (2016)
  16. Bäumelt, Zdeněk; Dvořák, Jan; Šucha, Přemysl; Hanzálek, Zdeněk: A novel approach for nurse rerostering based on a parallel algorithm (2016)
  17. Bernaschi, Massimo; Bisson, Mauro; Fantozzi, Carlo; Janna, Carlo: A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units (2016)
  18. Bialas, Piotr; Strzelecki, Adam: Benchmarking the cost of thread divergence in CUDA (2016) ioport
  19. Bock, Nicolas; Challacombe, Matt; Kalé, Laxmikant V.: Solvers for $\mathcal{O}(N)$ electronic structure in the strong scaling limit (2016)
  20. Boschetti, Marco A.; Maniezzo, Vittorio; Strappaveccia, Francesco: Using GPU computing for solving the two-dimensional guillotine cutting problem (2016)
