• SLEEF

  • Referenced in 40 articles [sw05889]
  • SLEEF - SIMD Library for Evaluating Elementary Functions. Most of today’s processors have capabilities ... execute SIMD instructions, and we can expect significant speed-ups in various kinds of computation ... look-ups, scattering from, or gathering into SIMD registers, or conditional branches using SSE2...
• GTEngine

  • Referenced in 38 articles [sw24041]
  • computing using general purpose GPU programming (GPGPU). SIMD code is also available using Intel Streaming ... SIMD Extensions (SSE). Portions of the code are described in various books as well...
• MediaBench

  • Referenced in 33 articles [sw08949]
  • architectures have emerged which have VLIW and SIMD structures that are well matched...
• HElib

  • Referenced in 31 articles [sw09518]
  • specifies their cost. This “platform” is a SIMD environment (somewhat similar to Intel...
• SFMT

  • Referenced in 17 articles [sw22111]
  • SIMD-oriented Fast Mersenne Twister (SFMT): twice faster than Mersenne Twister. SFMT ... CPUs, such as multi-stage pipelining and SIMD (e.g. 128-bit integer) instructions. It supports...
• SUPERB

  • Referenced in 20 articles [sw07276]
  • analysis component, a catalog of MIMD and SIMD parallelization transformations, and a flexible dialog facility...
• SELL_C_sigma

  • Referenced in 11 articles [sw11232]
  • vector multiplication on modern processors with wide SIMD units. Sparse matrix-vector multiplication (spMVM ... wide single instruction multiple data (SIMD) units in current multi- and many-core processors should ... variant of Sliced ELLPACK, as a SIMD-friendly data format which combines long-standing ideas...
• Vc

  • Referenced in 6 articles [sw21533]
  • architectures this is implemented via SIMD registers and instructions. A single SIMD register can store ... values and a single SIMD instruction can execute N operations on those values ... automatic transformation of scalar codes to SIMD instructions (auto-vectorization). However, the compiler must reconstruct ... will often not be transformed into efficient SIMD code. The Vc library provides the missing...
• GKLEE

  • Referenced in 10 articles [sw12794]
  • conservative static analysis or conservative modeling of SIMD concurrency generate false alarms resulting in wasted...
• ARGO

  • Referenced in 8 articles [sw03139]
  • computing engines -- a massively parallel special-purpose SIMD architecture and a general-purpose system -- while...
• BLAKE

  • Referenced in 8 articles [sw11099]
  • assembly and vectorized code using SIMD CPU instructions; they describe BLAKE’s properties with respect...
• SIMD

  • Referenced in 3 articles [sw03445]
  • complete Euclidean distance transform on mesh-connected SIMD. The Euclidean distance transform (EDT) converts ... into a parallel algorithm for mesh-connected SIMD computers. For an n × n image...
• ALPBench

  • Referenced in 5 articles [sw13521]
  • parallelism using POSIX threads and sub-word SIMD (Intel’s SSE2) instructions respectively. Second, the paper...
• GBLA

  • Referenced in 5 articles [sw15152]
  • prime fields of various size using SIMD vectorization. In various different experimental results we show...
• PRAND

  • Referenced in 3 articles [sw12445]
  • Barash and Shchur (2006) and the efficient SIMD realizations proposed in Barash and Shchur ... Using massive parallelism of modern GPUs and SIMD parallelism of modern CPUs substantially improves performance...
• Riposte

  • Referenced in 3 articles [sw20328]
  • design has sought improved performance through increased SIMD and multi-core parallelism. At the same ... memory traffic, compile them to use hardware SIMD units, and schedule them to run across...
• Jacket

  • Referenced in 4 articles [sw11529]
  • Users do not need to learn CUDA, SIMD, HPC, and other complicated parallel programming technologies...
• BEAGLE

  • Referenced in 4 articles [sw12586]
  • CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets...
• exafmm

  • Referenced in 4 articles [sw30090]
  • extract the full potential of the SIMD units on the latest CPUs, the inner kernels...
• EmptyHeaded

  • Referenced in 4 articles [sw32535]
  • layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high...