Mint

Mint: realizing CUDA performance in 3D stencil methods with annotated C. We present Mint, a programming model that enables the non-expert to enjoy the performance benefits of hand-coded CUDA without becoming entangled in the details. Mint targets stencil methods, which are an important class of scientific applications. We have implemented the Mint programming model with a source-to-source translator that generates optimized CUDA C from traditional C source. The translator relies on annotations to guide translation at a high level. The set of pragmas is small, and the model is compact and simple. Yet Mint is able to deliver performance competitive with painstakingly hand-optimized CUDA. We show that, for a set of widely used stencil kernels, Mint realized 80% of the performance obtained from aggressively optimized CUDA on the 200-series NVIDIA GPUs. Our optimizations target three-dimensional kernels, which present a daunting array of optimizations.
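To make the term "stencil kernel" concrete, here is a minimal sketch of the kind of traditional C loop nest such a translator would take as input: one Jacobi sweep of a 7-point stencil on a 3D grid. It is plain C only, not Mint output. The names `idx` and `stencil7_sweep`, the cubic grid with a one-cell ghost layer, and the coefficients `c0` and `c1` are assumptions chosen for illustration, and Mint's actual pragma syntax is deliberately not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index into a cubic grid of (n + 2)^3 cells: n interior cells per
   axis plus one ghost cell on each side. k is the fastest-varying axis.
   (Hypothetical helper, introduced for this sketch.) */
static size_t idx(size_t n, size_t i, size_t j, size_t k)
{
    return (i * (n + 2) + j) * (n + 2) + k;
}

/* One Jacobi sweep of a 7-point stencil over the n^3 interior cells.
   c0 weights the centre cell, c1 weights each of the six face neighbours.
   Ghost cells of u_new are left untouched. */
void stencil7_sweep(size_t n, const double *u_old, double *u_new,
                    double c0, double c1)
{
    assert(n >= 1);
    assert(u_old != u_new); /* Jacobi update needs separate in/out arrays */

    /* Every iteration reads only u_old and writes a distinct cell of u_new,
       so the iterations are independent. An annotation-based model would
       place its directives on a loop nest like this one. */
    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= n; j++) {
            for (size_t k = 1; k <= n; k++) {
                u_new[idx(n, i, j, k)] =
                    c0 * u_old[idx(n, i, j, k)] +
                    c1 * (u_old[idx(n, i - 1, j, k)] + u_old[idx(n, i + 1, j, k)] +
                          u_old[idx(n, i, j - 1, k)] + u_old[idx(n, i, j + 1, k)] +
                          u_old[idx(n, i, j, k - 1)] + u_old[idx(n, i, j, k + 1)]);
            }
        }
    }
}
```

A caller would allocate two arrays of `(n + 2) * (n + 2) * (n + 2)` doubles, fill the ghost layer with boundary values, and swap the two pointers between sweeps. Each output cell reads seven input cells, and neighbouring cells share most of those reads, which is why data reuse dominates the tuning of three-dimensional kernels.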

This software has also been peer reviewed by the journal TOMS.


References in zbMATH (referenced in 6 articles)

  1. Akhtar, Muhammad Naveed; Durad, Muhammad Hanif; Usman, Anila; Mughal, Muhammad Abid: Efficient memory access patterns for solving 3D Laplace equation on GPU (2018)
  2. Lusher, David J.; Jammy, Satya P.; Sandham, Neil D.: Shock-wave/boundary-layer interactions in the automatic source-code generation framework OpenSBLI (2018)
  3. Zhang, Weiqun; Almgren, Ann; Day, Marcus; Nguyen, Tan; Shalf, John; Unat, Didem: BoxLib with tiling: an adaptive mesh refinement software framework (2016)
  4. Malas, T.; Hager, G.; Ltaief, H.; Stengel, H.; Wellein, G.; Keyes, D.: Multicore-optimized wavefront diamond blocking for optimizing stencil updates (2015)
  5. Mo, Tieqiang; Li, Renfa: A new memory mapping mechanism for GPGPUs’ stencil computation (2015)
  6. Nguyen, Tan; Hefenbrock, Daniel; Oberg, Jason; Kastner, Ryan; Baden, Scott: A software-based dynamic-warp scheduling approach for load-balancing the Viola-Jones face detection algorithm on GPUs (2013)