Portable and architecture independent parallel performance tuning using BSP.

A call-graph profiling tool has been designed and implemented to analyse the efficiency of programs written in BSPlib. This tool highlights computation and communication imbalance in parallel programs, exposing portions of program code which are amenable to improvement. A unique feature of this profiler is that it uses the bulk synchronous parallel cost model, thus providing a mechanism for portable and architecture-independent parallel performance tuning. In order to test the capabilities of the model on a real-world example, the performance characteristics of an SQL query processing application are investigated on a number of different parallel architectures.

References in zbMATH (referenced in 38 articles, 1 standard article)

Showing results 1 to 20 of 38.
Sorted by year (citations)

  1. Langguth, Johannes; Patwary, Md. Mostofa Ali; Manne, Fredrik: Parallel algorithms for bipartite matching problems on distributed memory computers (2011)
  2. Valiant, Leslie G.: A bridging model for multi-core computing (2011)
  3. Gerbessiotis, Alexandros V.: Parallel option price valuations with the explicit finite difference method (2010)
  4. Rauber, Thomas; Rünger, Gudula: Parallel programming for multicore and cluster systems (2010)
  5. Marowka, Ami: BSP2OMP: a compiler for translating BSP programs to OpenMP (2009)
  6. Abu Salem, Fatima K.; Yang, Laurence T.: Parallel methods for absolute irreducibility testing (2008)
  7. Sala, Marzio; Spotz, William F.; Heroux, Michael A.: PyTrilinos: High-performance distributed-memory solvers for Python (2008)
  8. Merlin, Armelle; Hains, Gaétan: A bulk-synchronous parallel process algebra (2007)
  9. Kendall, Ricky A.; Sosonkina, Masha; Gropp, William D.; Numrich, Robert W.; Sterling, Thomas: Parallel programming models applicable to cluster computing and beyond (2006)
  10. Climent, Joan-Josep; Perea, Carmen; Tortosa, Leandro; Zamora, Antonio: An overlapped two-way method for solving tridiagonal linear systems in a BSP computer (2005)
  11. Vastenhouw, Brendan; Bisseling, Rob H.: A two-dimensional data distribution method for parallel sparse matrix-vector multiplication (2005)
  12. Zhou, Jianguo; Chen, Yifeng: Generating C code from LOGS specifications (2005)
  13. Abu Salem, Fatima: A BSP parallel model for the Göttfert algorithm over $F_2$ (2004)
  14. Climent, Joan-Josep; Perea, Carmen; Tortosa, Leandro; Zamora, Antonio: Sequential and parallel synchronous alternating iterative methods (2004)
  15. Climent, Joan-Josep; Perea, Carmen; Tortosa, Leandro; Zamora, Antonio: A BSP recursive divide and conquer algorithm to solve a tridiagonal linear system (2004)
  16. Loulergue, Frédéric: Communication primitives for minimally synchronous parallel ML (2004)
  17. Goldman, Alfredo; Mounie, Gregory; Trystram, Denis: 1-optimality of static BSP computations: Scheduling independent chains as a case study (2003)
  18. Tong, Weiqin; Ding, Jingbo; Cai, Lizhi: A parallel programming environment on grid (2003)
  19. Aldinucci, Marco: Automatic program transformation: the META tool for skeleton-based languages (2002)
  20. Beran, Martin: Pipelined decomposable BSP computers (2002)
