The SPLASH-2 programs: characterization and methodological considerations. The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important for understanding them well. The properties we study include the computational load balance, communication-to-computation ratio and traffic needs, important working set sizes, and issues related to spatial locality, as well as how these properties scale with problem size and the number of processors. The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful way. For example, by characterizing the working sets of the applications, we describe which operating points in terms of cache size and problem size are representative of realistic situations, which are not, and which are redundant. Using SPLASH-2 as an example, we hope to convey the importance of understanding the interplay of problem size, number of processors, and working sets in designing experiments and interpreting their results.

References in zbMATH (referenced in 41 articles)

Showing results 1 to 20 of 41.
Sorted by year

  1. Wardi, Y.; Seatzu, C.; Chen, X.; Yalamanchili, S.: Performance regulation of event-driven dynamical systems using infinitesimal perturbation analysis (2016)
  2. Sánchez, Daniel; Cebrián, Juan M.; García, José M.; Aragón, Juan L.: Soft-error mitigation by means of decoupled transactional memory threads (2015)
  3. Wang, Hui; Wang, Rui; Luan, Zhongzhi; Qian, Xuehai; Qian, Depei: Improving multiprocessor performance with fine-grain coherence bypass (2015)
  4. Liu, Peng; Fang, Lei; Huang, Michael C.: DEAM: decoupled, expressive, area-efficient metadata cache (2014)
  5. Munir, Arslan; Gordon-Ross, Ann; Ranka, Sanjay; Koushanfar, Farinaz: A queueing theoretic approach for performance evaluation of low-power multi-core embedded systems (2014)
  6. Mushtaq, Hamid; Al-Ars, Zaid; Bertels, Koen: Efficient and highly portable deterministic multithreading (DetLock) (2014)
  7. Pankratius, Victor; Adl-Tabatabai, Ali-Reza: Software engineering with transactional memory versus locks in practice (2014)
  8. Abellán, José L.; Fernández, Juan; Acacio, Manuel E.: Design of an efficient communication infrastructure for highly contended locks in many-core CMPs (2013)
  9. Oz, Isil; Topcuoglu, Haluk Rahmi; Kandemir, Mahmut; Tosun, Oguz: Thread vulnerability in parallel applications (2012)
  10. Sahelices, Benjamín; De Dios, Agustín; Ibáñez, Pablo; Viñals-Yúfera, Víctor; Llabería, José María: Efficient handling of lock hand-off in DSM multiprocessors with buffering coherence controllers (2012)
  11. Zhou, Xu; Lu, Kai; Wang, Xiaoping; Li, Xu: Exploiting parallelism in deterministic shared memory multiprocessing (2012)
  12. Chiu, Yung-Chang; Shieh, Ce-Kuen; Huang, Tzu-Chi; Liang, Tyng-Yeu; Chu, Kuo-Chih: Data race avoidance and replay scheme for developing and debugging parallel programs on distributed shared memory systems (2011)
  13. Fensch, Christian; Cintra, Marcelo: An evaluation of an OS-based coherence scheme for tiled CMPs (2011)
  14. Hammoud, Mohammad; Cho, Sangyeun; Melhem, Rami: C-AMTE: A location mechanism for flexible cache management in chip multiprocessors (2011)
  15. Hoffmann, Ralf; Rauber, Thomas: Adaptive task pools: Efficiently balancing large number of tasks on shared-address spaces (2011)
  16. Kim, Hyunhee; Kim, Jihong: A leakage-aware L2 cache management technique for producer-consumer sharing in low-power chip multiprocessors (2011)
  17. Rutzig, Mateus B.; Beck, Antonio C.S.; Madruga, Felipe; Alves, Marco A.; Freitas, Henrique C.; Maillard, Nicolas; Navaux, Philippe O.A.; Carro, Luigi: Boosting parallel applications performance on applying DIM technique in a multiprocessing environment (2011)
  18. Akram, Shoaib; Papakonstantinou, Alexandros; Kumar, Rakesh; Chen, Deming: A workload-adaptive and reconfigurable bus architecture for multicore processors (2010)
  19. Guironnet de Massas, Pierre; Pétrot, Frédéric: Evaluation of the implementation cost of cache coherence protocols using omniscient actions (2010)
  20. Harmanci, Derin; Gramoli, Vincent; Felber, Pascal; Fetzer, Christof: Extensible transactional memory testbed (2010)
