PAPI

PAPI (Performance Application Programming Interface) provides efficient access to the hardware performance counters on modern processors. It is developed at the Innovative Computing Laboratory in the University of Tennessee’s Computer Science Department. (Source: http://www.psc.edu/)


References in zbMATH (referenced in 32 articles)

Showing results 1 to 20 of 32.
Sorted by year.

  1. Chen, Xinwei; Wardi, Yorai; Yalamanchili, Sudhakar: Instruction-throughput regulation in computer processors with data-center applications (2018)
  2. Cebrián, Juan M.; Cecilia, José M.; Hernández, Mario; García, José M.: Code modernization strategies to 3-D stencil-based applications on Intel Xeon Phi: KNC and KNL (2017)
  3. Iwen, M. A.; Ong, B. W.: A distributed and incremental SVD algorithm for agglomerative data analysis on large networks (2016)
  4. Li, Shengguo; Liao, Xiangke; Liu, Jie; Jiang, Hao: New fast divide-and-conquer algorithms for the symmetric tridiagonal eigenvalue problem (2016)
  5. Cebrian, Juan M.; Jahre, Magnus; Natvig, Lasse: ParVec: vectorizing the PARSEC benchmark suite (2015)
  6. Cebrián, Juan M.; Natvig, Lasse; Meyer, Jan Christian: Performance and energy impact of parallelization and vectorization techniques in modern microprocessors (2014)
  7. de la Cruz, Raúl; Araya-Polo, Mauricio: Algorithm 942: Semi-stencil (2014)
  8. Ding, Chen; Xiang, Xiaoya; Bao, Bin; Luo, Hao; Luo, Ying-Wei; Wang, Xiao-Lin: Performance metrics and models for shared cache (2014)
  9. Zhang, Wei; Wei, Wenjie; Cai, Xing: Performance modeling of serial and parallel implementations of the fractional Adams-Bashforth-Moulton method (2014)
  10. Bock, Nicolas; Challacombe, Matt: An optimized sparse approximate matrix multiply for matrices with decay (2013)
  11. Buttari, Alfredo: Fine-grained multithreading for the multifrontal $QR$ factorization of sparse matrices (2013)
  12. Gai, Jiading; Obeid, Nady; Holtrop, Joseph L.; Wu, Xiao-Long; Lam, Fan; Fu, Maojing; Haldar, Justin P.; Hwu, Wen-mei W.; Liang, Zhi-Pei; Sutton, Bradley P.: More IMPATIENT: a gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs (2013)
  13. Gracioli, Giovani; Fröhlich, Antônio Augusto; Pellizzoni, Rodolfo; Fischmeister, Sebastian: Implementation and evaluation of global and partitioned scheduling in a real-time OS (2013)
  14. Russell, Francis P.; Kelly, Paul H. J.: Optimized code generation for finite element local assembly using symbolic manipulation (2013)
  15. Rutar, Nick; Hollingsworth, Jeffrey K.: Data centric techniques for mapping performance data to program variables (2012)
  16. Yaseen, Ashraf; Li, Yaohang: Accelerating knowledge-based energy evaluation in protein structure modeling with graphics processing units (2012)
  17. Kalinnik, Natalia; Korch, Matthias; Rauber, Thomas: An efficient time-step-based self-adaptive algorithm for predictor-corrector methods of Runge-Kutta type (2011)
  18. Askitis, Nikolas; Sinha, Ranjan: Engineering scalable, cache and space efficient tries for strings (2010)
  19. Hejazialhosseini, Babak; Rossinelli, Diego; Bergdorf, Michael; Koumoutsakos, Petros: High order finite volume methods on wavelet-adapted grids with local time-stepping on multicore architectures for the simulation of shock-bubble interactions (2010)
  20. Hölldobler, Steffen; Manthey, Norbert; Saptawijaya, Ari: Improving resource-unaware SAT solvers (2010)


Further publications can be found at: http://icl.cs.utk.edu/papi/pubs/index.html