Nektar++

Exploiting batch processing on streaming architectures to solve 2D elliptic finite element problems: a hybridized discontinuous Galerkin (HDG) case study.

Numerical methods for elliptic partial differential equations (PDEs) within both continuous and hybridized discontinuous Galerkin (HDG) frameworks share the same general structure: local (elemental) matrix generation followed by global linear system assembly and solve. The lack of inter-element communication and the easily parallelizable nature of the local matrix generation stage, coupled with the parallelization techniques developed for linear system solvers, make a numerical scheme for elliptic PDEs a good candidate for implementation on streaming architectures such as modern graphics processing units (GPUs). We propose an algorithmic pipeline for mapping an elliptic finite element method to the GPU and perform a case study for a particular method within the HDG framework. The study compares CPU and GPU implementations of the method and highlights certain performance-critical implementation details. The HDG method was chosen for the case study because of its computationally heavy local matrix generation stage and its reduced, trace-based communication pattern, which together make the method amenable to the fine-grained parallelism of GPUs. We demonstrate that the HDG method is well suited for GPU implementation, obtaining total speedups on the order of 30-35 times over a serial CPU implementation for moderately sized problems.
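The two-stage structure described in the abstract (independent local matrix generation, then global assembly and solve) can be illustrated with a minimal sketch. It is not the HDG method and not the GPU pipeline of the paper: it assumes a 1D Poisson problem, -u'' = 1 on [0, 1] with u(0) = u(1) = 0, discretized with linear continuous elements on a uniform mesh. All function names are hypothetical and chosen only for illustration.

```python
# Sketch only. Assumptions: 1D Poisson problem -u'' = 1 on [0, 1],
# u(0) = u(1) = 0, linear continuous elements, uniform mesh.
# This is NOT the HDG method or the GPU implementation from the paper.

def local_system(h):
    """Stage 1: stiffness matrix and load vector for one element of size h."""
    k = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]
    f = [h / 2.0, h / 2.0]  # element load for the source term f(x) = 1
    return k, f

def assemble(n_elements):
    """Stage 2: scatter the local systems into one global system."""
    h = 1.0 / n_elements
    n_nodes = n_elements + 1
    # Each local system depends only on its own element, so this list could
    # be produced as one parallel batch; here it is a plain serial loop.
    local_systems = [local_system(h) for _ in range(n_elements)]
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    F = [0.0] * n_nodes
    for e, (k, f) in enumerate(local_systems):
        dofs = (e, e + 1)  # global node numbers of element e
        for a in range(2):
            F[dofs[a]] += f[a]
            for b in range(2):
                K[dofs[a]][dofs[b]] += k[a][b]
    return K, F

def solve_dirichlet(K, F):
    """Stage 3: solve for the interior nodes; boundary values are zero."""
    interior = list(range(1, len(F) - 1))
    A = [[K[i][j] for j in interior] for i in interior]
    b = [F[i] for i in interior]
    m = len(b)
    # Gaussian elimination without pivoting (adequate here because the
    # reduced stiffness matrix is symmetric positive definite).
    for p in range(m):
        for r in range(p + 1, m):
            factor = A[r][p] / A[p][p]
            for c in range(p, m):
                A[r][c] -= factor * A[p][c]
            b[r] -= factor * b[p]
    x = [0.0] * m
    for r in reversed(range(m)):
        s = sum(A[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (b[r] - s) / A[r][r]
    return [0.0] + x + [0.0]

def solve_poisson(n_elements):
    K, F = assemble(n_elements)
    return solve_dirichlet(K, F)
```

Calling solve_poisson(4) returns the nodal values on a four-element mesh; for this particular model problem they coincide with the exact solution u(x) = x(1 - x)/2 at the nodes. The point of the sketch is only that stage 1 has no inter-element dependencies, which is the property the abstract identifies as favourable for streaming architectures.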


References in zbMATH (referenced in 55 articles, 1 standard article)

Showing results 1 to 20 of 55, sorted by year.

  1. Kumar, Abhishek; Pothérat, Alban: Mixed baroclinic convection in a cavity (2020)
  2. Moratilla-Vega, M. A.; Lackhove, K.; Janicka, J.; Xia, H.; Page, G. J.: Jet noise analysis using an efficient LES/high-order acoustic coupling method (2020)
  3. Cantwell, Chris D.; Nielsen, Allan S.: A minimally intrusive low-memory approach to resilience for existing transient solvers (2019)
  4. Cervi, Jessica; Spiteri, Raymond J.: A comparison of fourth-order operator splitting methods for cardiac simulations (2019)
  5. Jallepalli, Ashok; Haimes, Robert; Kirby, Robert M.: Adaptive characteristic length for L-SIAC filtering of FEM data (2019)
  6. Jallepalli, Ashok; Kirby, Robert M.: Efficient algorithms for the line-SIAC filter (2019)
  7. Jayaraman, Balaji; Lu, Chen; Whitman, Joshua; Chowdhary, Girish: Sparse feature map-based Markov models for nonlinear fluid flows (2019)
  8. Moxey, David; Sastry, Shankar P.; Kirby, Robert M.: Interpolation error bounds for curvilinear finite elements and their implications on adaptive mesh refinement (2019)
  9. Perry, Daniel J.; Kirby, Robert M.; Narayan, Akil; Whitaker, Ross T.: Allocation strategies for high fidelity models in the multifidelity regime (2019)
  10. Ren, Chengjiao; Cheng, Liang; Tong, Feifei; Xiong, Chengwang; Chen, Tingguo: Oscillatory flow regimes around four cylinders in a diamond arrangement (2019)
  11. Wang, Rui; Bao, Yan; Zhou, Dai; Zhu, Hongbo; Ping, Huan; Han, Zhaolong; Serson, Douglas; Xu, Hui: Flow instabilities in the wake of a circular cylinder with parallel dual splitter plates attached (2019)
  12. Abide, Stéphane; Viazzo, Stéphane; Raspo, Isabelle; Randriamampianina, Anthony: Higher-order compact scheme for high-performance computing of stratified rotating flows (2018)
  13. Badia, Santiago; Martín, Alberto F.; Principe, Javier: FEMPAR: an object-oriented parallel finite element framework (2018)
  14. Kopriva, David A.: Stability of overintegration methods for nodal discontinuous Galerkin spectral element methods (2018)
  15. Mengaldo, Gianmarco; De Grazia, Daniele; Moura, Rodrigo C.; Sherwin, Spencer J.: Spatial eigensolution analysis of energy-stable flux reconstruction schemes and influence of the numerical flux on accuracy and robustness (2018)
  16. Mengaldo, G.; Moura, R. C.; Giralda, B.; Peiró, J.; Sherwin, S. J.: Spatial eigensolution analysis of discontinuous Galerkin schemes with practical insights for under-resolved computations and implicit LES (2018)
  17. Minjeaud, Sebastian; Pasquetti, Richard: High order C^0-continuous Galerkin schemes for high order PDEs, conservation of quadratic invariants and application to the Korteweg-de Vries model (2018)
  18. Winters, Andrew R.; Moura, Rodrigo C.; Mengaldo, Gianmarco; Gassner, Gregor J.; Walch, Stefanie; Peiro, Joaquim; Sherwin, Spencer J.: A comparative study on polynomial dealiasing and split form discontinuous Galerkin schemes for under-resolved turbulence computations (2018)
  19. Xin, Dabo; Zhang, Hongfu; Ou, Jinping: Secondary wake instability of a bridge model and its application in wake control (2018)
  20. Xiong, Chengwang; Cheng, Liang; Tong, Feifei; An, Hongwei: Oscillatory flow regimes for a circular cylinder near a plane boundary (2018)


Further publications can be found at: https://www.nektar.info/community/publications/