An MPI/OpenACC implementation of a high-order electromagnetics solver with GPUDirect communication
2016 (English) In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 30, no 3, 320-334 p. Article in journal (Refereed) Published
We present performance results and an analysis of a message passing interface (MPI)/OpenACC implementation of an electromagnetic solver based on a spectral-element discontinuous Galerkin discretization of the time-dependent Maxwell equations. The OpenACC implementation covers all solution routines, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest neighbor flux exchanges. Modifications are designed to make effective use of vectorization, streaming, and data management. Performance results using up to 16,384 graphics processing units of the Cray XK7 supercomputer Titan show more than 2.5x speedup over central processing unit-only performance on the same number of nodes (262,144 MPI ranks) for problem sizes of up to 6.9 billion grid points. We discuss performance-enhancement strategies and the overall potential of GPU-based computing for this class of problems.
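The scale figures in the abstract can be related to one another with a little arithmetic. The sketch below is illustrative only and uses nothing beyond the numbers quoted above; it assumes one GPU per node, which is consistent with the "same number of nodes" comparison but is not stated explicitly in the abstract, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
gpus = 16_384          # GPUs used (assumption: one GPU per node)
cpu_ranks = 262_144    # MPI ranks in the CPU-only runs on the same nodes
grid_points = 6.9e9    # largest reported problem size, in grid points

# CPU-only MPI ranks that share one node.
ranks_per_node = cpu_ranks // gpus

# Approximate work per device at the largest problem size.
points_per_gpu = grid_points / gpus
points_per_cpu_rank = grid_points / cpu_ranks

print(f"CPU-only MPI ranks per node: {ranks_per_node}")
print(f"Grid points per GPU:         {points_per_gpu:,.0f}")
print(f"Grid points per CPU rank:    {points_per_cpu_rank:,.0f}")
```

Under that assumption, the reported speedup of more than 2.5x compares one GPU against 16 CPU-only MPI ranks on the same node, with each GPU handling roughly 420 thousand grid points at the largest problem size.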
Place, publisher, year, edition, pages
Sage Publications, 2016. Vol. 30, no 3, 320-334 p.
Keywords
Hybrid MPI, OpenACC, GPUDirect, spectral element-discontinuous Galerkin
Identifiers
URN: urn:nbn:se:kth:diva-194020
DOI: 10.1177/1094342015626584
ISI: 000382958000005
ScopusID: 2-s2.0-84983414976
OAI: oai:DiVA.org:kth-194020
DiVA: diva2:1037621
Funder
Swedish e‐Science Research Center
QC 20161017
2016-10-17
2016-10-14
2016-10-17
Bibliographically approved