OpenACC acceleration of the Nek5000 spectral element code
KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre. ORCID iD: 0000-0003-0639-0639
KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). ORCID iD: 0000-0002-3859-9480
KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre. ORCID iD: 0000-0002-5415-1248
KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre. ORCID iD: 0000-0002-9901-9857
2015 (English). In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 29, no 3, 311-319 p. Article in journal (Refereed). Published.
Abstract [en]

We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method and used for the simulation of incompressible flow. The original NekBone Fortran source code was used as the base and enhanced with OpenACC directives. Profiling NekBone provided an assessment of the suitability of the code for GPU systems and indicated possible kernel optimizations. Porting NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The naïve OpenACC implementation yields little performance improvement: on a single node, the version without OpenACC obtained 16 Gflops, while the naïve OpenACC implementation reached 20 Gflops. An optimized NekBone version reaches 43 Gflops on a single node. In addition, we ported NekBone to parallel GPU systems and optimized it, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory.
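The single-node figures in the abstract can be turned into speedup ratios with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the only inputs are the three Gflops values quoted above. It assumes speedup is defined as the simple ratio of the two throughputs.

```python
# Single-node NekBone throughput figures quoted in the abstract (Gflops).
BASELINE_GFLOPS = 16.0   # version without OpenACC
NAIVE_GFLOPS = 20.0      # naive OpenACC implementation
OPTIMIZED_GFLOPS = 43.0  # optimized OpenACC implementation


def speedup(accelerated: float, baseline: float) -> float:
    """Return the ratio of accelerated throughput to baseline throughput."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return accelerated / baseline


# Hypothetical names; the ratios follow directly from the quoted figures.
naive_speedup = speedup(NAIVE_GFLOPS, BASELINE_GFLOPS)
optimized_speedup = speedup(OPTIMIZED_GFLOPS, BASELINE_GFLOPS)
optimized_over_naive = speedup(OPTIMIZED_GFLOPS, NAIVE_GFLOPS)

print(f"naive OpenACC vs. no OpenACC:     {naive_speedup:.2f}x")
print(f"optimized OpenACC vs. no OpenACC: {optimized_speedup:.2f}x")
print(f"optimized vs. naive OpenACC:      {optimized_over_naive:.2f}x")
```

Under that definition, the naïve port is about 1.25 times faster than the version without OpenACC, and the optimized version is roughly 2.7 times faster.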

Place, publisher, year, edition, pages
Sage Publications, 2015. Vol. 29, no 3, 311-319 p.
National Category
Computer Science; Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-171357
DOI: 10.1177/1094342015576846
ISI: 000358414200006
Scopus ID: 2-s2.0-84938095938
OAI: oai:DiVA.org:kth-171357
DiVA: diva2:843223
Funder
Swedish e‐Science Research Center
Note

QC 20150804

Available from: 2015-07-27. Created: 2015-07-27. Last updated: 2017-08-14. Bibliographically approved.

Authority records BETA

Markidis, Stefano; Gong, Jing; Schliephake, Michael; Laure, Erwin
