Endre søk
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
OpenACC acceleration of the Nek5000 spectral element code
KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.ORCID-id: 0000-0003-0639-0639
KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).ORCID-id: 0000-0002-3859-9480
KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.ORCID-id: 0000-0002-5415-1248
KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.ORCID-id: 0000-0002-9901-9857
Vise andre og tillknytning
2015 (engelsk)Inngår i: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 29, nr 3, s. 311-319Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems, and indicated possible kernel optimizations. To port NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The naïve implementation using OpenACC leads to little performance improvement: on a single node, from 16 Gflops obtained with the version without OpenACC, we reached 20 Gflops with the naïve OpenACC implementation. An optimized NekBone version leads to a 43 Gflop performance on a single node. In addition, we ported and optimized NekBone to parallel GPU systems, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory.

sted, utgiver, år, opplag, sider
Sage Publications, 2015. Vol. 29, nr 3, s. 311-319
HSV kategori
Identifikatorer
URN: urn:nbn:se:kth:diva-171357DOI: 10.1177/1094342015576846ISI: 000358414200006Scopus ID: 2-s2.0-84938095938OAI: oai:DiVA.org:kth-171357DiVA, id: diva2:843223
Forskningsfinansiär
Swedish e‐Science Research Center
Merknad

QC 20150804

Tilgjengelig fra: 2015-07-27 Laget: 2015-07-27 Sist oppdatert: 2018-01-11bibliografisk kontrollert

Open Access i DiVA

Fulltekst mangler i DiVA

Andre lenker

Forlagets fulltekstScopus

Personposter BETA

Markidis, StefanoGong, JingSchliephake, MichaelLaure, Erwin

Søk i DiVA

Av forfatter/redaktør
Markidis, StefanoGong, JingSchliephake, MichaelLaure, Erwin
Av organisasjonen
I samme tidsskrift
The international journal of high performance computing applications

Søk utenfor DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric

doi
urn-nbn
Totalt: 268 treff
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf