Nekbone performance on GPUs with OpenACC and CUDA Fortran implementations
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, Centres, SeRC - Swedish e-Science Research Centre. ORCID iD: 0000-0002-3859-9480
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. ORCID iD: 0000-0003-0639-0639
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. ORCID iD: 0000-0002-9901-9857
2016 (English). In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 72, no 11, p. 4160-4180. Article in journal (Refereed). Published
Abstract [en]

We present a hybrid GPU implementation and performance analysis of Nekbone, which represents one of the core kernels of the incompressible Navier-Stokes solver Nek5000. The implementation is based on OpenACC and CUDA Fortran for local parallelization of the compute-intensive matrix-matrix multiplication part, which significantly minimizes the modification of the existing CPU code while extending the simulation capability of the code to GPU architectures. Our discussion includes the GPU results of OpenACC interoperating with CUDA Fortran and the gather-scatter operations with GPUDirect communication. We demonstrate performance of up to 552 Tflops on 16,384 GPUs of the OLCF Cray XK7 Titan.

Place, publisher, year, edition, pages
Springer, 2016. Vol. 72, no 11, p. 4160-4180
Keywords [en]
Nekbone/Nek5000, OpenACC, CUDA Fortran, GPUDirect, Gather-scatter communication, Spectral element discretization
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-198970
DOI: 10.1007/s11227-016-1744-5
ISI: 000387234200007
Scopus ID: 2-s2.0-84978656496
OAI: oai:DiVA.org:kth-198970
DiVA, id: diva2:1065628
Funder
Swedish e-Science Research Center
Note

QC 20170116

Available from: 2017-01-16 Created: 2016-12-22 Last updated: 2017-08-16 Bibliographically approved

Authority records

Gong, Jing; Markidis, Stefano; Laure, Erwin
