Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
Uppsala University.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0003-3374-8093
KTH, School of Engineering Sciences (SCI), Engineering Mechanics, Fluid Mechanics and Engineering Acoustics. ORCID iD: 0000-0002-7448-3290
2022 (English). In: HPCAsia2022: International Conference on High Performance Computing in Asia-Pacific Region, Association for Computing Machinery (ACM), 2022, p. 94-102. Conference paper, Published paper (Refereed)
Abstract [en]

We present new results on the strong parallel scaling of the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case is a direct numerical simulation of fully developed turbulent flow in a straight pipe at two Reynolds numbers, Reτ = 360 and Reτ = 550, based on friction velocity and pipe radius. Strong scaling is tested on several GPU-enabled HPC systems, including the Swiss Piz Daint system, TACC's Longhorn, Jülich's JUWELS Booster, and Berzelius in Sweden. The performance results show that a speed-up of between 3 and 5 can be achieved with the GPU-accelerated version compared with the CPU version on these systems. For the Reτ = 550 case on the JUWELS Booster system, the run-time for 20 timesteps falls from 43.5 to 13.2 seconds when the number of GPUs is increased from 64 to 512. This illustrates the potential of the GPU-accelerated version for high throughput. At the same time, the strong scaling limit is significantly larger for GPUs, at about 2000-5000 elements per rank, compared to about 50-100 for a CPU rank.
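
As a rough check on the JUWELS Booster figures quoted in the abstract, the following minimal sketch computes the relative speed-up and parallel efficiency implied by the two run-times. It assumes the 64-GPU run is the baseline and that ideal strong scaling is linear in the number of GPUs; the function name `strong_scaling` is hypothetical and not taken from the paper or from Nek5000.

```python
def strong_scaling(t_base, t_scaled, n_base, n_scaled):
    """Return (speedup, efficiency) of a scaled run relative to a baseline run.

    Assumption: ideal strong scaling reduces run-time in proportion to the
    increase in the number of GPUs, so the ideal speed-up is n_scaled / n_base.
    """
    speedup = t_base / t_scaled          # measured relative speed-up
    ideal_speedup = n_scaled / n_base    # speed-up under perfect scaling
    efficiency = speedup / ideal_speedup
    return speedup, efficiency


# Figures from the abstract: 20 timesteps, Re_tau = 550, JUWELS Booster.
speedup, efficiency = strong_scaling(t_base=43.5, t_scaled=13.2,
                                     n_base=64, n_scaled=512)
print(f"speed-up: {speedup:.2f}x, parallel efficiency: {efficiency:.1%}")
```

Under these assumptions, an eightfold increase in GPUs gives a speed-up of roughly 3.3, that is, a parallel efficiency of about 41% relative to the 64-GPU run.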

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022. p. 94-102
Series
ACM International Conference Proceeding Series
National Category
Computer Sciences; Fluid Mechanics and Acoustics
Identifiers
URN: urn:nbn:se:kth:diva-309189; DOI: 10.1145/3492805.3492818; Scopus ID: 2-s2.0-85122621284; OAI: oai:DiVA.org:kth-309189; DiVA, id: diva2:1639938
Conference
HPC Asia2022: International Conference on High Performance Computing in Asia-Pacific Region, Virtual Event, Japan, January 12 - 14, 2022
Note

QC 20220223

Part of conference proceedings: ISBN 978-145038498-8

Available from: 2022-02-22. Created: 2022-02-22. Last updated: 2024-03-18. Bibliographically approved


Authority records

Vincent, Jonathan; Karp, Martin; Peplinski, Adam; Jansson, Niclas; Podobas, Artur; Markidis, Stefano; Pleiter, Dirk; Schlatter, Philipp
