Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0003-2095-3063
Institute of Plasma Physics of the CAS, Prague, Czech Republic. ORCID iD: 0000-0002-4229-0961
LeCAD, University of Ljubljana, Ljubljana, Slovenia. ORCID iD: 0000-0003-0594-0555
KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC. KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0003-4158-3583
2023 (English). In: arXiv:2306.16512 / [ed] Demetris Zeinalipour, Limassol, Cyprus: Springer Nature, 2023, article id arXiv:2306.16512
Conference paper, Published paper (Refereed)
Abstract [en]

Large-scale plasma simulations are critical for designing and developing next-generation fusion energy devices and for modeling industrial plasmas. BIT1 is a massively parallel Particle-in-Cell code designed specifically for studying plasma-material interaction in fusion devices. Its most salient characteristic is the inclusion of Monte Carlo collision models for different plasma species. In this work, we characterize the single-node, multi-node, and I/O performance of the BIT1 code in two realistic cases using several HPC profiling tools, such as perf, IPM, Extrae/Paraver, and Darshan. We find that the on-node performance of the BIT1 sorting function is the main performance bottleneck. Strong scaling tests show a parallel performance of 77% and 96% on 2,560 MPI ranks for the two test cases. We demonstrate that communication, load imbalance, and self-synchronization are important factors impacting the performance of BIT1 in large-scale runs.

Place, publisher, year, edition, pages
Limassol, Cyprus: Springer Nature, 2023. article id arXiv:2306.16512
Keywords [en]
Performance Monitoring and Analysis · PIC Performance Bottleneck · Large-Scale PIC Simulations
National Category
Computer Sciences; Fusion, Plasma and Space Physics
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-339259
OAI: oai:DiVA.org:kth-339259
DiVA, id: diva2:1809638
Conference
Euro-Par 2023: Parallel Processing Workshops: Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28–September 1, 2023
Note

QC 20231106

Available from: 2023-11-05 Created: 2023-11-05 Last updated: 2023-11-29 Bibliographically approved

Authority records

Williams, Jeremy J.; Peng, Ivy Bo; Markidis, Stefano

Search in DiVA

By author/editor
Williams, Jeremy J.; Tskhakaya, David; Costea, Stefan; Peng, Ivy Bo; Garcia-Gasulla, Marta; Markidis, Stefano
By organisation
Computational Science and Technology (CST); Centre for High Performance Computing, PDC; SeRC - Swedish e-Science Research Centre
