kth.se Publications
1 - 13 of 13
  • 1. Alam, Sadaf R.
    Swiss Fed Inst Technol, Swiss Natl Supercomp Ctr CSCS, Zurich, Switzerland.
    Bartolome, Javier
    Barcelona Supercomp Ctr BSC, Barcelona, Spain.
    Carpene, Michele
    Italian Supercomp Ctr CINECA, Casalecchio Di Reno, Italy.
    Happonen, Kalle
    Finnish Supercomp Ctr CSC, Espoo, Finland.
    Lafoucriere, Jacques-Charles
    Commissariat Energie Atom & Energies Alternat CEA, Paris, France.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC. KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science. Juelich Supercomp Ctr JSC, Julich, Germany.
    Fenix: A Pan-European Federation of Supercomputing and Cloud e-Infrastructure Services. 2022. In: Communications of the ACM, ISSN 0001-0782, E-ISSN 1557-7317, Vol. 65, no. 4, pp. 46-47. Journal article (Other academic)
  • 2. Batelaan, M.
    Horsley, R.
    Nakamura, Y.
    Perlt, H.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Rakow, P. E. L.
    Schierholz, G.
    Stüben, H.
    Young, R. D.
    Zanotti, J. M.
    QCDSF-UKQCD-CSSM Collaboration
    Nucleon Form Factors from the Feynman-Hellmann Method in Lattice QCD. 2022. In: Proceedings of Science, Sissa Medialab Srl, 2022. Conference paper (Refereed)
    Abstract [en]

    Lattice QCD calculations of the nucleon electromagnetic form factors are of interest at both the high and low momentum transfer regions. For high momentum transfers especially there are open questions which require more intense study, such as the potential zero crossing in the proton's electric form factor. We will present recent progress from the QCDSF/UKQCD/CSSM collaboration on the calculation of these form factors using the Feynman-Hellmann method in lattice QCD. The Feynman-Hellmann method allows for greater control over excited states, which we take advantage of by going to high values of the momentum transfer. In this proceeding we present results of the form factors up to 6 GeV², using N_f = 2 + 1 flavour fermions for three different pion masses in the range 310-470 MeV. The results are extrapolated to the physical pion mass through the use of a flavour breaking expansion.

  • 3. Bickerton, J. M.
    Cooke, A. N.
    Horsley, R.
    Nakamura, Y.
    Perlt, H.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Rakow, P. E. L.
    Schierholz, G.
    Stüben, H.
    Young, R. D.
    Zanotti, J. M.
    Patterns of flavour symmetry breaking in hadron matrix elements involving u, d and s quarks. 2022. In: Proceedings of Science, Sissa Medialab Srl, 2022. Conference paper (Refereed)
    Abstract [en]

    Using an SU(3)-flavour symmetry breaking expansion between the strange and light quark masses, we determine how this constrains the extrapolation of baryon octet matrix elements and form factors. In particular we can construct certain combinations, which fan out from the symmetric point (when all the quark masses are degenerate) to the point where the light and strange quarks take their physical values. As a further example we consider the vector amplitude at zero momentum transfer for flavour changing currents.

  • 4. Brank, Bine
    Forschungszentrum Julich, Julich Supercomp Ctr, Julich, Germany.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Assessing the State of Autovectorization Support based on SVE. 2022. In: 2022 IEEE International Conference on Cluster Computing (CLUSTER 2022), Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 556-562. Conference paper (Refereed)
    Abstract [en]

    So-called SIMD instructions, which trigger operations that process a data tuple in each clock cycle, have become widespread in modern processor architectures. In particular, processors for high-performance computing (HPC) systems rely on this additional level of parallelism to reach a high throughput of arithmetic operations. Leveraging these SIMD instructions can still be challenging for application software developers. This challenge has become simpler due to a compiler technique called auto-vectorization. In this paper, we explore the current state of auto-vectorization capabilities of state-of-the-art compilers for a recent extension of the Arm instruction set architecture, called SVE. We measure the performance gains on a recent processor architecture supporting SVE, namely the Fujitsu A64FX processor.

  • 5. Brank, Bine
    Microsoft, Munich, Germany.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    CPU Architecture Modelling and Co-design. 2023. In: High Performance Computing - 38th International Conference, ISC High Performance 2023, Proceedings, Springer Nature, 2023, pp. 3-21. Conference paper (Refereed)
    Abstract [en]

    Co-design has become an established process for both developing high-performance computing (HPC) architectures (and, more specifically, CPU architectures) as well as HPC applications. The co-design process is frequently based on models. This paper discusses an approach to CPU architecture modelling and its relation to modelling theory. The approach is implemented using the gem5 simulator for Arm-based CPU architectures and applied for the purpose of generating co-design knowledge using two applications that are widely used on HPC systems.

  • 6. De La Motte, S. A.
    Hollitt, S. E.
    Horsley, R.
    Jackson, P. D.
    Nakamura, Y.
    Perlt, H.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Rakow, P. E. L.
    Schierholz, G.
    Stüben, H.
    Young, R. D.
    Zanotti, J. M.
    Measurements of SU(3)_f symmetry breaking in B meson decay constants. 2022. In: Proceedings of Science, Sissa Medialab Srl, 2022. Conference paper (Refereed)
    Abstract [en]

    We present updates from QCDSF/UKQCD/CSSM on the SU(3)_f breaking in B meson decay constants. The b-quarks are generated with an anisotropic clover-improved action, and are tuned to match properties of the physical B and B* mesons. Configurations are generated with m = (2m_l + m_s)/3 kept constant to control symmetry breaking effects. Various sources of systematic uncertainty will be discussed, including those from continuum extrapolations and extrapolations to the physical point. We also present new efforts to calculate f_B and f_Bs using weighted averages across multiple time fitting regions. The use of an automated weighted averaging technique over multiple fitting ranges allows for timely tuning of the b-quark and reduces the impact of systematic errors from fitting range biases in calculations of f_B and f_Bs.

  • 7. Haine, Christopher
    HPE HPC AI Res Lab, Basel, Switzerland.
    Haus, Utz-Uwe
    HPE HPC AI Res Lab, Basel, Switzerland.
    Martinasso, Maxime
    Swiss Natl Supercomp Ctr, CSCS, CH-6900 Lugano, Switzerland.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC. Forschungszentrum Julich, D-52425 Julich, Germany.
    Tessier, Francois
    Inria Rennes Bretagne Atlantique, F-35042 Rennes, France.
    Sarmany, Domokos
    European Ctr Medium Range Weather Forecasts ECMWF, Reading RG2 9AX, Berks, England.
    Smart, Simon
    European Ctr Medium Range Weather Forecasts ECMWF, Reading RG2 9AX, Berks, England.
    Quintino, Tiago
    European Ctr Medium Range Weather Forecasts ECMWF, Reading RG2 9AX, Berks, England.
    Tate, Adrian
    NAG, Oxford, England.
    A Middleware Supporting Data Movement in Complex and Software-Defined Storage and Memory Architectures. 2021. In: High Performance Computing - ISC High Performance Digital 2021 International Workshops / [ed] Jagode, H., Anzt, H., Ltaief, H., Luszczek, P., Springer Nature, 2021, Vol. 12761, pp. 346-357. Conference paper (Refereed)
    Abstract [en]

    Among the broad variety of challenges that arise from workloads in a converged HPC and Cloud infrastructure, data movement is of paramount importance, especially on upcoming exascale systems featuring multiple tiers of memory and storage. While the focus has, for years, been primarily on optimizing computations, the importance of improving data handling on such architectures is now well understood. As optimization techniques can be applied at different stages (operating system, run-time system, programming environment, and so on), a middleware providing a uniform and consistent data awareness becomes necessary. In this paper, we introduce a novel memory- and data-aware middleware called Maestro, designed for data orchestration.

  • 8. Kunkel, Julian Martin
    Universität Göttingen, Göttingen, Germany.
    Boehme, Christian
    GWDG, Göttingen, Germany.
    Decker, Jonathan
    Universität Göttingen, Göttingen, Germany.
    Magugliani, Fabrizio
    E4 Computer Engineering S.p.A., Scandiano, Italy.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Koller, Bastian
    HLRS Stuttgart, Germany.
    Sivalingam, Karthee
    Huawei Technologies Düsseldorf GmbH, Düsseldorf, Germany.
    Pllana, Sabri
    Forschung Burgenland GmbH, Eisenstadt, Austria.
    Nikolov, Alexander
    SYNYO GmbH, Wien, Austria.
    Soyturk, Mujdat
    Marmara University, Istanbul, Turkey.
    Racca, Christian
    Consorzio TOP-IX (Torino e Piemonte Exchange Point), Torino, Italy.
    Bartolini, Andrea
    Alma Mater Studiorum-Università di Bologna, Bologna, Italy.
    Tate, Adrian
    Numerical Algorithms Group Limited, Oxford, United Kingdom.
    Yaman, Berkay
    BigTRI Bilisim A.S., Istanbul, Turkey.
    DECICE: Device-Edge-Cloud Intelligent Collaboration Framework. 2023. In: Proceedings of the 20th ACM International Conference on Computing Frontiers 2023, CF 2023, Association for Computing Machinery (ACM), 2023, pp. 266-271. Conference paper (Refereed)
    Abstract [en]

    DECICE is a Horizon Europe project that is developing an AI-enabled, open and portable management framework for the automatic and adaptive optimization and deployment of applications in a computing continuum ranging from IoT sensors on the Edge to large-scale Cloud/HPC computing infrastructures. In this paper, we describe the DECICE framework and architecture. Furthermore, we highlight use cases for framework evaluation: intelligent traffic intersection, magnetic resonance imaging, and emergency response.

  • 9. Portero, Antoni
    Jülich Supercomputing Centre, Novel System Architectures Design, Forschungszentrum Jülich GmbH, Jülich, Germany.
    Falquez, Carlos
    Jülich Supercomputing Centre, Novel System Architectures Design, Forschungszentrum Jülich GmbH, Jülich, Germany.
    Ho, Nam
    Jülich Supercomputing Centre, Novel System Architectures Design, Forschungszentrum Jülich GmbH, Jülich, Germany.
    Petrakis, Polydoros
    Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece.
    Nassyr, Stepan
    Jülich Supercomputing Centre, Novel System Architectures Design, Forschungszentrum Jülich GmbH, Jülich, Germany.
    Marazakis, Manolis
    Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece.
    Dolbeau, Romain
    SiPearl, Rennes, France.
    Cifuentes, Jorge Alejandro Nocua
    ATOS, Les Clayes-sous-Bois, France.
    Alvarez, Luis Bertran
    ATOS, Les Clayes-sous-Bois, France.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Suarez, Estela
    Jülich Supercomputing Centre, Novel System Architectures Design, Forschungszentrum Jülich GmbH, Jülich, Germany.
    COMPESCE: A Co-design Approach for Memory Subsystem Performance Analysis in HPC Many-Cores. 2023. In: Architecture of Computing Systems: 36th International Conference, ARCS 2023, Proceedings, Springer Nature, 2023, pp. 105-119. Conference paper (Refereed)
    Abstract [en]

    This paper explores memory subsystem design through gem5 simulations of a non-uniform memory access (NUMA) architecture with ARM cores that are equipped with vector engines and connected to a Network-on-Chip (NoC) following the Coherent Hub Interface (CHI) protocol. The study quantifies the benefits of vectorization, prefetching, and multichannel NoC configurations using a benchmark for generating memory patterns and indexed accesses. The outcomes provide insights into improving bus utilization and bandwidth and reducing stalls in the system. The paper proposes hardware/software (HW/SW) advancements to reach and use the HBM device at more than 80% utilization at the memory controllers in the simulated manycore system.

  • 10. Qasem, Apan
    Texas State Univ, Dept Comp Sci, San Marcos, TX 78666 USA.
    Anzt, Hartwig
    Karlsruhe Inst Technol KIT, Karlsruhe, Germany; Univ Tennessee UTK, Knoxville, TN USA.
    Ayguade, Eduard
    Barcelona Supercomp Ctr, Barcelona, Spain; Univ Politecn Cataluna, Barcelona, Spain.
    Cahill, Katharine
    Ohio Supercomp Ctr, Columbus, OH USA.
    Canal, Ramon
    Barcelona Supercomp Ctr, Barcelona, Spain; Univ Politecn Cataluna, Barcelona, Spain.
    Chan, Jany
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Fosler-Lussier, Eric
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Goebel, Fritz
    Karlsruhe Inst Technol KIT, Karlsruhe, Germany.
    Jain, Arpan
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Koch, Marcel
    Karlsruhe Inst Technol KIT, Karlsruhe, Germany.
    Kuzak, Mateusz
    Netherlands eSci Ctr, Amsterdam, Netherlands.
    Llosa, Josep
    Barcelona Supercomp Ctr, Barcelona, Spain; Univ Politecn Cataluna, Barcelona, Spain.
    Machiraju, Raghu
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Martorell, Xavier
    Barcelona Supercomp Ctr, Barcelona, Spain; Univ Politecn Cataluna, Barcelona, Spain.
    Nayak, Pratik
    Karlsruhe Inst Technol KIT, Karlsruhe, Germany.
    Oottikkal, Shameema
    Ostasz, Marcin
    ETP4HPC, Amsterdam, Netherlands.
    Panda, Dhabaleswar K.
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Ramnath, Rajiv
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Sancho, Maria-Ribera
    Sclocco, Alessio
    Netherlands eSci Ctr, Amsterdam, Netherlands.
    Shafi, Aamir
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Spreeuw, Hanno
    Netherlands eSci Ctr, Amsterdam, Netherlands.
    Subramoni, Hari
    Ohio State Univ, Coll Engn, Columbus, OH USA.
    Tomko, Karen
    Netherlands eSci Ctr, Amsterdam, Netherlands.
    Lightning Talks of EduHPC 2022. 2022. In: 2022 IEEE/ACM International Workshop on Education for High Performance Computing (EduHPC), Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 42-49. Conference paper (Refereed)
    Abstract [en]

    The lightning talks at EduHPC provide an opportunity to share early results and insights on parallel and distributed computing (PDC) education and training efforts. The four lightning talks at EduHPC 2022 cover a range of topics in broadening PDC education: (i) curriculum development efforts for the European Masters in HPC program, (ii) bootcamps for CI professionals who support the running of AI workloads on HPC systems, (iii) a GPU programming course following the Carpentries model and (iv) peer-review assignments to help students write efficient parallel algorithms within sustainable software libraries.

  • 11. Smail, R. E.
    CSSM, Department of Physics, University of Adelaide, Adelaide, South Australia 5005, Australia.
    Batelaan, M.
    CSSM, Department of Physics, University of Adelaide, Adelaide, South Australia 5005, Australia.
    Horsley, R.
    School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3FD, United Kingdom.
    Nakamura, Y.
    RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan.
    Perlt, H.
    Institut für Theoretische Physik, Universität Leipzig, 04103 Leipzig, Germany.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Rakow, P. E. L.
    Theoretical Physics Division, Department of Mathematical Sciences, University of Liverpool, Liverpool L69 3BX, United Kingdom.
    Schierholz, G.
    Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany.
    Stüben, H.
    Universität Hamburg, Regionales Rechenzentrum, 20146 Hamburg, Germany.
    Young, R. D.
    CSSM, Department of Physics, University of Adelaide, Adelaide, South Australia 5005, Australia; Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
    Zanotti, J. M.
    CSSM, Department of Physics, University of Adelaide, Adelaide, South Australia 5005, Australia.
    Constraining beyond the standard model nucleon isovector charges. 2023. In: Physical Review D: covering particles, fields, gravitation, and cosmology, ISSN 2470-0010, E-ISSN 2470-0029, Vol. 108, no. 9, article id 094511. Journal article (Refereed)
    Abstract [en]

    At the TeV scale, low-energy precision observations of neutron characteristics provide unique probes of novel physics. Precision studies of neutron decay observables are susceptible to beyond the Standard Model (BSM) tensor and scalar interactions, while the neutron electric dipole moment, d_n, also has high sensitivity to new BSM CP-violating interactions. To fully utilize the potential of future experimental neutron physics programs, matrix elements of appropriate low-energy effective operators within neutron states must be precisely calculated. We present results from the QCDSF/UKQCD/CSSM Collaboration for the isovector charges g_T, g_A and g_S of the nucleon, Σ and Ξ baryons using lattice QCD methods and the Feynman-Hellmann theorem. We use a flavor symmetry breaking method to systematically approach the physical quark mass using ensembles that span five lattice spacings and multiple volumes. We extend this existing flavor-breaking expansion to also account for lattice spacing and finite volume effects in order to quantify all systematic uncertainties. Our final estimates of the nucleon isovector charges are g_T = 1.010(21)_stat(12)_sys, g_A = 1.253(63)_stat(41)_sys and g_S = 1.08(21)_stat(03)_sys, renormalized, where appropriate, at μ = 2 GeV in the MS-bar scheme.

  • 12. Smail, R. E.
    Horsley, R.
    Nakamura, Y.
    Perlt, H.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Rakow, P. E. L.
    Schierholz, G.
    Stüben, H.
    Young, R. D.
    Zanotti, J. M.
    Tensor Charges and their Impact on Physics Beyond the Standard Model. 2022. In: Proceedings of Science, Sissa Medialab Srl, 2022. Conference paper (Refereed)
    Abstract [en]

    The nucleon tensor charge, g_T, is an important quantity in the search for beyond the Standard Model tensor interactions in neutron and nuclear β-decays as well as the contribution of the quark electric dipole moment (EDM) to the neutron EDM. We present results from the QCDSF/UKQCD/CSSM collaboration for the tensor charge, g_T, using lattice QCD methods and the Feynman-Hellmann theorem. We use a flavour symmetry breaking method to systematically approach the physical quark mass using ensembles that span three lattice spacings.

  • 13. Vincent, Jonathan
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Gong, Jing
    Uppsala University.
    Karp, Martin
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Peplinski, Adam
    KTH, School of Engineering Sciences (SCI), Engineering Mechanics, Fluid Mechanics and Engineering Acoustics.
    Jansson, Niclas
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Podobas, Artur
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Jocksch, Andreas
    CSCS - Swiss National Supercomputing Centre.
    Yao, Jie
    Texas Tech University.
    Hussain, Fazle
    Texas Tech University.
    Markidis, Stefano
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Karlsson, Matts
    Linköping University.
    Pleiter, Dirk
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Laure, Erwin
    Max Planck Computing and Data Facility.
    Schlatter, Philipp
    KTH, School of Engineering Sciences (SCI), Engineering Mechanics, Fluid Mechanics and Engineering Acoustics.
    Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems. 2022. In: HPCAsia2022: International Conference on High Performance Computing in Asia-Pacific Region, Association for Computing Machinery (ACM), 2022, pp. 94-102. Conference paper (Refereed)
    Abstract [en]

    We present new results on the strong parallel scaling for the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case considered consists of a direct numerical simulation of fully-developed turbulent flow in a straight pipe, at two different Reynolds numbers Re_τ = 360 and Re_τ = 550, based on friction velocity and pipe radius. The strong scaling is tested on several GPU-enabled HPC systems, including the Swiss Piz Daint system, TACC's Longhorn, Jülich's JUWELS Booster, and Berzelius in Sweden. The performance results show that a speed-up of between 3 and 5 can be achieved using the GPU-accelerated version compared with the CPU version on these different systems. The run-time for 20 timesteps decreases from 43.5 to 13.2 seconds when the number of GPUs is increased from 64 to 512 for the Re_τ = 550 case on the JUWELS Booster system. This illustrates the potential of the GPU-accelerated version for high throughput. At the same time, the strong scaling limit is significantly larger for GPUs, at about 2000 - 5000 elements per rank, compared to about 50 - 100 for a CPU rank.
