Change search
ReferencesLink to record
Permanent link

Direct link
Improving Parallel Performance of FEniCS Finite Element Computations by Hybrid MPI/PGAS
KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).ORCID iD: 0000-0002-5020-1631
KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).ORCID iD: 0000-0003-4256-0463
(English)Manuscript (preprint) (Other academic)
Abstract [en]

We present our work on developing a hybrid parallel programming model for a general finite element solver. The main focus of our work is to demonstrate that legacy codes with high latency, two-sided communication in the form of message passing can be improved using lightweight one-sided communication. We introduce a new hybrid MPI/PGAS implementation of the open source general finite element framework FEniCS, replacing the linear algebra backend (PETSc) with a new library written in UPC. A detailed description of the linear algebra backend implementation and the hybrid interface to FEniCS is given. We also present a detailed analysis of the performance of this hybrid solver on the Cray XE6 Lindgren at PDC/KTH including a comparison with the MPI only implementation, where we find that the hybrid implementation results in significant improvements in performance of the solver.

National Category
Computational Mathematics Computer Science
URN: urn:nbn:se:kth:diva-125617OAI: diva2:640285

QS 2013

Available from: 2013-08-13 Created: 2013-08-13 Last updated: 2013-08-16Bibliographically approved
In thesis
1. High Performance Adaptive Finite Element Methods: With Applications in Aerodynamics
Open this publication in new window or tab >>High Performance Adaptive Finite Element Methods: With Applications in Aerodynamics
2013 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The massive computational cost for resolving all scales in a turbulent flow makes a direct numerical simulation of the underlying Navier-Stokes equations impossible in most engineering applications. Recent advances in adaptive finite element methods offer a new powerful tool in Computational Fluid Dynamics (CFD). The computational cost for simulating turbulent flow can be minimized by adaptively resolution of the mesh, based on a posteriori error estimation. Such adaptive methods have previously been implemented for efficient serial computations, but the extension to an efficient parallel solver is a challenging task. This work concerns the development of an adaptive finite element method that enables efficient computation of time resolved approximations of turbulent flow for complex geometries with a posteriori error control. We present efficient data structures and data decomposition methods for distributed unstructured tetrahedral meshes. Our work also concerns an efficient parallelization of local mesh refinement methods such as recursive longest edge bisection, and the development of an a priori predictive dynamic load balancing method, based on a weighted dual graph. We also address the challenges of emerging supercomputer architectures with the development of new hybrid parallel programming models, combining traditional message passing with lightweight one-sided communication. Our implementation has proven to be both general and efficient, scaling up to more than twelve thousands cores.

Abstract [sv]

Den höga beräkningskostnaden för att lösa upp alla turbulenta skalor för ett realistiskt problem gör en direkt numerisk simulering av Navier-Stokes ekvationer omöjlig. De senaste framstegen inom adaptiva finita element metoder ger ett nytt kraftfullt verktyg inom Computational Fluid Dynamics (CFD). Beräkningskostnaden för en simulering av turbulent flöde kan minimeras genom att beräkningsnätet adaptivt förfinas baserat på en a posteriori feluppskattning. Dessa adaptiva metoder har tidigare implementerats för seriella beräkningar, medan en effektiv parallellisering av metoden inte är trivial. I denna avhandling presenterar vi vår utveckling av en adaptiv finita element lösare, anpassad för att effektivt beräkna tidsupplösta approximationer i komplicerade geometrier med a posteriori felkontroll. Effektiva datastrukturer och metoder för ostrukturerade beräkningsnät av tetrahedrar presenteras. Avhandlingen behandlar även effektiv parallellisering av lokala nätförfiningsmetoder, exempelvis recursive longest edge bisection. Även lastbalanseringsproblematiken behandlas, där problemet lösts genom utvecklandet av en prediktiv dynamisk lastbalanseringsmetod, baserad på en viktad dualgraf av beräkningsnätet. Slutligen avhandlas även problematiken med att effektivt utnyttja nytillkomna superdatorarkitekturer, genom utvecklandet av en hybrid parallelliserings modell som kombinerar traditionell meddelande baserad parallellisering med envägskommunikation. Detta har resulterat i en generell samt effektiv implementation med god skalning upp till fler än tolv tusen processorkärnor.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2013. xii, 50 p.
TRITA-CSC-A, ISSN 1653-5723 ; 2013:07
National Category
Computational Mathematics
urn:nbn:se:kth:diva-125742 (URN)978-91-7501-814-0 (ISBN)
Public defence
2013-09-11, F3, Lindstedtsvägen 26, KTH, Stockholm, 10:15 (English)

QC 20130816

Available from: 2013-08-16 Created: 2013-08-13 Last updated: 2016-02-02Bibliographically approved

Open Access in DiVA

No full text

Search in DiVA

By author/editor
Jansson, NiclasHoffman, Johan
By organisation
High Performance Computing and Visualization (HPCViz)
Computational MathematicsComputer Science

Search outside of DiVA

GoogleGoogle Scholar
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 66 hits
ReferencesLink to record
Permanent link

Direct link