Publications (10 of 41)
D’Orto, M., Sjöblom, S., Chien, L. S., Axner, L. & Gong, J. (2021). Comparing Different Approaches for Solving Large Scale Power-flow Problems with the Newton-Raphson Method. IEEE Access, 9, 56604-56615
2021 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 9, p. 56604-56615. Article in journal (Refereed), Published
Abstract [en]

This paper focuses on using the Newton-Raphson method to solve power-flow problems. Since the most computationally demanding part of the Newton-Raphson method is solving the linear equations at each iteration, this study investigates different approaches to solving these linear equations on both the central processing unit (CPU) and the graphics processing unit (GPU). Six different approaches have been developed and evaluated in this paper: two run entirely on the CPU, two run entirely on the GPU, and the remaining two are hybrid approaches that run on both CPU and GPU. All six direct linear solvers use either LU or QR factorization to solve the linear equations. Two different hardware platforms have been used to conduct the experiments. The performance results show that the CPU version with LU factorization performs better than the GPU version using the standard library cuSOLVER, even for the larger power-flow problems. Moreover, the best performance is achieved using a hybrid method where the Jacobian matrix is assembled on the GPU, the preprocessing with a sparse high-performance linear solver called KLU is performed on the CPU in the first iteration, and the linear equation is factorized on the GPU and solved on the CPU. The maximum speedup in this study is obtained on the largest case, with 25000 buses. The hybrid version shows a speedup factor of 9.6 with an NVIDIA P100 GPU and 13.1 with an NVIDIA V100 GPU in comparison with the baseline CPU version on an Intel Xeon Gold 6132 CPU.
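
The iteration described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the two-bus network, the values `Y_LINE` and `S_SPEC`, and the function names are assumptions chosen for illustration, and the dense `np.linalg.solve` call (LU factorization via LAPACK) stands in for the sparse KLU and cuSOLVER solvers the paper compares.

```python
import numpy as np

def newton_raphson(mismatch, jacobian, x0, tol=1e-10, max_iter=20):
    """Solve mismatch(x) = 0 and return (x, number of iterations)."""
    x = np.asarray(x0, dtype=float)
    for iteration in range(max_iter):
        f = mismatch(x)
        if np.max(np.abs(f)) < tol:
            return x, iteration
        # The linear solve J dx = -f is the expensive step for large systems.
        # np.linalg.solve performs a dense LU factorization (LAPACK gesv).
        x = x + np.linalg.solve(jacobian(x), -f)
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical two-bus case in per unit: a slack bus held at 1.0 with angle 0,
# and one load bus connected through a line with series admittance Y_LINE.
Y_LINE = 1.0 / (0.01 + 0.1j)
S_SPEC = -(0.5 + 0.2j)  # complex power injected at the load bus (minus the load)

def mismatch(x):
    """Active and reactive power mismatch at the load bus; x = [theta, |V|]."""
    theta, v = x
    s_calc = np.conj(Y_LINE) * (v**2 - v * np.exp(1j * theta))
    ds = s_calc - S_SPEC
    return np.array([ds.real, ds.imag])

def jacobian(x):
    """Analytic 2x2 Jacobian of the mismatch with respect to [theta, |V|]."""
    theta, v = x
    ds_dtheta = np.conj(Y_LINE) * (-1j * v * np.exp(1j * theta))
    ds_dv = np.conj(Y_LINE) * (2 * v - np.exp(1j * theta))
    return np.array([[ds_dtheta.real, ds_dv.real],
                     [ds_dtheta.imag, ds_dv.imag]])

# Flat start: angle 0, voltage magnitude 1.0.
solution, iterations = newton_raphson(mismatch, jacobian, x0=[0.0, 1.0])
print(f"theta = {solution[0]:.4f} rad, |V| = {solution[1]:.4f} pu, "
      f"iterations = {iterations}")
```

For a realistic grid the Jacobian is large and sparse, which is why the choice of linear solver and of where each step runs (CPU or GPU) dominates the run time.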

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
National Category
Computational Mathematics Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-292698 (URN); 10.1109/ACCESS.2021.3072338 (DOI); 000641940300001 (); 2-s2.0-85104180269 (Scopus ID)
Note

QC 20210427

Available from: 2021-04-17. Created: 2021-04-17. Last updated: 2024-03-15. Bibliographically approved
Zhang, M., Gong, J., Axner, L. & Barth, M. (2020). Automation of High-Fidelity CFD Analysis for Aircraft Design and Optimization Aided by HPC. In: Proceeding of 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). Paper presented at 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2020, Västerås, Sweden, March 11-13, 2020 (pp. 395-399). Institute of Electrical and Electronics Engineers (IEEE)
2020 (English). In: Proceeding of 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Institute of Electrical and Electronics Engineers (IEEE), 2020, p. 395-399. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, an automation process to perform Reynolds-Averaged Navier-Stokes (RANS) computational fluid dynamics (CFD) analysis is developed to carry out aerodynamic design and optimization. The aircraft model/geometry is defined by a Common Parametric Aircraft Configuration Schema (CPACS) file, and the analyses are facilitated using high performance computers (HPC). As the computational capability of the available HPC systems is a limiting factor in the complexity of analyses that can be performed, a detailed performance analysis of the open source CFD code SU2 is undertaken and the profiling and performance analyses for large simulations are carried out.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-276248 (URN); 10.1109/PDP50117.2020.00067 (DOI); 000582555800060 (); 2-s2.0-85085483221 (Scopus ID)
Conference
28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2020, Västerås, Sweden, March 11-13, 2020
Note

QC 20200610

Available from: 2020-06-10. Created: 2020-06-10. Last updated: 2023-03-30. Bibliographically approved
Marco, K., Gong, J., Axner, L., Laure, E. & Jan, N. (2020). GPU-acceleration of A High Order Finite Difference Code Using Curvilinear Coordinates. In: Proceedings of the 2020 International Conference on Computing, Networks and Internet of Things. Paper presented at the 2020 International Conference on Computing, Networks and Internet of Things (pp. 41-47). Association for Computing Machinery (ACM)
2020 (English). In: Proceedings of the 2020 International Conference on Computing, Networks and Internet of Things, Association for Computing Machinery (ACM), 2020, p. 41-47. Conference paper, Published paper (Refereed)
Abstract [en]

GPU-accelerated computing is becoming a popular technology due to the emergence of techniques such as OpenACC, which make it easy to port codes in their original form to GPU systems using compiler directives, thereby reducing computation times with relatively little effort. In this study we have developed an OpenACC implementation of the high-order finite difference CFD solver ESSENSE for simulating compressible flows. The solver is based on summation-by-parts difference operators, and the boundary and interface conditions are weakly implemented using simultaneous approximation terms. This case study focuses on porting to GPUs the most time-consuming parts of the code, namely the sparse matrix-vector multiplications and the evaluation of fluxes. The resulting OpenACC implementation is used to simulate the Taylor-Green vortex, producing a maximum speedup of 61.3 on a single V100 GPU compared to the serial CPU version.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2020
Keywords
Computational fluid dynamics, GPU programming, High order finite difference method, OpenACC
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-273805 (URN); 10.1145/3398329.3398336 (DOI); 2-s2.0-85086223863 (Scopus ID)
Conference
the 2020 International Conference on Computing, Networks and Internet of Things
Note

QC 20200819

Available from: 2020-06-26. Created: 2020-06-26. Last updated: 2023-03-30. Bibliographically approved
Zhang, M., Gong, J. & Axner, L. (2020). HPC-Enabled Aerodynamic Optimization Studies Using CFD and Design Suite SU2. In: Proceeding of the Work in Progress Session held in connection with the PDP 2020 Parallel, Distributed, and Network-Based Processing. Paper presented at PDP.
2020 (English). In: Proceeding of the Work in Progress Session held in connection with the PDP 2020 Parallel, Distributed, and Network-Based Processing, 2020. Conference paper, Published paper (Refereed)
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-271128 (URN)
Conference
PDP
Note

QC 20200529

Available from: 2020-03-18. Created: 2020-03-18. Last updated: 2024-03-15. Bibliographically approved
Otero, E., Gong, J., Min, M., Fischer, P., Schlatter, P. & Laure, E. (2019). OpenACC acceleration for the PN-PN-2 algorithm in Nek5000. Journal of Parallel and Distributed Computing, 132, 69-78
2019 (English). In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 132, p. 69-78. Article in journal (Refereed), Published
Abstract [en]

Due to its high performance and throughput capabilities, GPU-accelerated computing is becoming a popular technology in scientific computing, in particular using programming models such as CUDA and OpenACC. The main advantage of OpenACC is that it allows codes to be ported in their "original" form to GPU systems through compiler directives, thus allowing an incremental approach. An OpenACC implementation is applied to the CFD code Nek5000 for simulation of incompressible flows, based on the spectral-element method. The work follows up on previous implementations and now focuses on the PN-PN-2 method for the spatial discretization of the Navier-Stokes equations. Performance results of the ported code show a speedup of up to 3.1 on multiple GPUs for a polynomial order N > 11.

Place, publisher, year, edition, pages
Academic Press, 2019
Keywords
Nek5000; OpenACC; GPU programming; Spectral element method; High performance computing
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-253811 (URN); 10.1016/j.jpdc.2019.05.010 (DOI); 000476580400006 (); 2-s2.0-85066835225 (Scopus ID)
Funder
EU, Horizon 2020; Swedish e‐Science Research Center; Swedish Foundation for Strategic Research
Note

QC 20190625

Available from: 2019-06-18. Created: 2019-06-18. Last updated: 2022-06-26. Bibliographically approved
Eliasson, P., Gong, J. & Nordström, J. (2018). A stable and conservative coupling of the unsteady compressible Navier-Stokes equations at interfaces using finite difference and finite volume methods. In: AIAA Aerospace Sciences Meeting, 2018. Paper presented at AIAA Aerospace Sciences Meeting, 2018, Kissimmee, United States, 8 January 2018 through 12 January 2018. American Institute of Aeronautics and Astronautics Inc, AIAA (210059)
2018 (English). In: AIAA Aerospace Sciences Meeting, 2018, American Institute of Aeronautics and Astronautics Inc, AIAA, 2018, no 210059. Conference paper, Published paper (Refereed)
Abstract [en]

Stable and conservative interface boundary conditions are developed for the unsteady compressible Navier-Stokes equations using finite difference and finite volume methods. The finite difference approach is based on summation-by-parts operators and can be made higher-order accurate with boundary conditions imposed weakly. The finite volume approach is an edge- and dual grid-based approach for unstructured grids, formally second-order accurate in space, also with weak boundary conditions. Stable and conservative weak boundary conditions are derived for interfaces between finite difference methods, for finite volume methods and for the coupling between the two approaches. The three types of interface boundary conditions are demonstrated for two test cases. Firstly, inviscid vortex propagation with a known analytical solution is considered. The results show the expected error decay as the grid is refined for various couplings and spatial accuracies of the finite difference scheme. The second test case involves viscous laminar flow over a cylinder with vortex shedding. Calculations with various couplings and spatial accuracies of the finite difference solver show that the couplings work as expected and that the higher-order finite difference schemes provide enhanced vortex propagation.
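
The summation-by-parts (SBP) property mentioned in the abstract can be made concrete with a small sketch. The code below builds the standard second-order accurate SBP first-derivative operator on a uniform grid. It is an illustration only, not the higher-order operators or the solver used in the paper, and the function name `sbp_first_derivative` is an assumption. The operator is D = H⁻¹Q, where H is a diagonal quadrature matrix and Q + Qᵀ = diag(-1, 0, ..., 0, 1), which mimics integration by parts discretely.

```python
import numpy as np

def sbp_first_derivative(n, h):
    """Return (H, Q, D) for the second-order SBP first-derivative operator
    on n uniformly spaced grid points with spacing h."""
    # Diagonal norm (quadrature) matrix: trapezoidal weights.
    H = h * np.eye(n)
    H[0, 0] = H[-1, -1] = h / 2
    # Q is skew-symmetric in the interior (central differences), with
    # boundary corrections that give Q + Q^T = diag(-1, 0, ..., 0, 1).
    Q = 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1))
    Q[0, 0] = -0.5
    Q[-1, -1] = 0.5
    D = np.linalg.solve(H, Q)  # D = H^{-1} Q
    return H, Q, D

n = 11
x = np.linspace(0.0, 1.0, n)
H, Q, D = sbp_first_derivative(n, x[1] - x[0])

# Boundary matrix expected from the SBP property.
B = np.zeros((n, n))
B[0, 0], B[-1, -1] = -1.0, 1.0

u, v = np.sin(x), np.cos(x)
# Discrete integration by parts: (u, Dv)_H + (Du, v)_H = u_N v_N - u_0 v_0.
lhs = u @ H @ (D @ v) + (D @ u) @ H @ v
rhs = u[-1] * v[-1] - u[0] * v[0]
print("SBP property holds:", np.allclose(Q + Q.T, B))
print("integration by parts residual:", abs(lhs - rhs))
```

The identity holds to rounding error for any grid functions `u` and `v`, which is what allows energy estimates, and hence stability proofs, to carry over from the continuous problem to the discretization.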

Place, publisher, year, edition, pages
American Institute of Aeronautics and Astronautics Inc, AIAA, 2018
National Category
Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-225496 (URN); 10.2514/6.2018-0597 (DOI); 2-s2.0-85141554093 (Scopus ID); 9781624105241 (ISBN)
Conference
AIAA Aerospace Sciences Meeting, 2018, Kissimmee, United States, 8 January 2018 through 12 January 2018
Funder
Swedish e‐Science Research Center; The Swedish Foundation for International Cooperation in Research and Higher Education (STINT); VINNOVA
Note

QC 20180406

Available from: 2018-04-06. Created: 2018-04-06. Last updated: 2023-06-08. Bibliographically approved
Larsson, T., Hammar, J., Gong, J., Barth, M. & Axner, L. (2018). ENHANCING COMPUTATIONAL AERO-ACOUSTIC PROCESSES FOR GROUND VEHICLES RESOLVING OPEN SOURCE CFD. In: The 13th OpenFOAM Workshop. Paper presented at The 13th OpenFOAM Workshop (pp. 1-4).
2018 (English). In: The 13th OpenFOAM Workshop, 2018, p. 1-4. Conference paper, Oral presentation with published abstract (Refereed)
National Category
Fluid Mechanics and Acoustics
Identifiers
urn:nbn:se:kth:diva-232361 (URN)
Conference
The 13th OpenFOAM Workshop
Note

QC 20180821

Available from: 2018-07-20. Created: 2018-07-20. Last updated: 2024-03-15. Bibliographically approved
Zhang, M., Melin, T., Gong, J., Barth, M. & Axner, L. (2018). Mixed Fidelity Aerodynamic and Aero-Structural Optimization for Wings. In: 2018 International Conference on High Performance Computing & Simulation. Paper presented at Conference: HPC and Modeling & Simulation for the 21st Century, At Orléans, France (pp. 476-483).
2018 (English). In: 2018 International Conference on High Performance Computing & Simulation, 2018, p. 476-483. Conference paper, Published paper (Refereed)
Abstract [en]

Automatic multidisciplinary design optimization is one of the challenges faced in designing efficient wings for aircraft. In this paper we present mixed fidelity aerodynamic and aero-structural optimization methods for designing wings. A novel shape design methodology has been developed, based on a mix of the automatic aerodynamic optimization for a reference aircraft model and the aero-structural optimization for an uninhabited air vehicle (UAV) with a high aspect ratio wing. This paper is a significant step towards making it possible to perform, in a fully automatic manner, all the core processes for aerodynamic and aero-structural optimization that require special skills. This covers all the processes from creating the mesh for the wing simulation to executing the high-fidelity computational fluid dynamics (CFD) analysis code. Our results confirm that the simulation tools can make it possible for a far broader range of engineering researchers and developers to design aircraft in much simpler and more efficient ways. This is a vital step in the evolution of wing design processes, as it means that the extremely expensive laboratory experiments traditionally used when designing wings can now be replaced with more cost-effective high performance computing (HPC) simulations that utilize accurate numerical methods.

Keywords
Multidisciplinary design optimization (MDO); Computational fluid dynamics (CFD); High performance computing
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-232360 (URN); 10.1109/HPCS.2018.00081 (DOI); 000450677700064 (); 2-s2.0-85057381095 (Scopus ID); 978-1-5386-7877-0 (ISBN)
Conference
Conference: HPC and Modeling & Simulation for the 21st Century, At Orléans, France
Funder
Swedish e‐Science Research Center
Note

QC 20180808

Available from: 2018-07-20. Created: 2018-07-20. Last updated: 2024-03-15. Bibliographically approved
Otero, E., Gong, J., Min, M., Fischer, P., Schlatter, P. & Laure, E. (2018). OpenACC accelerator for the Pn-Pn-2 algorithm in Nek5000. In: Proceedings of the 5th International Conference on Exascale Applications and Software. Paper presented at The 5th International Conference on Exascale Applications and Software, 17th to 19th April 2018 in Edinburgh, Scotland.
2018 (English). In: Proceedings of the 5th International Conference on Exascale Applications and Software, 2018. Conference paper, Oral presentation with published abstract (Refereed)
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-232362 (URN); 978-0-9926615-3-3 (ISBN)
Conference
The 5th International Conference on Exascale Applications and Software, 17th to 19th April 2018 in Edinburgh, Scotland
Note

QC 20180725

Available from: 2018-07-20. Created: 2018-07-20. Last updated: 2024-03-15. Bibliographically approved
Otten, M., Gong, J., Mametjanov, A., Vose, A., Levesque, J., Fischer, P. & Min, M. (2016). An MPI/OpenACC implementation of a high-order electromagnetics solver with GPUDirect communication. The international journal of high performance computing applications, 30(3), 320-334
2016 (English). In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 30, no 3, p. 320-334. Article in journal (Refereed), Published
Abstract [en]

We present performance results and an analysis of a message passing interface (MPI)/OpenACC implementation of an electromagnetic solver based on a spectral-element discontinuous Galerkin discretization of the time-dependent Maxwell equations. The OpenACC implementation covers all solution routines, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest neighbor flux exchanges. Modifications are designed to make effective use of vectorization, streaming, and data management. Performance results using up to 16,384 graphics processing units of the Cray XK7 supercomputer Titan show more than 2.5x speedup over central processing unit-only performance on the same number of nodes (262,144 MPI ranks) for problem sizes of up to 6.9 billion grid points. We discuss performance-enhancement strategies and the overall potential of GPU-based computing for this class of problems.

Place, publisher, year, edition, pages
Sage Publications, 2016
Keywords
Hybrid MPI, OpenACC, GPUDirect, spectral element-discontinuous Galerkin
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-194020 (URN); 10.1177/1094342015626584 (DOI); 000382958000005 (); 2-s2.0-84983414976 (Scopus ID)
Funder
Swedish e‐Science Research Center
Note

QC 20161017

Available from: 2016-10-17. Created: 2016-10-14. Last updated: 2024-03-15. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-3859-9480
