kth.se Publications
  • 1. Bongo, Lars Ailo
    et al.
    Ciegis, Raimondas
    Frasheri, Neki
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Kimovski, Dragi
    Kropf, Peter
    Margenov, Svetozar
    Mihajlovic, Milan
    Neytcheva, Maya
    Rauber, Thomas
    Rünger, Gudula
    Trobec, Roman
    Wuyts, Roel
    Wyrzykowski, Roman
    Applications for Ultrascale Computing. 2015. In: Supercomputing Frontiers and Innovations, ISSN 2409-6008, Vol. 2, no. 1, pp. 19-48. Journal article (Refereed)
    Abstract [en]

    Studies of complex physical and engineering systems, represented by multi-scale and multi-physics computer simulations, have an increasing demand for computing power, especially when simulations of realistic problems are considered. This demand is driven by the increasing size and complexity of the studied systems or by time constraints. Ultrascale computing systems offer a possible solution to this problem. Future ultrascale systems will be large-scale complex computing systems combining technologies from high performance computing, distributed systems, big data, and cloud computing. Thus, the challenge of developing and programming complex algorithms on these systems is twofold. Firstly, the complex algorithms have to be either developed from scratch or redesigned in order to yield high performance while retaining correct functional behaviour. Secondly, ultrascale computing systems impose a number of non-functional cross-cutting concerns, such as fault tolerance or energy consumption, which can significantly impact the deployment of applications on large complex systems. This article discusses the state of the art of programming for current and future large-scale systems with an emphasis on complex applications. We derive a number of programming and execution support requirements by studying several computing applications that the authors are currently developing and discuss their potential and necessary upgrades for ultrascale execution.

  • 2.
    D’Orto, Manolo
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Sjöblom, Svante
    Chien, Lung Sheng
    Axner, Lilit
    ENCCS, Uppsala University.
    Gong, Jing
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Comparing Different Approaches for Solving Large Scale Power-flow Problems with the Newton-Raphson Method. 2021. In: IEEE Access, E-ISSN 2169-3536, Vol. 9, pp. 56604-56615. Journal article (Refereed)
    Abstract [en]

    This paper focuses on using the Newton-Raphson method to solve power-flow problems. Since the most computationally demanding part of the Newton-Raphson method is solving the linear equations at each iteration, this study investigates different approaches to solving the linear equations on both the central processing unit (CPU) and the graphics processing unit (GPU). Six different approaches have been developed and evaluated in this paper: two run entirely on the CPU, two run entirely on the GPU, and the remaining two are hybrid approaches that run on both CPU and GPU. All six direct linear solvers use either LU or QR factorization to solve the linear equations. Two different hardware platforms have been used to conduct the experiments. The performance results show that the CPU version with LU factorization performs better than the GPU version using the standard library cuSOLVER, even for the larger power-flow problems. Moreover, the results show that the best performance is achieved using a hybrid method where the Jacobian matrix is assembled on the GPU, the preprocessing with a sparse high-performance linear solver called KLU is performed on the CPU in the first iteration, and the linear equation is factorized on the GPU and solved on the CPU. The maximum speedup in this study is obtained on the largest case with 25000 buses. The hybrid version shows a speedup factor of 9.6 with an NVIDIA P100 GPU and 13.1 with an NVIDIA V100 GPU in comparison with the baseline CPU version on an Intel Xeon Gold 6132 CPU.

  • 3.
    Efraimsson, Gunilla
    et al.
    KTH, School of Engineering Sciences (SCI), Aeronautical and Vehicle Engineering, Aeroacoustics.
    Gong, Jing
    IT-department, Uppsala University.
    Svärd, Magnus
    Stanford University, Stanford, USA.
    Nordström, Jan
    IT-department, Uppsala University.
    An Investigation of the Performance of a High-Order Accurate Navier-Stokes Code. 2006. In: ECCOMAS CFD 2006, 2006, pp. 11-. Conference paper (Refereed)
  • 4. Eliasson, P.
    et al.
    Gong, Jing
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Nordström, J.
    A stable and conservative coupling of the unsteady compressible Navier-Stokes equations at interfaces using finite difference and finite volume methods. 2018. In: AIAA Aerospace Sciences Meeting, 2018, American Institute of Aeronautics and Astronautics Inc, AIAA, 2018, no. 210059. Conference paper (Refereed)
    Abstract [en]

    Stable and conservative interface boundary conditions are developed for the unsteady compressible Navier-Stokes equations using finite difference and finite volume methods. The finite difference approach is based on summation-by-parts operators and can be made higher order accurate with boundary conditions imposed weakly. The finite volume approach is an edge- and dual grid-based approach for unstructured grids, formally second order accurate in space, with weak boundary conditions as well. Stable and conservative weak boundary conditions are derived for interfaces between finite difference methods, for finite volume methods and for the coupling between the two approaches. The three types of interface boundary conditions are demonstrated for two test cases. Firstly, inviscid vortex propagation with a known analytical solution is considered. The results show the expected error decay as the grid is refined for various couplings and spatial accuracies of the finite difference scheme. The second test case involves viscous laminar flow over a cylinder with vortex shedding. Calculations with various couplings and spatial accuracies of the finite difference solver show that the couplings work as expected and that the higher order finite difference schemes provide enhanced vortex propagation.

  • 5. Eriksson, Sofia
    et al.
    Law, Craig
    Gong, Jing
    Uppsala Univ., IT dept.
    Nordström, Jan
    Shock Calculations using a Very High Order Accurate Euler and Navier-Stokes Solver. 2008. In: Proc. 6th South African Conference on Computational and Applied Mechanics, 2008, pp. 63-73. Conference paper (Refereed)
  • 6.
    Gong, Jing
    Uppsala University.
    Hybrid Methods for Unsteady Fluid Flow Problems in Complex Geometries. 2007. Doctoral thesis, comprehensive summary (Other academic)
  • 7.
    Gong, Jing
    et al.
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Hart, Alistair
    Cray Inc.
    Henty, David
    University of Edinburgh.
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Schliephake, Michael
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Fischer, Paul
    Argonne National Laboratory.
    Heisey, Katherine
    Argonne National Laboratory.
    OpenACC Acceleration of Nek5000: a Spectral Element Code. 2013. Conference paper (Other academic)
  • 8.
    Gong, Jing
    et al.
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Otten, Matthew
    Fischer, Paul
    Min, Misun
    Nekbone performance on GPUs with OpenACC and CUDA Fortran implementations. 2016. In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 72, no. 11, pp. 4160-4180. Journal article (Refereed)
    Abstract [en]

    We present a hybrid GPU implementation and performance analysis of Nekbone, which represents one of the core kernels of the incompressible Navier-Stokes solver Nek5000. The implementation is based on OpenACC and CUDA Fortran for local parallelization of the compute-intensive matrix-matrix multiplication part, which minimizes modifications to the existing CPU code while extending the simulation capability of the code to GPU architectures. Our discussion includes the GPU results of OpenACC interoperating with CUDA Fortran and the gather-scatter operations with GPUDirect communication. We demonstrate performance of up to 552 Tflops on 16,384 GPUs of the OLCF Cray XK7 Titan.

  • 9.
    Gong, Jing
    et al.
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Schliephake, Michael
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Cebamanos, Luis
    Hart, Alistair
    Min, Misun
    Fischer, Paul
    NekBone with Optimized OpenACC directives. 2015. Conference paper (Refereed)
    Abstract [en]

    Accelerators and, in particular, Graphics Processing Units (GPUs) have emerged as promising computing technologies which may be suitable for future Exascale systems. Here, we present performance results of NekBone, a benchmark of the Nek5000 code, implemented with optimized OpenACC directives and GPUDirect communications. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. An optimized NekBone version achieves 78 Gflops on a single node. In addition, a performance result of 609 Tflops has been reached on 16,384 GPUs of the Titan supercomputer at Oak Ridge National Laboratory.

     

  • 10.
    Gong, Jing
    et al.
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Schliephake, Michael
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Henningson, Dan
    KTH, School of Engineering Sciences (SCI), Mechanics, Stability, Transition and Control. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Schlatter, Philipp
    KTH, School of Engineering Sciences (SCI), Mechanics, Stability, Transition and Control. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Peplinski, Adam
    Hart, Alistair
    Doleschal, Jens
    Henty, David
    Fischer, Paul
    Nek5000 with OpenACC. 2015. In: Solving software challenges for exascale, 2015, pp. 57-68. Conference paper (Refereed)
    Abstract [en]

    Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flows. We follow up on an earlier study which ported the simplified version of Nek5000 to a GPU-accelerated system by presenting the hybrid CPU/GPU implementation of the full Nek5000 code using OpenACC. The matrix-matrix multiplication, the Nek5000 gather-scatter operator and a preconditioned Conjugate Gradient solver have been implemented using OpenACC for multi-GPU systems. We report a speed-up of 1.3 on a single node of a Cray XK6 when using OpenACC directives in Nek5000. On 512 nodes of the Titan supercomputer, the speed-up approaches 1.4. A performance analysis of the Nek5000 code using the Score-P and Vampir performance monitoring tools shows that overlapping GPU kernels with host-accelerator memory transfers would considerably increase the performance of the OpenACC version of the Nek5000 code.

  • 11.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Nordström, Jan
    A Stable and Efficient Hybrid Scheme for Viscous Problems in Complex Geometries. 2007. Report (Other academic)
  • 12.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Nordström, Jan
    KTH, School of Engineering Sciences (SCI), Aeronautical and Vehicle Engineering.
    A stable and efficient hybrid scheme for viscous problems in complex geometries. 2007. In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 226, no. 2, pp. 1291-1309. Journal article (Refereed)
    Abstract [en]

    In this paper, we present a stable hybrid scheme for viscous problems. The hybrid method combines the unstructured finite volume method with high-order finite difference methods on complex geometries. The coupling procedure between the two numerical methods is based on energy estimates and stable interface conditions are constructed. Numerical calculations show that the hybrid method is efficient and accurate.

  • 13.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Nordström, Jan
    A Stable Hybrid Method for Hyperbolic Problems. 2004. Report (Other academic)
  • 14.
    Gong, Jing
    et al.
    Uppsala University, Sweden.
    Nordström, Jan
    Linköping University, Sweden.
    Interface procedures for finite difference approximations of the advection-diffusion equation. 2011. In: Journal of Computational and Applied Mathematics, ISSN 0377-0427, Vol. 236, no. 5, pp. 602-620. Journal article (Refereed)
    Abstract [en]

    We investigate several existing interface procedures for finite difference methods applied to advection-diffusion problems. The accuracy, stiffness and reflecting properties of various interface procedures are investigated. The analysis and numerical experiments show that there are only minor differences between various methods once a proper parameter choice has been made.

  • 15.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Nordström, Jan
    Stable, Accurate and Efficient Interface Procedures for Viscous Problems. 2006. Report (Other academic)
  • 16.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Nordström, Jan
    van der Weide, Edwin
    A Hybrid Method for the Unsteady Compressible Navier-Stokes Equations. 2007. Report (Other academic)
  • 17.
    Gong, Jing
    et al.
    Uppsala Univ., IT dept.
    Svärd, Magnus
    Nordström, Jan
    Artificial Dissipation for Strictly Stable Finite Volume Methods on Unstructured Meshes. 2004. In: Computational Mechanics Abstracts: Volume II, 2004, p. 7. Conference paper (Refereed)
  • 18.
    Hess, Berk
    et al.
    KTH, School of Engineering Sciences (SCI), Applied Physics, Biophysics.
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Pall, Szilard
    KTH, School of Engineering Sciences (SCI), Applied Physics, Biophysics.
    Schlatter, Philipp
    KTH, School of Engineering Sciences (SCI), Mechanics. KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Engineering Sciences (SCI), Centres, Linné Flow Center, FLOW.
    Peplinski, Adam
    KTH, School of Engineering Sciences (SCI), Mechanics, Stability, Transition and Control.
    Highly Tuned Small Matrix Multiplications Applied to Spectral Element Code Nek5000. 2016. Conference paper (Refereed)
  • 19.
    Ivanov, Ilya
    et al.
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Akhmetova, Dana
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Peng, Ivy Bo
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Machado, Rui
    Rahn, Mirko
    Bartsch, Valeria
    Hart, Alistair
    Fischer, Paul
    Evaluation of Parallel Communication Models in Nekbone, a Nek5000 mini-application. 2015. In: 2015 IEEE International Conference on Cluster Computing, IEEE, 2015, pp. 760-767. Conference paper (Refereed)
    Abstract [en]

    Nekbone is a proxy application of Nek5000, a scalable Computational Fluid Dynamics (CFD) code used for modelling incompressible flows. The Nekbone mini-application is used by several international co-design centers to explore new concepts in computer science and to evaluate their performance. We present the design and implementation of a new communication kernel in the Nekbone mini-application with the goal of studying the performance of different parallel communication models. First, a new MPI blocking communication kernel has been developed to solve Nekbone problems in a three-dimensional Cartesian mesh and process topology. The new MPI implementation delivers a 13% performance improvement compared to the original implementation. The new MPI communication kernel consists of approximately 500 lines of code against the original 7,000 lines of code, allowing experimentation with new approaches in Nekbone parallel communication. Second, the MPI blocking communication in the new kernel was changed to MPI non-blocking communication. Third, we developed a new Partitioned Global Address Space (PGAS) communication kernel, based on the GPI-2 library. This approach reduces the synchronization among neighbor processes and is on average 3% faster than the new MPI-based non-blocking approach. In our tests on 8,192 processes, the GPI-2 communication kernel is 3% faster than the new MPI non-blocking communication kernel. In addition, we have used OpenMP in all versions of the new communication kernel. Finally, we highlight the future steps for using the new communication kernel in the parent application Nek5000.

  • 20.
    Ivanov, Ilya
    et al.
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Machado, Rui
    Rahn, Mirko
    Akhmetova, Dana
    KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Gong, Jing
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Schlatter, Philipp
    KTH, School of Engineering Sciences (SCI), Mechanics. KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Engineering Sciences (SCI), Centres, Linné Flow Center, FLOW.
    Henningson, Dan
    KTH, School of Engineering Sciences (SCI), Mechanics, Stability, Transition and Control. KTH, School of Engineering Sciences (SCI), Centres, Linné Flow Center, FLOW. KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Fischer, Paul
    Markidis, Stefano
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Evaluating New Communication Models in the Nek5000 Code for Exascale. 2015. Conference paper (Other academic)
  • 21. Larsson, Torbjörn
    et al.
    Hammar, Johan
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Barth, Michaela
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Axner, Lilit
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    ENHANCING COMPUTATIONAL AERO-ACOUSTIC PROCESSES FOR GROUND VEHICLES RESOLVING OPEN SOURCE CFD. 2018. In: The 13th OpenFOAM Workshop, 2018, pp. 1-4. Conference paper (Refereed)
  • 22. Kupiainen, Marco
    et al.
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Axner, Lilit
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Laure, Erwin
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for High Performance Computing, PDC.
    Nordström, Jan
    GPU-acceleration of A High Order Finite Difference Code Using Curvilinear Coordinates. 2020. In: Proceedings of the 2020 International Conference on Computing, Networks and Internet of Things, Association for Computing Machinery (ACM), 2020, pp. 41-47. Conference paper (Refereed)
    Abstract [en]

    GPU-accelerated computing is becoming a popular technology due to the emergence of techniques such as OpenACC, which makes it easy to port codes in their original form to GPU systems using compiler directives, thereby speeding up computation relatively simply. In this study we have developed an OpenACC implementation of the high order finite difference CFD solver ESSENSE for simulating compressible flows. The solver is based on summation-by-parts form difference operators, and the boundary and interface conditions are weakly implemented using simultaneous approximation terms. This case study focuses on porting the most time-consuming parts of the code to GPUs, namely the sparse matrix-vector multiplications and the evaluation of fluxes. The resulting OpenACC implementation is used to simulate the Taylor-Green vortex, producing a maximum speed-up of 61.3 on a single V100 GPU compared to the serial CPU version.

  • 23.
    Markidis, Stefano
    et al.
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Gong, Jing
    KTH, Centres, SeRC - Swedish e-Science Research Centre. KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz).
    Schliephake, Michael
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Laure, Erwin
    KTH, School of Computer Science and Communication (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centres, SeRC - Swedish e-Science Research Centre.
    Hart, Alistair
    Henty, David
    Heisey, Katherine
    Fischer, Paul
    OpenACC acceleration of the Nek5000 spectral element code. 2015. In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 29, no. 3, pp. 311-319. Journal article (Refereed)
    Abstract [en]

    We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems, and indicated possible kernel optimizations. Porting NekBone to GPU systems required little effort and a small number of additional lines of code (approximately one OpenACC directive per 1000 lines of code). The naïve implementation using OpenACC leads to little performance improvement: on a single node, from 16 Gflops obtained with the version without OpenACC, we reached 20 Gflops with the naïve OpenACC implementation. An optimized NekBone version achieves 43 Gflops on a single node. In addition, we ported and optimized NekBone to parallel GPU systems, reaching a parallel efficiency of 79.9% on 1024 GPUs of the Titan XK7 supercomputer at the Oak Ridge National Laboratory.

  • 24. Mascellaro, L.
    et al.
    Axner, Lilit
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Gong, Jing
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
    Monotricat® hull, first displacement naval hull navigating at speeds of planing hulls, on spray self-produced, at high hydrodynamic efficiency and energy recovery. 2015. In: 18th International Conference on Ships and Shipping Research, NAV 2015, The European Marine Energy Centre Ltd, 2015, pp. 38-47. Conference paper (Refereed)
    Abstract [en]

    Since the 1950s, with the introduction of Nelson's first semi-planing hull, which made it possible to navigate with a certain ease at speeds higher than those of traditional hulls, and with the subsequent availability of more powerful engines, speeds corresponding to a Froude number (Fn) greater than 0.6, which defines planing hulls, have been reached. This created a clear distinction between displacement and planing hulls in terms of performance. The need for faster displacement vessels has pushed ship design towards increasingly high-performance hulls, also focusing on lightweight materials such as aluminum and on more powerful engines, but without substantially changing the traditional hull forms. The patented Monotricat hull, with its high hydrodynamic efficiency and energy saving, overcomes this distinction between displacement and planing hulls: unlike previous solutions, it is the first hull that combines the characteristics of displacement and planing hulls, since it has an innovative architecture that could be described as a hybrid between a monohull and a catamaran, navigating on its self-produced spray. This presentation will show how the Monotricat hull is the first displacement hull that can navigate at both displacement and planing speeds, with an almost straight resistance curve, while maintaining the characteristics of a displacement hull. For these reasons the Monotricat hull is able to ensure safety, comfortable navigation, the best seakeeping and maneuverability in restricted waters, stability, reduced resistance to motion, cost management, and regularity on routes even in adverse weather and sea conditions. These characteristics of the hull have been studied, tested and validated by leading research institutes and universities, with improving results in each subsequent experiment, as reported in the present work, which demonstrated a gain in hydrodynamic efficiency over conventional hulls approaching 20%.

  • 25. Nordström, Jan
    et al.
    Eriksson, Sofia
    Law, Craig
    Gong, Jing
    Uppsala Univ., IT dept.
    Shock and vortex calculations using a very high order accurate Euler and Navier-Stokes solver. 2009. In: Journal of Mechanics and MEMS, ISSN 0974-8407, Vol. 1, no. 1, pp. 19-26. Journal article (Refereed)
  • 26. Nordström, Jan
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    A Stable and Efficient Hybrid Method for Aeroacoustic Sound Generation and Propagation. 2003. In: Proc. Computational Aeroacoustics: From acoustic sources modeling to far-field radiated noise prediction, 2003. Conference paper (Refereed)
  • 27. Nordström, Jan
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    A stable and efficient hybrid method for aeroacoustic sound generation and propagation. 2005. In: Comptes rendus. Mecanique, ISSN 1631-0721, E-ISSN 1873-7234, Vol. 333, no. 9, pp. 713-718. Journal article (Refereed)
    Abstract [en]

    We discuss how to combine the node based unstructured finite volume method widely used to handle complex geometries and nonlinear phenomena with very efficient high order finite difference methods suitable for wave propagation dominated problems. This fully coupled numerical procedure reflects the coupled character of the sound generation and propagation problem. The coupling procedure is based on energy estimates and stability can be guaranteed. Numerical experiments using finite difference methods that shed light on the theoretical results are performed. To cite this article: J. Nordstrom, J. Gong, C R. Mecanique 333 (2005).

  • 28. Nordström, Jan
    et al.
    Gong, Jing
    Uppsala Univ.
    A stable hybrid method for hyperbolic problems. 2006. In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 212, no. 2, pp. 436-453. Journal article (Refereed)
    Abstract [en]

    A stable hybrid method for hyperbolic problems that combines the unstructured finite volume method with high-order finite difference methods has been developed. The coupling procedure is based on energy estimates and stability can be guaranteed. Numerical calculations verify that the hybrid method is efficient and accurate.

  • 29. Nordström, Jan
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    van der Weide, Edwin
    Svärd, Magnus
    A stable and conservative high order multi-block method for the compressible Navier-Stokes equations. 2009. In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 228, no. 24, pp. 9020-9035. Journal article (Refereed)
    Abstract [en]

    A stable and conservative high order multi-block method for the time-dependent compressible Navier-Stokes equations has been developed. Stability and conservation are proved using summation-by-parts operators, weak interface conditions and the energy method. This development makes it possible to exploit the efficiency of the high order finite difference method for non-trivial geometries. The computational results corroborate the theoretical analysis.

  • 30. Nordström, Jan
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    van der Weide, Edwin
    Svärd, Magnus
    A stable and conservative high order multi-block method for the compressible Navier-Stokes equations. 2009. Report (Other academic)
  • 31. Nordström, Jan
    et al.
    Ham, Frank
    Shoeybi, Mohammad
    van der Weide, Edwin
    Svärd, Magnus
    Mattsson, Ken
    Iaccarino, Gianluca
    Gong, Jing
    Department of Information Technology, Scientific Computing, Uppsala University.
    A Hybrid Method for Unsteady Fluid Flow. 2007. Report (Other academic)
    Abstract [en]

    We show how a stable and accurate hybrid procedure for fluid flow can be constructed. Two separate solvers, one using high order finite difference methods and another using the node-centered unstructured finite volume method are coupled in a truly stable way. The two flow solvers run independently and receive and send information from each other by using a third coupling code. Exact solutions to the Euler equations are used to verify the accuracy and stability of the new computational procedure. We also demonstrate the capability of the new procedure in a calculation of the flow in and around a model of a coral.

  • 32. Nordström, Jan
    et al.
    Ham, Frank
    Shoeybi, Mohammad
    van der Weide, Edwin
    Svärd, Magnus
    Mattsson, Ken
    Iaccarino, Gianluca
    Gong, Jing
    Uppsala Univ., IT dept.
    A hybrid method for unsteady inviscid fluid flow. 2009. In: Computers & Fluids, ISSN 0045-7930, E-ISSN 1879-0747, Vol. 38, no. 4, pp. 875-882. Journal article (Refereed)
    Abstract [en]

    We show how a stable and accurate hybrid procedure for fluid flow can be constructed. Two separate solvers, one using high order finite difference methods and another using the node-centered unstructured finite volume method are coupled in a truly stable way. The two flow solvers run independently and receive and send information from each other by using a third coupling code. Exact solutions to the Euler equations are used to verify the accuracy and stability of the new computational procedure. We also demonstrate the capability of the new procedure in a calculation of the flow in and around a model of a coral.

  • 33.
    Offermans, Nicolas
    et al.
    KTH, Skolan för teknikvetenskap (SCI), Mekanik. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Marin, O.
    Schanen, M.
    Gong, Jing
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Fischer, P.
    Schlatter, Philipp
    KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW. KTH, Skolan för teknikvetenskap (SCI), Mekanik.
    On the strong scaling of the spectral element solver Nek5000 on petascale systems. 2016. In: Proceedings of the 2016 Exascale Applications and Software Conference (EASC2016): April 25-29 2016, Stockholm, Sweden, Association for Computing Machinery (ACM), 2016, article id a5. Conference paper (Refereed)
    Abstract [en]

    The present work is targeted at performing a strong scaling study of the high-order spectral element fluid dynamics solver Nek5000. Prior studies such as [5] indicated a recommendable metric for strong scalability from a theoretical viewpoint, which we test here extensively on three parallel machines with different performance characteristics and interconnect networks, namely Mira (IBM Blue Gene/Q), Beskow (Cray XC40) and Titan (Cray XK7). The test cases considered for the simulations correspond to a turbulent flow in a straight pipe at four different friction Reynolds numbers Reτ = 180, 360, 550 and 1000. Considering the linear model for parallel communication we quantify the machine characteristics in order to better assess the scaling behaviors of the code. Subsequently sampling and profiling tools are used to measure the computation and communication times over a large range of compute cores. We also study the effect of the two coarse grid solvers XXT and AMG on the computational time. Super-linear scaling due to a reduction in cache misses is observed on each computer. The strong scaling limit is attained for roughly 5000 - 10,000 degrees of freedom per core on Mira, 30,000 - 50,000 on Beskow, with only a small impact of the problem size for both machines, and ranges between 10,000 and 220,000 depending on the problem size on Titan. This work aims at being a reference for Nek5000 users and also serves as a basis for potential issues to address as the community heads towards exascale supercomputers.

  • 34.
    Otero, Evelyn
    et al.
    KTH, Skolan för teknikvetenskap (SCI), Farkost och flyg, Aerodynamik. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Gong, Jing
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Min, Misun
    Fischer, Paul
    Schlatter, Philipp
    KTH, Skolan för teknikvetenskap (SCI), Mekanik. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    OpenACC acceleration for the PN-PN-2 algorithm in Nek5000. 2019. In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 132, pp. 69-78. Journal article (Refereed)
    Abstract [en]

    Due to its high performance and throughput capabilities, GPU-accelerated computing is becoming a popular technology in scientific computing, in particular using programming models such as CUDA and OpenACC. The main advantage with OpenACC is that it enables to simply port codes in their "original" form to GPU systems through compiler directives, thus allowing an incremental approach. An OpenACC implementation is applied to the CFD code Nek5000 for simulation of incompressible flows, based on the spectral-element method. The work follows up previous implementations and focuses now on the PN-PN-2 method for the spatial discretization of the Navier-Stokes equations. Performance results of the ported code show a speed-up of up to 3.1 on multi-GPU for a polynomial order N > 11.

  • 35.
    Otero, Evelyn
    et al.
    KTH, Skolan för teknikvetenskap (SCI), Farkost och flyg.
    Gong, Jing
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Min, Misun
    Argonne National Laboratory.
    Fischer, Paul
    Argonne National Laboratory.
    Schlatter, Philipp
    KTH, Skolan för teknikvetenskap (SCI), Mekanik. KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Laure, Erwin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST). KTH, Centra, SeRC - Swedish e-Science Research Centre.
    OpenACC accelerator for the Pn-Pn-2 algorithm in Nek5000. 2018. In: Proceedings of the 5th International Conference on Exascale Applications and Software, 2018. Conference paper (Refereed)
  • 36. Otten, Matthew
    et al.
    Gong, Jing
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Mametjanov, Azamat
    Vose, Aaron
    Levesque, John
    Fischer, Paul
    Min, Misun
    An MPI/OpenACC implementation of a high-order electromagnetics solver with GPUDirect communication. 2016. In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 30, no. 3, pp. 320-334. Journal article (Refereed)
    Abstract [en]

    We present performance results and an analysis of a message passing interface (MPI)/OpenACC implementation of an electromagnetic solver based on a spectral-element discontinuous Galerkin discretization of the time-dependent Maxwell equations. The OpenACC implementation covers all solution routines, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest neighbor flux exchanges. Modifications are designed to make effective use of vectorization, streaming, and data management. Performance results using up to 16,384 graphics processing units of the Cray XK7 supercomputer Titan show more than 2.5x speedup over central processing unit-only performance on the same number of nodes (262,144 MPI ranks) for problem sizes of up to 6.9 billion grid points. We discuss performance-enhancement strategies and the overall potential of GPU-based computing for this class of problems.

  • 37. Svärd, Magnus
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    Nordström, Jan
    An accuracy evaluation of unstructured node-centred finite volume methods. 2008. In: Applied Numerical Mathematics, ISSN 0168-9274, E-ISSN 1873-5460, Vol. 58, no. 8, pp. 1142-1158. Journal article (Refereed)
    Abstract [en]

    Node-centred edge-based finite volume approximations are very common in computational fluid dynamics since they are assumed to run on structured, unstructured and even on mixed grids. We analyse the accuracy properties of both first and second derivative approximations and conclude that these schemes cannot be used on arbitrary grids as is often assumed. For the Euler equations first-order accuracy can be obtained if care is taken when constructing the grid. For the Navier-Stokes equations, the grid restrictions are so severe that these finite volume schemes have little advantage over structured finite difference schemes. Our theoretical results are verified through extensive computations.

  • 38. Svärd, Magnus
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    Nordström, Jan
    Stable artificial dissipation operators for finite volume schemes on unstructured grids. 2006. In: Applied Numerical Mathematics, ISSN 0168-9274, E-ISSN 1873-5460, Vol. 56, no. 12, pp. 1481-1490. Journal article (Refereed)
    Abstract [en]

    Our objective is to derive stable first-, second- and fourth-order artificial dissipation operators for node based finite volume schemes. Of particular interest are general unstructured grids where the strength of the finite volume method is fully utilised. A commonly used finite volume approximation of the Laplacian will be the basis in the construction of the artificial dissipation. Both a homogeneous dissipation acting in all directions with equal strength and a modification that allows different amount of dissipation in different directions are derived. Stability and accuracy of the new operators are proved and the theoretical results are supported by numerical computations.

  • 39. Svärd, Magnus
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    Nordström, Jan
    An Accuracy Evaluation of Unstructured Node-Centred Finite Volume Methods. 2005. Report (Other academic)
  • 40. Svärd, Magnus
    et al.
    Gong, Jing
    Uppsala Univ., IT dept.
    Nordström, Jan
    Stable Artificial Dissipation Operators for Finite Volume Schemes on Unstructured Grids. 2005. Report (Other academic)
  • 41.
    Zhang, Mengmeng
    et al.
    KTH, Skolan för teknikvetenskap (SCI), Farkost och flyg.
    Gong, Jing
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Axner, Lilit
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    HPC-Enabled Aerodynamic Optimization Studies Using CFD and Design Suite SU2. 2020. In: Proceeding of the Work in Progress Session held in connection with the PDP 2020 Parallel, Distributed, and Network-Based Processing, 2020. Conference paper (Refereed)
  • 42.
    Zhang, Mengmeng
    et al.
    Airinnova AB, Stockholm, Sweden.
    Gong, Jing
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Axner, Lilit
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Barth, Michaela
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Automation of High-Fidelity CFD Analysis for Aircraft Design and Optimization Aided by HPC. 2020. In: Proceeding of 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Institute of Electrical and Electronics Engineers (IEEE), 2020, pp. 395-399. Conference paper (Refereed)
    Abstract [en]

    In this paper, an automation process to perform Reynolds-Averaged Navier-Stokes (RANS) computational fluid dynamics (CFD) analysis is developed to carry out aerodynamic design and optimization. The aircraft model/geometry is defined by a Common Parametric Aircraft Configuration Schema (CPACS) file, and the analyses are facilitated using high performance computers (HPC). As the computational capability of the available HPC systems is a limiting factor in the complexity of analyses that can be performed, a detailed performance analysis of the open source CFD code SU2 is undertaken and the profiling and performance analyses for large simulations are carried out.

  • 43.
    Zhang, Mengmeng
    et al.
    KTH.
    Melin, Tomas
    Gong, Jing
    KTH, Centra, SeRC - Swedish e-Science Research Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Barth, Michaela
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Axner, Lilit
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Parallelldatorcentrum, PDC.
    Mixed Fidelity Aerodynamic and Aero-Structural Optimization for Wings. 2018. In: 2018 International Conference on High Performance Computing & Simulation, 2018, pp. 476-483. Conference paper (Refereed)
    Abstract [en]

    Automatic multidisciplinary design optimization is one of the challenges that are faced in the processes involved in designing efficient wings for aircraft. In this paper we present mixed fidelity aerodynamic and aero-structural optimization methods for designing wings. A novel shape design methodology has been developed - it is based on a mix of the automatic aerodynamic optimization for a reference aircraft model, and the aero-structural optimization for an uninhabited air vehicle (UAV) with a high aspect ratio wing. This paper is a significant step towards making it possible to perform all the core processes for aerodynamic and aero-structural optimization that require special skills in a fully automatic manner - this covers all the processes from creating the mesh for the wing simulation to executing the high-fidelity computational fluid dynamics (CFD) analysis code. Our results confirm that the simulation tools can make it possible for a far broader range of engineering researchers and developers to design aircraft in much simpler and more efficient ways. This is a vital step in the evolution of wing design processes as it means that the extremely expensive laboratory experiments that were traditionally used when designing the wings can now be replaced with more cost-effective high performance computing (HPC) simulations that utilize accurate numerical methods.
