Change search
ReferencesLink to record
Permanent link

Direct link
A Data Parallel Implementation of an Explicit Method for the Compressible Navier– Stokes Equations for Three–Dimensional Channel Flow
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. (Parallelldatorcentrum)
1990 (English)In: Parallel Computing, ISSN 0167-8191, Vol. 14, no 1, 1-30 p.Article in journal (Refereed) Published
Abstract [en]

The fluid flow in a three-dimensional twisted channel is modeled by both the compressible Navier-Stokes equations, and the Euler equations. A three stage Runge-Kutta method is used for integrating the system of equations in time. A second-order accurate, centered difference scheme is used for spatial derivatives of the flux variables. For both the Euler and the Navier-Stokes equations artificial viscosity introduced through fourth-order centered differences is used to stabilize the numeric scheme. By using lower order difference approximations on or close to the boundary than in the interior, the difference stencils can be evaluated at all grid points concurrently. A few different difference molecules for the boundaries, and different factorizations of the fourth-order difference operators were evaluated. With the appropriate factorization of the difference stencils, six variables per lattice point suffice for the evaluation of the difference stencils occurring in the code. The three fourth-order stencils we investigated, including three different factorizations of one of these stencils, account for three out these six variables. The convergence rate for all stencils and their factorizations is approximately the same for the first 1000–1500 steps at which point the residual has reached a value of 10−2–10−3. From this point on the convergence rate for one of the factorizations of the fourth-order stencil is approximately twice that of one of the unfactored stencils.

A performance of 1.05 Gflops/s was demonstrated on 65 536 processor Connection Machine system with 512 Mbytes of primary storage. The performance scales in proportion to the number of processors. The performance on 8k processor configurations was 135 Mflops/s, on 16k processors 265 Mflops/s and 525 Mflops/s on 32k processors. The efficiency is independent of the machine size. The evaluation of the boundary conditions accounted for less than 5% of the total time. A performance improvement by a factor of about three is expected with optimized implementations of functional kernels such as convolution, and matrix-vector multiplication.

Place, publisher, year, edition, pages
1990. Vol. 14, no 1, 1-30 p.
Keyword [en]
Computational fluid dynamics; Navier-Stokes equations; Connection Machine; performance analysis; execution times
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-91012DOI: 10.1016/0167-8191(90)90093-OOAI: diva2:507679
NR 20140805Available from: 2012-03-05 Created: 2012-03-05 Last updated: 2012-03-05Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text

Search in DiVA

By author/editor
Johnsson, Lennart
By organisation
Centre for High Performance Computing, PDC
In the same journal
Parallel Computing
Computer and Information Science

Search outside of DiVA

GoogleGoogle Scholar
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Altmetric score

Total: 17 hits
ReferencesLink to record
Permanent link

Direct link