A Data Parallel Implementation of an Explicit Method for the Compressible Navier–Stokes Equations for Three–Dimensional Channel Flow
1990 (English). In: Parallel Computing, ISSN 0167-8191, Vol. 14, no 1, 1-30 p. Article in journal (Refereed), Published
The fluid flow in a three-dimensional twisted channel is modeled by both the compressible Navier-Stokes equations and the Euler equations. A three-stage Runge-Kutta method is used for integrating the system of equations in time. A second-order accurate, centered difference scheme is used for spatial derivatives of the flux variables. For both the Euler and the Navier-Stokes equations, artificial viscosity introduced through fourth-order centered differences is used to stabilize the numerical scheme. By using lower-order difference approximations on or close to the boundary than in the interior, the difference stencils can be evaluated at all grid points concurrently. A few different difference molecules for the boundaries and different factorizations of the fourth-order difference operators were evaluated. With the appropriate factorization of the difference stencils, six variables per lattice point suffice for the evaluation of the difference stencils occurring in the code. The three fourth-order stencils we investigated, including three different factorizations of one of these stencils, account for three out of these six variables. The convergence rate for all stencils and their factorizations is approximately the same for the first 1000–1500 steps, at which point the residual has reached a value of 10^-2 to 10^-3. From this point on, the convergence rate for one of the factorizations of the fourth-order stencil is approximately twice that of one of the unfactored stencils.
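The ingredients named in the abstract (three-stage Runge-Kutta time stepping, second-order centered differences, and fourth-order artificial viscosity in factored form) can be illustrated on a much simpler model problem. The sketch below is not the authors' code: it assumes one-dimensional periodic linear advection, a generic three-stage scheme with stage coefficients 1/3, 1/2, 1, a dissipation coefficient `eps` chosen for illustration, and one possible factorization of the fourth difference as two applications of the second difference. All function and parameter names are hypothetical. Periodic boundaries are assumed so that one stencil applies at every point; the lower-order boundary stencils described above are not modeled.

```python
import math

def second_difference(u):
    """Periodic second difference: u[i-1] - 2*u[i] + u[i+1]."""
    n = len(u)
    return [u[i - 1] - 2.0 * u[i] + u[(i + 1) % n] for i in range(n)]

def fourth_difference_unfactored(u):
    """Periodic five-point fourth difference with weights 1, -4, 6, -4, 1."""
    n = len(u)
    return [u[i - 2] - 4.0 * u[i - 1] + 6.0 * u[i]
            - 4.0 * u[(i + 1) % n] + u[(i + 2) % n] for i in range(n)]

def fourth_difference_factored(u):
    """Same operator, factored as the second difference applied twice."""
    return second_difference(second_difference(u))

def residual(u, a, dx, eps):
    """Right-hand side of u_t = -a*u_x - eps*dx^3*u_xxxx (assumed model)."""
    n = len(u)
    d4 = fourth_difference_factored(u)
    return [-a * (u[(i + 1) % n] - u[i - 1]) / (2.0 * dx) - eps * d4[i] / dx
            for i in range(n)]

def rk3_step(u, dt, a, dx, eps):
    """One three-stage step; every stage restarts from the old solution u."""
    stage = u
    for alpha in (1.0 / 3.0, 0.5, 1.0):  # assumed stage coefficients
        r = residual(stage, a, dx, eps)
        stage = [u[i] + alpha * dt * r[i] for i in range(len(u))]
    return stage

def run_demo(steps=100, n=64):
    """Advect one sine wave on a periodic grid and return the final state."""
    dx = 1.0 / n
    a, eps = 1.0, 0.02          # illustrative wave speed and dissipation
    dt = 0.5 * dx / a           # CFL number 0.5, chosen conservatively
    u = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]
    for _ in range(steps):
        u = rk3_step(u, dt, a, dx, eps)
    return u

if __name__ == "__main__":
    final = run_demo()
    print("max |u| after stepping:", max(abs(v) for v in final))
```

Each list comprehension touches every grid point with the same stencil, which is the property that lets a data parallel machine evaluate all points concurrently. The factored form needs only the three-point second difference as a building block, at the cost of one intermediate array.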
A performance of 1.05 Gflops/s was demonstrated on a 65,536-processor Connection Machine system with 512 Mbytes of primary storage. The performance scales in proportion to the number of processors. The performance was 135 Mflops/s on 8k-processor configurations, 265 Mflops/s on 16k processors, and 525 Mflops/s on 32k processors. The efficiency is independent of the machine size. The evaluation of the boundary conditions accounted for less than 5% of the total time. A performance improvement by a factor of about three is expected with optimized implementations of functional kernels such as convolution and matrix-vector multiplication.
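The scaling claim can be checked with simple arithmetic on the reported figures. The sketch below assumes that 8k, 16k, and 32k denote 8192, 16384, and 32768 processors and that 1.05 Gflops/s equals 1050 Mflops/s; the function names are hypothetical. It computes the per-processor rate and the efficiency relative to the smallest configuration.

```python
# Reported aggregate rates in Mflops/s, keyed by processor count.
# The processor counts for "8k", "16k", "32k" are assumed powers of two.
reported_mflops = {8192: 135.0, 16384: 265.0, 32768: 525.0, 65536: 1050.0}

def per_processor_kflops(table):
    """Aggregate rate divided by processor count, in kflops/s per processor."""
    return {p: 1000.0 * rate / p for p, rate in table.items()}

def relative_efficiency(table):
    """Per-processor rate relative to the smallest configuration in the table."""
    base_p = min(table)
    base_rate = table[base_p] / base_p
    return {p: (rate / p) / base_rate for p, rate in table.items()}

if __name__ == "__main__":
    rates = per_processor_kflops(reported_mflops)
    effs = relative_efficiency(reported_mflops)
    for p in sorted(reported_mflops):
        print(p, round(rates[p], 2), round(effs[p], 3))
```

Under these assumptions the per-processor rate stays near 16 kflops/s across all four configurations, and the relative efficiency stays within about 3% of the 8k baseline, which is consistent with the statement that efficiency is independent of machine size.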
Computational fluid dynamics; Navier-Stokes equations; Connection Machine; performance analysis; execution times
Computer and Information Science
Identifiers: URN: urn:nbn:se:kth:diva-91012; DOI: 10.1016/0167-8191(90)90093-O; OAI: oai:DiVA.org:kth-91012; DiVA: diva2:507679
NR 20140805. 2012-03-05, 2012-03-05, 2012-03-05. Bibliographically approved