Rezaei, Mohammadtaghi
Publications (2 of 2)
Atzori, M., Köpp, W., Chien, W. D., Massaro, D., Mallor, F., Peplinski, A., . . . Weinkauf, T. (2022). In situ visualization of large-scale turbulence simulations in Nek5000 with ParaView Catalyst. Journal of Supercomputing, 78(3), 3605-3620
In situ visualization of large-scale turbulence simulations in Nek5000 with ParaView Catalyst
2022 (English) In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 78, no 3, p. 3605-3620. Article in journal (Refereed) Published
Abstract [en]

In situ visualization on high-performance computing systems makes it possible to analyze simulation results that could not otherwise be handled, given the size of the simulation data sets and the execution time of offline post-processing. We develop an in situ adaptor for ParaView Catalyst and Nek5000, a massively parallel Fortran and C code for computational fluid dynamics. We perform a strong scalability test up to 2048 cores on KTH's Beskow Cray XC40 supercomputer and assess the impact of in situ visualization on Nek5000 performance. In our case study, a high-fidelity simulation of turbulent flow, we observe that in situ operations significantly limit the strong scalability of the code, reducing the relative parallel efficiency to only ≈ 21 % on 2048 cores (the relative efficiency of Nek5000 without in situ operations is ≈ 99 %). Through profiling with Arm MAP, we identified a bottleneck in the image composition step (which uses the Radix-kr algorithm), where most of the time is spent on MPI communication. We also identified an imbalance in in situ processing time between rank 0 and all other ranks. In our case, better scaling and load balancing in the parallel image composition would considerably improve the performance of Nek5000 with in situ capabilities. More generally, the results of this study highlight the technical challenges posed by integrating high-performance simulation codes with data-analysis libraries and by their practical use in complex cases, even when efficient algorithms already exist for a given application scenario.
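The adaptor described in the abstract lives on the Nek5000 (Fortran/C) side, but the in situ pipeline that legacy ParaView Catalyst executes is typically described in a Python co-processing script that the adaptor calls every few time steps. The sketch below shows the general shape of such a script; the producer name "input", the output frequency, and the slice filter are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of a legacy ParaView Catalyst co-processing script (not the
# authors' actual pipeline). A simulation-side adaptor calls
# RequestDataDescription() and DoCoProcessing() at each time step.
from paraview.simple import *
from paraview import coprocessing


def CreateCoProcessor():
    def _CreatePipeline(coprocessor, datadescription):
        class Pipeline:
            # Grid published by the simulation adaptor under the name "input"
            # (assumed name; the adaptor chooses it).
            grid = coprocessor.CreateProducer(datadescription, "input")
            # Example in situ operation: a planar slice through the flow field.
            slice1 = Slice(Input=grid)
            slice1.SliceType.Normal = [0.0, 0.0, 1.0]
        return Pipeline()

    class CoProcessor(coprocessing.CoProcessor):
        def CreatePipeline(self, datadescription):
            self.Pipeline = _CreatePipeline(self, datadescription)

    coprocessor = CoProcessor()
    # Run the pipeline every 10 simulation steps (assumed frequency).
    coprocessor.SetUpdateFrequencies({"input": [10]})
    return coprocessor


coprocessor = CreateCoProcessor()
coprocessor.EnableLiveVisualization(False)


def RequestDataDescription(datadescription):
    """Called by the adaptor to ask which meshes/fields are needed this step."""
    coprocessor.LoadRequestedData(datadescription)


def DoCoProcessing(datadescription):
    """Called by the adaptor when output is due: update producers, write output."""
    coprocessor.UpdateProducers(datadescription)
    coprocessor.WriteData(datadescription)
    coprocessor.WriteImages(datadescription, rescale_lookuptable=False)
```

In a rendering pipeline with views attached, the image-writing step triggers the parallel image composition that the paper identifies as the scalability bottleneck.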

Place, publisher, year, edition, pages
Springer, 2022
Keywords
Computational fluid dynamics, High-performance computing, In situ visualization, Catalysts, Data visualization, Efficiency, Image enhancement, Scalability, Supercomputers, Visualization, Application scenario, High performance computing systems, High-fidelity simulations, High-performance simulation, Large scale turbulence, Parallel efficiency, Relative efficiency, Technical challenges, In situ processing
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-311178 (URN), 10.1007/s11227-021-03990-3 (DOI), 000680293400003 (ISI), 35210696 (PubMedID), 2-s2.0-85111797526 (Scopus ID)
Note

QC 20220502

Available from: 2022-05-02 Created: 2022-05-02 Last updated: 2024-01-19. Bibliographically approved
Souza, A., Rezaei, M., Laure, E. & Tordsson, J. (2019). Hybrid Resource Management for HPC and Data Intensive Workloads. In: 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). Paper presented at the 19th Annual IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGRID), May 14-17, 2019, Larnaca, Cyprus (pp. 399-409). IEEE
Hybrid Resource Management for HPC and Data Intensive Workloads
2019 (English) In: 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), IEEE, 2019, p. 399-409. Conference paper, Published paper (Refereed)
Abstract [en]

High Performance Computing (HPC) and Data Intensive (DI) workloads have traditionally been executed on separate clusters using different tools for resource and application management. With increasing convergence, where modern applications are composed of both types of jobs in complex workflows, this separation becomes a growing overhead and the need for a common platform increases. Executing both workload classes on the same clusters not only enables hybrid workflows, but can also increase system efficiency, as the available hardware is often not fully utilized by applications. While HPC systems are typically managed in a coarse-grained fashion, with exclusive resource allocations, DI systems employ a finer-grained regime, enabling dynamic allocation and control based on application needs. On the path to full convergence, a useful and less intrusive step is a hybrid resource management system that allows the execution of DI applications on top of standard HPC scheduling systems. In this paper, we present the architecture of a hybrid system enabling dual-level scheduling for DI jobs on HPC infrastructures. Our system takes advantage of real-time resource profiling to efficiently co-schedule HPC and DI applications. The architecture is easily extensible to current and new types of distributed applications, allowing an efficient combination of hybrid workloads on HPC resources with increased job throughput and higher overall resource utilization. The implementation is based on the Slurm and Mesos resource managers for HPC and DI jobs, respectively. Experimental evaluations in a real cluster, based on a set of representative HPC and DI applications, demonstrate that our hybrid architecture improves resource utilization by 20%, with a 12% decrease in queue makespan, while still meeting all deadlines for HPC jobs.
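To make the dual-level idea concrete, the sketch below shows one simple way of bootstrapping data-intensive capacity inside an HPC allocation: a Slurm batch job whose payload starts one Mesos agent per allocated node, so a Mesos master can then schedule DI frameworks onto that hardware. This is a minimal illustration under assumed parameters, not the system architecture presented in the paper; the master address, work directory, and job sizes are placeholders.

```python
# Minimal sketch (assumptions, not the paper's implementation): request nodes
# from Slurm and run one Mesos agent per node inside the allocation.
import subprocess

MESOS_MASTER = "mesos-master.example.org:5050"  # hypothetical master address


def submit_mesos_agents(nodes: int, walltime: str = "01:00:00") -> str:
    """Submit a Slurm job that runs one Mesos agent per allocated node."""
    payload = (
        f"srun --ntasks={nodes} --ntasks-per-node=1 "
        f"mesos-agent --master={MESOS_MASTER} --work_dir=/tmp/mesos-agent"
    )
    result = subprocess.run(
        ["sbatch", f"--nodes={nodes}", f"--time={walltime}", "--parsable",
         f"--wrap={payload}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()  # Slurm job id of the agent allocation


if __name__ == "__main__":
    print("Mesos agents submitted as Slurm job", submit_mesos_agents(nodes=4))
```

In such a setup, Slurm keeps its coarse-grained, exclusive allocations while Mesos handles the fine-grained placement of DI tasks within them, which is the division of labor the abstract describes.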

Place, publisher, year, edition, pages
IEEE, 2019
Series
IEEE-ACM International Symposium on Cluster Cloud and Grid Computing, ISSN 2376-4414
Keywords
Resource Management, High Performance Computing, Data Intensive Computing, Mesos, Slurm, Bootstrapping
National Category
Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-260219 (URN), 10.1109/CCGRID.2019.00054 (DOI), 000483058700045 (ISI), 2-s2.0-85069469164 (Scopus ID)
Conference
19th Annual IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGRID), May 14-17, 2019, Larnaca, Cyprus
Note

QC 20190930

Part of ISBN 978-1-7281-0912-1

Available from: 2019-09-30 Created: 2019-09-30 Last updated: 2024-10-25. Bibliographically approved