Publications (5 of 5)
Simmendinger, C., Iakymchuk, R., Cebamanos, L., Akhmetova, D., Bartsch, V., Rotaru, T., . . . Markidis, S. (2019). Interoperability strategies for GASPI and MPI in large-scale scientific applications. The International Journal of High Performance Computing Applications, 33(3), 554-568.
2019 (English). In: The International Journal of High Performance Computing Applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 33, no. 3, p. 554-568. Article in journal (Refereed). Published.
Abstract [en]

One of the main hurdles for partitioned global address space (PGAS) approaches is the dominance of the message passing interface (MPI), which as a de facto standard appears in the code base of many applications. To take advantage of PGAS APIs such as the global address space programming interface (GASPI) without a major change to that code base, interoperability between MPI and PGAS approaches needs to be ensured. In this article, we consider an interoperable GASPI/MPI implementation for the performance-critical communication parts of the Ludwig and iPIC3D applications. To address the discovered performance limitations, we develop a novel strategy that significantly improves performance and interoperability between both APIs by leveraging GASPI shared windows and shared notifications. First results with a corresponding implementation in the MiniGhost proxy application and the Allreduce collective operation demonstrate the viability of this approach.
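
For readers who want to see what GASPI/MPI interoperability looks like in practice, the sketch below shows a plain GPI-2 notified write coexisting with MPI in one program: MPI starts the processes, GASPI performs a one-sided halo-style exchange, and MPI remains available for collectives afterwards. This is a minimal illustration under assumptions of our own (segment size, ring neighbours, queue 0); it uses only standard GASPI calls and does not show the shared-window/shared-notification extension developed in the article.

/* Minimal sketch (not the article's implementation): standard GASPI
 * notified writes running alongside MPI in the same processes.
 * Error checking is omitted for brevity. */
#include <mpi.h>
#include <GASPI.h>
#include <string.h>

#define SEG_ID     0        /* illustrative segment id */
#define HALO_BYTES 1024     /* illustrative halo size  */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);             /* MPI first ...         */
    gaspi_proc_init(GASPI_BLOCK);       /* ... then GASPI on top */

    gaspi_rank_t rank, nprocs;
    gaspi_proc_rank(&rank);
    gaspi_proc_num(&nprocs);

    /* One RDMA segment holds the outgoing (offset 0) and incoming
       (offset HALO_BYTES) halves of a halo buffer. */
    gaspi_segment_create(SEG_ID, 2 * HALO_BYTES, GASPI_GROUP_ALL,
                         GASPI_BLOCK, GASPI_MEM_INITIALIZED);

    gaspi_pointer_t seg_ptr;
    gaspi_segment_ptr(SEG_ID, &seg_ptr);
    memset(seg_ptr, (int)rank, HALO_BYTES);   /* fake halo payload */

    gaspi_rank_t right = (gaspi_rank_t)((rank + 1) % nprocs);

    /* One-sided notified write: payload and completion flag arrive
       together at the right-hand neighbour. */
    gaspi_write_notify(SEG_ID, 0, right, SEG_ID, HALO_BYTES, HALO_BYTES,
                       /* notification id */ 0, /* value */ 1,
                       /* queue */ 0, GASPI_BLOCK);

    /* Wait for the halo coming from the left neighbour instead of
       posting a matching MPI receive. */
    gaspi_notification_id_t got;
    gaspi_notification_t   val;
    gaspi_notify_waitsome(SEG_ID, 0, 1, &got, GASPI_BLOCK);
    gaspi_notify_reset(SEG_ID, got, &val);

    gaspi_wait(0, GASPI_BLOCK);         /* flush the GASPI queue  */
    MPI_Barrier(MPI_COMM_WORLD);        /* MPI still usable here  */

    gaspi_proc_term(GASPI_BLOCK);
    MPI_Finalize();
    return 0;
}

The pattern is what makes notified writes attractive for halo exchange: the receiver learns about data arrival through the notification alone, without posting a matching receive.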

Place, publisher, year, edition, pages
SAGE Publications Ltd, 2019
Keywords
Interoperability, GASPI, MPI, iPIC3D, Ludwig, MiniGhost, halo exchange, Allreduce
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-254034 (URN), 10.1177/1094342018808359 (DOI), 000468919900011 (), 2-s2.0-85059353725 (Scopus ID)
Note

QC 20190814

Available from: 2019-08-14. Created: 2019-08-14. Last updated: 2019-08-14. Bibliographically approved.
Akhmetova, D., Cebamanos, L., Iakymchuk, R., Rotaru, T., Rahn, M., Markidis, S., . . . Simmendinger, C. (2018). Interoperability of GASPI and MPI in large scale scientific applications. In: 12th International Conference on Parallel Processing and Applied Mathematics, PPAM 2017. Paper presented at PPAM 2017, 10-13 September 2017 (pp. 277-287). Springer Verlag.
2018 (English). In: 12th International Conference on Parallel Processing and Applied Mathematics, PPAM 2017, Springer Verlag, 2018, p. 277-287. Conference paper, Published paper (Refereed).
Abstract [en]

One of the main hurdles to a broad adoption of PGAS approaches is the prevalence of MPI, which as a de facto standard appears in the code base of many applications. To take advantage of PGAS APIs such as GASPI without a major change to that code base, interoperability between MPI and PGAS approaches needs to be ensured. In this article, we address this challenge by presenting our study and preliminary performance results on interoperating GASPI and MPI in the performance-critical parts of the Ludwig and iPIC3D applications. In addition, we outline a strategy for better coupling of both APIs.

Place, publisher, year, edition, pages
Springer Verlag, 2018
Keywords
GASPI, Halo exchange, Interoperability, iPIC3D, Ludwig, MPI, Artificial intelligence, Computer science, Computers, De facto standard, Preliminary performance results, Scientific applications
National Category
Mathematics
Identifiers
urn:nbn:se:kth:diva-227469 (URN), 10.1007/978-3-319-78054-2_26 (DOI), 000458563900026 (), 2-s2.0-85044787063 (Scopus ID), 9783319780535 (ISBN)
Conference
PPAM 2017, 10-13 September 2017
Note

QC 20180521

Available from: 2018-05-21. Created: 2018-05-21. Last updated: 2019-03-05. Bibliographically approved.
Ivanov, I., Machado, R., Rahn, M., Akhmetova, D., Laure, E., Gong, J., . . . Markidis, S. (2015). Evaluating New Communication Models in the Nek5000 Code for Exascale. Paper presented at EASC2015. Epigram.
2015 (English). Conference paper, Oral presentation with published abstract (Other academic).
Place, publisher, year, edition, pages
Epigram, 2015
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-181105 (URN)
Conference
EASC2015
Note

QC 20160129

Available from: 2016-01-29. Created: 2016-01-29. Last updated: 2016-01-29. Bibliographically approved.
Ivanov, I., Gong, J., Akhmetova, D., Peng, I. B., Markidis, S., Laure, E., . . . Fischer, P. (2015). Evaluation of Parallel Communication Models in Nekbone, a Nek5000 mini-application. In: 2015 IEEE International Conference on Cluster Computing. Paper presented at IEEE Cluster 2015 (pp. 760-767). IEEE.
2015 (English). In: 2015 IEEE International Conference on Cluster Computing, IEEE, 2015, p. 760-767. Conference paper, Published paper (Refereed).
Abstract [en]

Nekbone is a proxy application of Nek5000, a scalable computational fluid dynamics (CFD) code used for modelling incompressible flows. The Nekbone mini-application is used by several international co-design centers to explore new concepts in computer science and to evaluate their performance. We present the design and implementation of a new communication kernel in the Nekbone mini-application with the goal of studying the performance of different parallel communication models. First, a new MPI blocking communication kernel was developed to solve Nekbone problems on a three-dimensional Cartesian mesh and process topology. The new MPI implementation delivers a 13% performance improvement compared to the original implementation. The new MPI communication kernel consists of approximately 500 lines of code against the original 7,000 lines of code, allowing experimentation with new approaches to Nekbone parallel communication. Second, the MPI blocking communication in the new kernel was changed to MPI non-blocking communication. Third, we developed a new partitioned global address space (PGAS) communication kernel based on the GPI-2 library. This approach reduces the synchronization among neighbor processes; in our tests on 8,192 processes, the GPI-2 communication kernel is on average 3% faster than the new MPI non-blocking kernel. In addition, we used OpenMP in all versions of the new communication kernel. Finally, we highlight the future steps for using the new communication kernel in the parent application Nek5000.
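
To make the kind of kernel described above concrete, the sketch below posts a non-blocking halo exchange over a three-dimensional Cartesian process topology with plain MPI. It is a generic illustration rather than the Nekbone kernel itself; the face size N, the tag scheme, and the periodic boundaries are assumptions made for the example.

/* Generic sketch (not the Nekbone code): non-blocking halo exchange
 * on a 3-D Cartesian process topology. */
#include <mpi.h>

#define N 32   /* points per face, illustrative */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Build the 3-D Cartesian topology; periodicity is assumed here. */
    int dims[3] = {0, 0, 0}, periods[3] = {1, 1, 1};
    MPI_Dims_create(nprocs, 3, dims);

    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, /* reorder */ 1, &cart);

    double sendbuf[6][N], recvbuf[6][N];
    for (int f = 0; f < 6; ++f)
        for (int i = 0; i < N; ++i)
            sendbuf[f][i] = (double)f;      /* placeholder face data */

    MPI_Request reqs[12];
    int nreq = 0;

    for (int dim = 0; dim < 3; ++dim) {
        int lo, hi;
        MPI_Cart_shift(cart, dim, 1, &lo, &hi);

        /* Tag 2*dim marks messages travelling towards lower coordinates,
           tag 2*dim+1 those travelling towards higher coordinates. */
        MPI_Irecv(recvbuf[2*dim],     N, MPI_DOUBLE, lo, 2*dim + 1, cart, &reqs[nreq++]);
        MPI_Irecv(recvbuf[2*dim + 1], N, MPI_DOUBLE, hi, 2*dim,     cart, &reqs[nreq++]);
        MPI_Isend(sendbuf[2*dim],     N, MPI_DOUBLE, lo, 2*dim,     cart, &reqs[nreq++]);
        MPI_Isend(sendbuf[2*dim + 1], N, MPI_DOUBLE, hi, 2*dim + 1, cart, &reqs[nreq++]);
    }

    /* Independent computation could overlap with communication here. */
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}

A GPI-2 variant of the same exchange would replace the send/receive pairs with one-sided notified writes, which is where the reduced synchronization among neighbor processes reported in the abstract comes from.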

Place, publisher, year, edition, pages
IEEE, 2015
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-181104 (URN), 10.1109/CLUSTER.2015.131 (DOI), 000378648100121 (), 2-s2.0-84959298440 (Scopus ID)
Conference
IEEE Cluster 2015
Note

QC 20160205

Available from: 2016-01-29. Created: 2016-01-29. Last updated: 2018-01-10. Bibliographically approved.
Markidis, S., Vencels, J., Peng, I. B., Akhmetova, D., Laure, E. & Henri, P. (2015). Idle waves in high-performance computing. Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, 91(1), 013306
2015 (English). In: Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, ISSN 1539-3755, E-ISSN 1550-2376, Vol. 91, no. 1, p. 013306. Article in journal (Refereed). Published.
Abstract [en]

The vast majority of parallel scientific applications distribute computation among processes that are in a busy state when computing and in an idle state when waiting for information from other processes. We identify the propagation of idle waves through the processes of scientific applications that rely on local information exchange between neighboring processes. Idle waves are nondispersive and have a phase velocity inversely proportional to the average busy time. The physical mechanism enabling the propagation of idle waves is the local synchronization between two processes due to remote data dependency. This study describes the large number of processes in parallel scientific applications as a continuous medium. This work is also a step towards an understanding of how localized idle periods can affect remote processes, leading to the degradation of global performance in parallel scientific applications.
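
Written schematically (our notation, not the paper's), the two quantitative statements in the abstract are that the idle wave is nondispersive and that its phase velocity scales inversely with the mean busy time:

\[
  v_{\mathrm{idle}} \;\propto\; \frac{1}{\langle t_{\mathrm{busy}} \rangle},
  \qquad
  \omega(k) = v_{\mathrm{idle}}\, k ,
\]

where the wave propagates along the process index, so v_idle is measured in process ranks per unit time, ⟨t_busy⟩ is the average busy interval per process, and the linear dispersion relation ω(k) expresses that the phase velocity does not depend on the wavenumber k.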

Keywords
Continuous medium, Global performance, High performance computing, Local information, Local synchronizations, Parallel scientific applications, Physical mechanism, Scientific applications
National Category
Physical Sciences
Identifiers
urn:nbn:se:kth:diva-160745 (URN), 10.1103/PhysRevE.91.013306 (DOI), 000348330600020 (), 25679738 (PubMedID), 2-s2.0-84921638180 (Scopus ID)
Note

QC 20150302

Available from: 2015-03-02. Created: 2015-02-27. Last updated: 2017-12-04. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0003-1603-5294
