Design and Implementation of a Runtime System for Parallel Numerical Simulations on Large-Scale Clusters
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. ORCID iD: 0000-0002-5415-1248
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. ORCID iD: 0000-0001-9693-6265
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC. ORCID iD: 0000-0002-9901-9857
2011 (English). In: Proceedings of the International Conference on Computational Science (ICCS) / [ed] Sato, M.; Matsuoka, S.; Sloot, P. M. A.; Van Albada, G. D.; Dongarra, J. Elsevier, 2011, Vol. 4, pp. 2105-2114. Conference paper (Refereed)
Abstract [en]

The execution of scientific codes on new high-performance computing infrastructures introduces a number of new challenges and intensifies some old ones. Petascale computers are large systems with complex designs that use heterogeneous technologies, which makes the programming and porting of applications difficult, particularly if one wants to exploit the system's peak performance. In this paper we present the design and a first prototype of a runtime system for parallel numerical simulations on large-scale systems. The proposed runtime system addresses the challenges of performance, scalability, and programmability of large-scale HPC systems. We also present initial results from our prototype implementation using a molecular dynamics application kernel.
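The abstract does not give implementation details of the runtime's scheduler. Purely as an illustrative sketch (not the paper's design, and with hypothetical function and task names), a runtime that balances work across workers by tracking each worker's accumulated measured task cost could look like this:

```python
import heapq

def schedule(task_costs, n_workers):
    """Greedily assign each task to the currently least-loaded worker."""
    heap = [(0.0, w) for w in range(n_workers)]  # (accumulated cost, worker id)
    heapq.heapify(heap)
    assignment = {}
    for task, cost in task_costs.items():
        load, worker = heapq.heappop(heap)   # least-loaded worker so far
        assignment[task] = worker
        heapq.heappush(heap, (load + cost, worker))
    return assignment

# Three measured task costs (illustrative) mapped onto two workers.
mapping = schedule({"force_calc": 3.0, "neighbor_list": 2.0, "io": 2.0}, 2)
```

A real runtime would refresh the cost estimates from live measurements rather than assume them up front; the heap simply makes "pick the least-loaded worker" an O(log n) operation.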

Place, publisher, year, edition, pages
Elsevier, 2011. Vol. 4, pp. 2105-2114.
Procedia Computer Science, ISSN 1877-0509; 4
Keyword [en]
Hybrid computational methods, Parallel computing, Advanced computing architectures, Runtime systems
National Category
Computer Science
URN: urn:nbn:se:kth:diva-38886
DOI: 10.1016/j.procs.2011.04.230
ISI: 000299165200229
ScopusID: 2-s2.0-79958278307
OAI: diva2:438320
11th International Conference on Computational Science, ICCS 2011, Singapore, 1-3 June 2011
Swedish e‐Science Research Center
QC 20120110. Available from: 2011-09-02 Created: 2011-09-02 Last updated: 2015-05-08. Bibliographically approved
In thesis
1. Towards Scalable Performance Analysis of MPI Parallel Applications
2015 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

A considerable fraction of scientific discovery nowadays relies on computer simulations. High Performance Computing (HPC) provides scientists with the means to simulate processes ranging from climate modeling to protein folding. However, achieving good application performance and making optimal use of HPC resources is a demanding task due to the complexity of parallel software. Therefore, performance tools and runtime systems that help users execute applications in the most efficient way are of utmost importance in the HPC landscape. In this thesis, we explore different techniques to tackle the challenges of collecting, storing, and using fine-grained performance data. First, we investigate the automatic use of real-time performance data to run applications in an optimal way. To that end, we present a prototype of an adaptive task-based runtime system that uses real-time performance data for task scheduling. This runtime system has a performance monitoring component that provides real-time access to the performance behavior of an application while it runs. The implementation of this monitoring component is presented and evaluated within this thesis. Second, we explore lossless compression approaches for MPI monitoring. One of the main problems performance tools face is the huge amount of fine-grained data that can be generated from an instrumented application. Collecting fine-grained data from a program is the best method to uncover the root causes of performance bottlenecks; however, it is infeasible with extremely parallel applications or applications with long execution times. On the other hand, collecting coarse-grained data is scalable but sometimes not enough to discern the root cause of a performance problem. Thus, we propose a new method for performance monitoring of MPI programs using event flow graphs. Event flow graphs incur very low overhead in terms of execution time and storage size, and can be used to reconstruct fine-grained trace files of application events ordered in time.
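The abstract does not specify how the graphs are constructed. As a minimal illustrative sketch (all names hypothetical, and deliberately simpler than the thesis's method), an event flow graph can be built by recording each distinct MPI event as a node and counting transitions between consecutive events as weighted edges, so a long, repetitive event stream collapses into a small graph:

```python
from collections import defaultdict

def build_event_flow_graph(events):
    """Collapse an ordered event stream into weighted transition edges."""
    edges = defaultdict(int)
    for src, dst in zip(events, events[1:]):
        edges[(src, dst)] += 1  # repeated patterns share one edge
    return dict(edges)

# A short per-rank stream of MPI calls (illustrative only).
stream = ["MPI_Init", "MPI_Send", "MPI_Recv",
          "MPI_Send", "MPI_Recv", "MPI_Finalize"]
graph = build_event_flow_graph(stream)
# The repeated Send -> Recv pattern is stored once with weight 2,
# which is the source of the compression the abstract describes.
```

Reconstructing a time-ordered trace, as the thesis describes, requires additional sequencing information on the edges beyond the bare counts shown here.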

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2015. viii, 39 p.
TRITA-CSC-A, ISSN 1653-5723 ; 2015:05
Keyword [en]
parallel computing, performance monitoring, performance tools, event flow graphs
National Category
Computer Systems
Research subject
Computer Science
urn:nbn:se:kth:diva-165043 (URN)
978-91-7595-518-6 (ISBN)
2015-05-20, The Visualization Studio, room 4451, Lindstedtsvägen 5, KTH, Stockholm, 10:00 (English)

QC 20150508

Available from: 2015-05-08 Created: 2015-04-21 Last updated: 2015-05-08. Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text
Scopus

Search in DiVA

By author/editor
Schliephake, Michael; Aguilar, Xavier; Laure, Erwin
By organisation
Centre for High Performance Computing, PDC
Computer Science
