Cooperative user- and system-level scheduling of task-centric parallel programs
Varisteas, Georgios
KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS. (Multicore Center)
ORCID iD: 0000-0002-7860-6593
2013 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Emerging architecture designs include tens of processing cores on a single chip die; the number of cores is expected to reach the hundreds within a few years. However, the parallelism that most common workloads expose fluctuates and is insufficient to utilize such systems. The combination of these issues suggests that large-scale systems will either be multiprogrammed or have their unneeded resources powered off. To make this possible, workloads must be able to provide a metric of their parallelism, which the system can use to dynamically adapt per-application resource allotments.

Adaptive resource management requires scheduling abstractions to be split into two cooperating layers: the system layer, which is aware of the availability of resources, and the application layer, which can accurately and iteratively estimate the workload's true resource requirements.

This thesis addresses these issues and provides a self-adapting work-stealing scheduling method that can achieve the expected performance while conserving resources. The method is based on deterministic victim selection (DVS), which controls the concentration of the load among the worker threads. It allows the number of spawned but not yet processed tasks to be used as a metric for the resource requirements. Because this metric measures work to be executed in the future instead of past behavior, DVS is versatile enough to handle very irregular workloads.
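The metric described in the abstract, the number of spawned but not yet processed tasks, can be illustrated with a minimal Python sketch. This is not the method from the thesis: the function names, the one-worker-per-`tasks_per_worker` policy, and the clamping bounds are all assumptions made for illustration.

```python
from collections import deque


def pending_tasks(queues):
    """Count tasks that have been spawned but not yet processed."""
    return sum(len(q) for q in queues)


def desired_workers(queues, max_workers, tasks_per_worker=2):
    """Hypothetical policy (an assumption, not the thesis's algorithm):
    request one worker per `tasks_per_worker` pending tasks,
    clamped to the range [1, max_workers]."""
    pending = pending_tasks(queues)
    wanted = -(-pending // tasks_per_worker)  # ceiling division
    return max(1, min(max_workers, wanted))


# Three per-worker task queues holding 5, 0 and 3 pending tasks.
queues = [deque(range(5)), deque(), deque(range(3))]
request = desired_workers(queues, max_workers=16)
```

Because the estimate is computed from queued work rather than from past utilization, it reacts as soon as tasks are spawned or consumed, which is the property the abstract attributes to the metric.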

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2013. vi, 29 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 13:15
Keyword [en]
parallel, workload, runtime, task, adaptive, resource management, load balancing, work-stealing
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
SRA - ICT
Identifiers
URN: urn:nbn:se:kth:diva-127708
ISBN: 978-91-7501-816-4 (print)
OAI: oai:DiVA.org:kth-127708
DiVA: diva2:645441
Presentation
2013-09-27, Sal/Hall D, Forum, KTH-ICT, Isafjordsgatan 39, Kista, 12:10 (English)
Opponent
Supervisors
Note

QC 20130910

Available from: 2013-09-10 Created: 2013-09-04 Last updated: 2013-09-17. Bibliographically approved
List of papers
1. Resource management for task-based parallel programs over a multi-kernel. BIAS: Barrelfish Inter-core Adaptive Scheduling
2012 (English). In: Proceedings of the 2012 workshop on Runtime Environments, Systems, Layering and Virtualized Environments (RESoLVE’12), Association for Computing Machinery (ACM), 2012, 32-36 p. Conference paper, Published paper (Refereed)
Abstract [en]

To address the resource contention created by multiple parallel applications running simultaneously, we propose a space-sharing, two-level, adaptive scheduler for the Barrelfish operating system.

The first level is system-wide, runs close to the OS kernel, and has knowledge of the available resources, while the second level, integrated into the application’s runtime, is aware of the type and amount of the application's parallelism. Feedback on efficiency from the second level to the first level allows the latter to adaptively modify the allotment of cores (domain), intelligently promoting space-sharing of resources while still allowing time-sharing when needed.

In order to avoid excess inter-core communication, the system-level scheduler is designed as a distributed service, taking advantage of the message-passing nature of Barrelfish. The processor topology is partitioned so that each instance of the scheduler handles an appropriately sized subset of cores.

Malleability is achieved by suspending worker threads. Two different methodologies are introduced and explained, each suitable for distinct programming models and applications.

Preliminary results are promising and show minimal added overhead. In specific multiprogramming configurations, initial experiments showed a significant performance improvement from avoiding contention.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2012
Keyword
Scheduling, parallel programming, multicore, manycore
National Category
Software Engineering
Research subject
SRA - ICT
Identifiers
urn:nbn:se:kth:diva-107665 (URN)
Conference
RESoLVE '12, Second workshop on Runtime Environments, Systems, Layering and Virtualized Environments, London UK, March 3, 2012.
Projects
Barrelfish
Funder
Swedish e‐Science Research Center
Note

QC 20130116

Available from: 2013-01-16 Created: 2012-12-14 Last updated: 2013-09-10. Bibliographically approved
2. Automatic Adaptation of Resources to Workload requirements in Nested Fork-join Programming Model
2012 (English). Report (Other academic)
Abstract [en]

We provide a work-stealing scheduling method for nested fork/join parallelism that is mathematically proven to self-adapt the resource allocation of multiprogrammed applications to the current workloads’ individual needs while taking the available resources into account. The scheduling method scales the allocated resources up when needed and down when possible.

The theoretical model has been implemented in the Barrelfish distributed multikernel operating system and demonstrated to function on a simulated x86-64 multicore platform.

The work presented here is the first step towards a complete framework for the system-wide scheduling and load balancing of multiprogrammed many-core systems, assuming a variety of workload types and guaranteeing at least average execution for each running program.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2012. 12 p.
Series
TRITA-ICT/ECS R, ISSN 1653-7238 ; 12:04
Keyword
Scheduling, parallel programming, multicore, manycore
National Category
Software Engineering
Research subject
SRA - ICT
Identifiers
urn:nbn:se:kth:diva-107668 (URN)
KTH/ICT/ECS/R-12-04-SE (ISRN)
Funder
Swedish e‐Science Research Center
Note

QC 20130109

Available from: 2013-01-09 Created: 2012-12-14 Last updated: 2013-09-10. Bibliographically approved
3. DVS: Deterministic Victim Selection to Improve Performance in Work-Stealing Schedulers
2014 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Task-centric programming models offer a versatile method for exposing parallelism. Such programs are commonly deployed using work-stealing scheduling runtimes. Work-stealers have traditionally employed randomness-dependent techniques, which are considered optimal for several execution configurations. We have identified certain inefficiencies and room for improvement on emerging parallel architectures and workloads of fluctuating parallelism. Our deterministic victim selection (DVS) for work-stealing schedulers was designed to provide a controllable and predictable uniform distribution of tasks without degrading performance; stealing is restricted to specific pairs of workers. We show experimentally that DVS offers improved scalability and performance for irregular workloads. We demonstrate DVS on the Linux and Barrelfish operating systems, using a 48-core Opteron system and a simulated ideal platform, respectively. On real hardware, we observed better scaling and 13% average performance gains, with up to 55% for specific irregular workloads.
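The contrast with random victim selection can be sketched in a few lines of Python. The abstract states only that stealing is restricted to specific pairs of workers; the ring-shaped pairing, the `fanout` parameter, and both function names below are assumptions for illustration and do not reproduce the pairing rules of the paper.

```python
from collections import deque


def build_victim_table(num_workers, fanout=2):
    """Hypothetical fixed pairing: worker w may steal only from the next
    `fanout` workers on a ring. Requires fanout < num_workers so that a
    worker never lists itself as a victim."""
    return {
        w: [(w + k) % num_workers for k in range(1, fanout + 1)]
        for w in range(num_workers)
    }


def steal(thief, queues, victims):
    """Probe the thief's fixed victims in order and take the oldest task
    from the first non-empty queue. Returns (victim, task) or None."""
    for v in victims[thief]:
        if queues[v]:
            return v, queues[v].popleft()
    return None


victims = build_victim_table(4)
queues = [deque(), deque(["a", "b"]), deque(), deque()]
first = steal(3, queues, victims)   # worker 3 probes workers 0, then 1
second = steal(0, queues, victims)  # worker 0 probes workers 1, then 2
third = steal(2, queues, victims)   # worker 2 probes workers 3, then 0
```

Since every worker probes a fixed, known set of victims, the path a task can take between workers is predictable, which is the kind of controllability the abstract refers to.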

National Category
Software Engineering
Research subject
SRA - ICT
Identifiers
urn:nbn:se:kth:diva-128183 (URN)
Conference
MULTIPROG 2014: Programmability Issues for Heterogeneous Multicores
Note

QC 20140403

Available from: 2013-09-10 Created: 2013-09-10 Last updated: 2014-04-03. Bibliographically approved
4. Palirria: Accurate on-line parallelism estimation for adaptive work-stealing
2014 (English). In: PMAM'14 Proceedings of Programming Models and Applications on Multicores and Manycore, ACM Press, 2014, 120-130 p. Conference paper, Published paper (Refereed)
Abstract [en]

We present Palirria, a self-adapting work-stealing scheduling method for nested fork/join parallelism that can be used to estimate the number of utilizable workers and self-adapt accordingly. The estimation mechanism is optimized for accuracy, minimizing the requested resources without degrading performance. We implemented Palirria for both the Linux and Barrelfish operating systems and evaluated it on two platforms: a 48-core NUMA multiprocessor and a simulated 32-core system. Compared to the state of the art, we observed higher accuracy in estimating resource requirements. This leads to improved resource utilization and to performance on par with, or better than, execution with fixed resource allotments.

Place, publisher, year, edition, pages
ACM Press, 2014
Keyword
parallel, workload, runtime, task, adaptive, resource management, load balancing, work-stealing
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-128184 (URN)
2-s2.0-84897716570 (Scopus ID)
978-1-4503-2657-5 (ISBN)
Conference
2014 International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2014; Orlando, FL; United States; 15 February 2014 through 15 February 2014
Note

QC 20140520

Available from: 2013-09-10 Created: 2013-09-10 Last updated: 2017-04-28. Bibliographically approved

Open Access in DiVA

Georgios_Varisteas_Lic.pdf (211 kB), 476 downloads
File information
File name: FULLTEXT02.pdf
File size: 211 kB
Checksum (SHA-512):
c1b50dc72f585c6f5b457135580094ce86ea3798ebab09077a2e088e9a4ec76d9fe16b7d02cbff80c00eb29506703aed520c009470d536460d76266793875204
Type: fulltext
Mimetype: application/pdf

Authority records BETA

Varisteas, Georgios
