Change search
ReferencesLink to record
Permanent link

Direct link
Efficient reduction over threads
KTH, School of Engineering Sciences (SCI), Theoretical Physics.
2011 (English)Independent thesis Basic level (university diploma), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

The increasing number of cores in both desktops and servers leads to a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element wise. Several reduce algorithms are evaluated in terms of performance and scalability and a novel algorithm is introduced that takes advantage of shared memory and exploits load imbalance. To do so, the concept of dynamic pair generation is introduced which implies constructing a binary reduce tree dynamically based on the order of thread arrival, where pairs are formed in a lock-free manner. We conclude that the dynamic algorithm, given enough spread in the arriving times, can outperform the reference algorithms for some or all array sizes.

Place, publisher, year, edition, pages
2011. , 43 p.
Trita-FYS, ISSN 0280-316X ; 57
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-49818OAI: diva2:460393
Educational program
Master of Science in Engineering - Computer Science and Technology
Available from: 2011-11-30 Created: 2011-11-30 Last updated: 2011-11-30Bibliographically approved

Open Access in DiVA

fulltext(2413 kB)217 downloads
File information
File name FULLTEXT01.pdfFile size 2413 kBChecksum SHA-512
Type fulltextMimetype application/pdf

By organisation
Theoretical Physics
Computer and Information Science

Search outside of DiVA

GoogleGoogle Scholar
Total: 217 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 255 hits
ReferencesLink to record
Permanent link

Direct link