Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Algorithm Construction for GPGPU.
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2012 (English)Independent thesis Advanced level (professional degree), 10 credits / 15 HE creditsStudent thesis
Abstract [en]

Today every personal computer and almost every work-related computer has a GPU powerful enough to be used as a supplementary computational device. One framework which enables utilization of this is called OpenCL. We asked the question how one writes efficient algorithms on these GPGPU devices. We found that there are two major ways to run code concurrently, data parallel and task parallel. The runtime of an algorithm depends on the amount of the code that can be run in parallel rather than on the number of processing elements available on the device. We decided to test the theory by implementing parallel algorithms for matrix multiplications and integer sorting with radix sort. The results of the tests can be summarized as both good and bad. Even though there is a promising gain in performance there are also factors like memory accessing which is not always easy to control.

Abstract [sv]

Idag har alla persondatorer och nästan alla arbetsrelaterade datorer en GPU kraftig nog att utnyttjas som en extra beräkningsenhet. Ett ramverk som möjliggör utnyttjandet av detta kallas OpenCL. Vi ställde oss frågan hur man skriver effektiva algoritmer till dessa GPGPU-enheter. Vi fann att det finns två sätt i huvudsak att köra kod parallellt på, data- samt uppgiftsparallellt. Körtiden av en algoritm beror snarare på andelen kod som kan köras parallellt än antalet processorelemet som finns tillgängliga på enheten. Vi bestämde oss för att testa teorin genom att implementera parallella algoritmer för matrismultiplikation och sortering av heltal med radix sort. Resultaten av testerna kan summeras till både bra och dåliga. Trots att det finns lovande vinst i prestanda så finns också faktorer som minnesåtkomster som inte alltid är lätt att kontrollera.

Place, publisher, year, edition, pages
2012.
Series
Kandidatexjobb CSC, K12038
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-131043OAI: oai:DiVA.org:kth-131043DiVA: diva2:654489
Educational program
Master of Science in Engineering - Computer Science and Technology
Uppsok
Technology
Supervisors
Examiners
Available from: 2013-10-07 Created: 2013-10-07

Open Access in DiVA

No full text

Other links

http://www.csc.kth.se/utbildning/kandidatexjobb/datateknik/2012/rapport/hossjer_simon_OCH_svanstrom_mattias_K12038.pdf
By organisation
School of Computer Science and Communication (CSC)
Computer Science

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 73 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf