Towards reproducible blocked LU factorization
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
2017 (English). In: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 1598-1607, article id 7965230. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, we address the problem of reproducibility of the blocked LU factorization on GPUs, which arises from cancellations and rounding errors in floating-point arithmetic. Thanks to the hierarchical structure of linear algebra libraries, the computations carried out within this operation can be expressed in terms of the Level-3 BLAS routines and the unblocked variant of the factorization, while the latter is in turn built upon the Level-1/2 BLAS kernels. In addition, we strengthen the numerical stability of the blocked LU factorization via partial row pivoting. We therefore propose a double-layer bottom-up approach for ensuring reproducibility of the blocked LU factorization and provide experimental results for its underlying blocks.
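The layering described in the abstract can be illustrated with a small sketch. The pure-Python code below is not the authors' GPU implementation and makes no attempt at reproducibility; it only shows, under the assumption of a right-looking variant, how a blocked LU factorization with partial row pivoting splits into a panel step (the unblocked variant), a triangular solve (the role of TRSM), and a trailing update (the role of GEMM). The function names `lu_unblocked` and `lu_blocked` and the block size `b` are hypothetical.

```python
# Illustrative sketch only: names and the block size are assumptions,
# and plain Python loops stand in for the BLAS kernels.

def lu_unblocked(A, piv, k0, k1):
    """Unblocked LU with partial row pivoting on panel columns k0..k1-1.

    Works in place; row swaps are applied to full rows of A and to piv.
    """
    n = len(A)
    for j in range(k0, k1):
        # Partial pivoting: pick the largest magnitude in column j.
        p = max(range(j, n), key=lambda i: abs(A[i][j]))
        if A[p][j] == 0.0:
            raise ValueError("matrix is singular")
        if p != j:
            A[j], A[p] = A[p], A[j]
            piv[j], piv[p] = piv[p], piv[j]
        for i in range(j + 1, n):
            # Scale the column below the pivot (Level-1 style).
            A[i][j] /= A[j][j]
            # Rank-1 update restricted to the panel (Level-2 style).
            for c in range(j + 1, k1):
                A[i][c] -= A[i][j] * A[j][c]


def lu_blocked(A, b=2):
    """Right-looking blocked LU; returns (LU, piv) with L*U = A[piv]."""
    n = len(A)
    A = [row[:] for row in A]  # factor a copy, leave the input untouched
    piv = list(range(n))
    for k0 in range(0, n, b):
        k1 = min(k0 + b, n)
        # 1) Panel factorization with the unblocked variant.
        lu_unblocked(A, piv, k0, k1)
        # 2) Triangular solve (TRSM role): U12 = inv(L11) * A12.
        for i in range(k0, k1):
            for c in range(k1, n):
                for t in range(k0, i):
                    A[i][c] -= A[i][t] * A[t][c]
        # 3) Trailing update (GEMM role): A22 -= L21 * U12.
        for i in range(k1, n):
            for c in range(k1, n):
                s = 0.0
                for t in range(k0, k1):
                    s += A[i][t] * A[t][c]
                A[i][c] -= s
    return A, piv
```

In this sketch the returned matrix stores the unit lower triangular factor below the diagonal and the upper triangular factor on and above it, and `piv[i]` is the index of the original row that ends up in position `i`. The inner accumulations in steps 2 and 3 are exactly the kind of floating-point sums whose result can depend on evaluation order.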

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017. p. 1598-1607, article id 7965230
Series
IEEE International Symposium on Parallel and Distributed Processing Workshops, ISSN 2164-7062
Keywords [en]
BLAS, error-free transformation, floating-point expansion, GPUs, long accumulator, LU factorization, Reproducibility
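As a hedged illustration of two of these keywords, the sketch below shows why floating-point summation is order-dependent and what an error-free transformation recovers. It uses Knuth's TwoSum and Python's standard `math.fsum` as a correctly rounded reference; it is not the long-accumulator or floating-point-expansion scheme of the paper, and the helper names `two_sum` and `naive_sum` are hypothetical.

```python
import math


def two_sum(a, b):
    """Knuth's TwoSum: s is the rounded sum, e the exact rounding error."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def naive_sum(xs):
    """Left-to-right summation; the result depends on the input order."""
    total = 0.0
    for x in xs:
        total += x
    return total


values = [1e16, 1.0, -1e16, 1.0]
permuted = [1e16, -1e16, 1.0, 1.0]

# The same numbers in a different order give different naive results.
print(naive_sum(values), naive_sum(permuted))

# A correctly rounded sum does not depend on the order.
print(math.fsum(values), math.fsum(permuted))

# TwoSum captures the part of 1.0 that is lost when added to 1e16.
print(two_sum(1e16, 1.0))
```

On a parallel device such as a GPU the summation order can change from run to run, which is why approaches that capture or avoid the rounding error are of interest for reproducibility.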
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-213534; DOI: 10.1109/IPDPSW.2017.94; ISI: 000417418900173; Scopus ID: 2-s2.0-85028054787; ISBN: 9781538634080; OAI: oai:DiVA.org:kth-213534; DiVA, id: diva2:1137823
Conference
31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017, Orlando, United States, 29 May 2017 through 2 June 2017
Note

QC 20170901

Available from: 2017-09-01 Created: 2017-09-01 Last updated: 2018-01-29 Bibliographically approved

Authors
Iakymchuk, Roman; Laure, Erwin