Performance evaluation of advanced features in CUDA unified memory
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS. ORCID iD: 0000-0001-6408-3333
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0003-0639-0639
2019 (English). In: Proceedings of MCHPC 2019: Workshop on Memory Centric High Performance Computing - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis, Institute of Electrical and Electronics Engineers Inc., 2019, p. 50-57. Conference paper, Published paper (Refereed)
Abstract [en]

CUDA Unified Memory improves GPU programmability and also enables GPU memory oversubscription. Recently, two advanced memory features, memory advises and asynchronous prefetch, have been introduced. In this work, we evaluate the new features on two platforms that feature different CPUs, GPUs, and interconnects. We derive a benchmark suite for the experiments and stress the memory system to evaluate both in-memory and oversubscription performance. The results show that memory advises on the Intel-Volta/Pascal-PCIe platform bring negligible improvement for in-memory executions. However, when GPU memory is oversubscribed by about 50%, using memory advises results in up to 25% performance improvement compared to basic CUDA Unified Memory. In contrast, the Power9-Volta-NVLink platform can substantially benefit from memory advises, achieving up to 34% performance gain for in-memory executions. However, when GPU memory is oversubscribed on this platform, using memory advises increases GPU page faults and results in considerable performance loss. CUDA prefetch also has a different performance impact on the two platforms. It improves performance by up to 50% on the Intel-Volta/Pascal-PCIe platform but brings little benefit to the Power9-Volta-NVLink platform.
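
For readers unfamiliar with the two features named in the abstract, the following is a minimal sketch of how memory advises and asynchronous prefetch are typically applied to a managed allocation. It is not the authors' benchmark code: the kernel `scale`, the helper `run_demo`, the array size, and the particular advice values are illustrative assumptions, and the abstract does not state which advice values were evaluated. The calls use the `cudaMemAdvise` and `cudaMemPrefetchAsync` signatures that take an integer device ID, as found in CUDA toolkits up to 12.x; newer toolkits may expect a location-struct variant instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel (hypothetical): multiply every element by a constant.
__global__ void scale(float* x, size_t n, float a) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Hypothetical helper: returns the number of wrong elements, or -1 on a CUDA error.
long run_demo(size_t n) {
    const size_t bytes = n * sizeof(float);
    int dev = 0;
    int concurrent = 0;  // nonzero if the GPU supports concurrent managed access
    float* x = nullptr;

    if (cudaGetDevice(&dev) != cudaSuccess) return -1;
    if (cudaMallocManaged(&x, bytes) != cudaSuccess) return -1;
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);

    // Initialise on the host, so the pages are first touched by the CPU.
    for (size_t i = 0; i < n; ++i) x[i] = 1.0f;

    if (concurrent) {
        // Memory advises (assumed choice): prefer GPU residency, and keep
        // the range mapped for the CPU to reduce CPU-side page faults.
        cudaMemAdvise(x, bytes, cudaMemAdviseSetPreferredLocation, dev);
        cudaMemAdvise(x, bytes, cudaMemAdviseSetAccessedBy, cudaCpuDeviceId);
        // Asynchronous prefetch: migrate the range to the GPU ahead of the
        // kernel instead of relying on on-demand page faults.
        cudaMemPrefetchAsync(x, bytes, dev, 0);
    }

    const unsigned threads = 256;
    const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);
    scale<<<blocks, threads>>>(x, n, 2.0f);

    // Prefetch the results back to host memory before the CPU reads them.
    if (concurrent) cudaMemPrefetchAsync(x, bytes, cudaCpuDeviceId, 0);
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(x); return -1; }

    long wrong = 0;
    for (size_t i = 0; i < n; ++i) {
        if (x[i] != 2.0f) ++wrong;
    }
    cudaFree(x);
    return wrong;
}

int main() {
    const long wrong = run_demo(static_cast<size_t>(1) << 20);
    std::printf("mismatched elements: %ld\n", wrong);
    return wrong == 0 ? 0 : 1;
}
```

The hints are skipped when the device does not report concurrent managed access, since advise and prefetch calls are expected to fail there. Removing the `cudaMemAdvise` and `cudaMemPrefetchAsync` calls leaves a basic Unified Memory program of the kind the abstract uses as its baseline.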

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019. p. 50-57
Keywords [en]
CUDA memory hints, CUDA Unified Memory, GPU, Memory oversubscription, UVM, Graphics processing unit, Memory architecture, Program processors, Benchmark suites, Memory systems, Performance Gain, Performance impact, Performance loss, Prefetches, Benchmarking
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-268024
DOI: 10.1109/MCHPC49590.2019.00014
ISI: 000529056800007
Scopus ID: 2-s2.0-85078538028
ISBN: 9781728160078 (print)
OAI: oai:DiVA.org:kth-268024
DiVA, id: diva2:1417324
Conference
2019 IEEE/ACM Workshop on Memory Centric High Performance Computing, MCHPC@SC 2019, Denver, CO, USA, November 18, 2019
Note

QC 20200327

Available from: 2020-03-27 Created: 2020-03-27 Last updated: 2020-05-25 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records BETA

Chien, Wei Der; Markidis, Stefano
