Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Characterizing Deep-Learning I/O Workloads in TensorFlow
KTH, School of Electrical Engineering and Computer Science (EECS), Computational Science and Technology (CST).ORCID iD: 0000-0003-0639-0639
KTH, School of Electrical Engineering and Computer Science (EECS), Computational Science and Technology (CST).
Show others and affiliations
2018 (English)In: Proceedings of PDSW-DISCS 2018: 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 54-63Conference paper, Published paper (Refereed)
Abstract [en]

The performance of Deep-Learning (DL) computing frameworks rely on the rformance of data ingestion and checkpointing. In fact, during the aining, a considerable high number of relatively small files are first aded and pre-processed on CPUs and then moved to accelerator for mputation. In addition, checkpointing and restart operations are rried out to allow DL computing frameworks to restart quickly from a eckpoint. Because of this, I/O affects the performance of DL plications. this work, we characterize the I/O performance and scaling of nsorFlow, an open-source programming framework developed by Google and ecifically designed for solving DL problems. To measure TensorFlow I/O rformance, we first design a micro-benchmark to measure TensorFlow ads, and then use a TensorFlow mini-application based on AlexNet to asure the performance cost of I/O and checkpointing in TensorFlow. To prove the checkpointing performance, we design and implement a burst ffer. find that increasing the number of threads increases TensorFlow ndwidth by a maximum of 2.3 x and 7.8 x on our benchmark environments. e use of the tensorFlow prefetcher results in a complete overlap of mputation on accelerator and input pipeline on CPU eliminating the fective cost of I/O on the overall performance. The use of a burst ffer to checkpoint to a fast small capacity storage and copy ynchronously the checkpoints to a slower large capacity storage sulted in a performance improvement of 2.6x with respect to eckpointing directly to slower storage on our benchmark environment.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2018. p. 54-63
Keywords [en]
Parallel I/O, Input Pipeline, Deep Learning, TensorFlow
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:kth:diva-248377DOI: 10.1109/PDSW-DISCS.2018.00011ISI: 000462205000006Scopus ID: 2-s2.0-85063062239OAI: oai:DiVA.org:kth-248377DiVA, id: diva2:1302569
Conference
3rd IEEE/ACM Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems, PDSW-DISCS 2018; Dallas; United States; 12 November 2018
Note

QC 20190405

Available from: 2019-04-05 Created: 2019-04-05 Last updated: 2019-04-05Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records BETA

Markidis, StefanoSishtla, Chaitanya PrasadLaure, Erwin

Search in DiVA

By author/editor
Chien, Steven W. D.Markidis, StefanoSishtla, Chaitanya PrasadHerman, PawelLaure, Erwin
By organisation
Computational Science and Technology (CST)Centre for High Performance Computing, PDC
Computer Engineering

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 515 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf