Publications (9 of 9)
Javed Awan, A., Ohara, M., Ayguade, E., Ishizaki, K., Brorsson, M. & Vlassov, V. (2017). Identifying the potential of Near Data Processing for Apache Spark. In: Proceedings of the International Symposium on Memory Systems, MEMSYS 2017. Paper presented at Proceedings of the International Symposium on Memory Systems, MEMSYS 2017, Alexandria, VA, USA, October 02 - 05, 2017 (pp. 60-67). Association for Computing Machinery (ACM), Article ID F131197.
2017 (English). In: Proceedings of the International Symposium on Memory Systems, MEMSYS 2017, Association for Computing Machinery (ACM), 2017, pp. 60-67, article ID F131197. Conference paper, Published paper (Refereed)
Abstract [en]

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Processing (NDP) due to technological advancement in the last decade. However, it is not known whether NDP architectures can improve the performance of big data processing frameworks such as Apache Spark. In this paper, we build the case for an NDP architecture comprising programmable logic based hybrid 2D integrated processing-in-memory and in-storage processing for Apache Spark, by extensive profiling of Apache Spark based workloads on an Ivy Bridge server.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2017
Keywords
Processing-in-memory, In-storage Processing, Apache Spark
National subject category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-211727 (URN); 10.1145/3132402.3132427 (DOI); 000557248700006 (); 2-s2.0-85033586379 (Scopus ID)
Conference
Proceedings of the International Symposium on Memory Systems, MEMSYS 2017, Alexandria, VA, USA, October 02 - 05, 2017
Note

ISBN for proceedings: 9781450353359

QC 20171124

QC 20210518

Available from: 2017-08-11 Created: 2017-08-11 Last updated: 2023-03-06. Bibliographically approved
Awan, A. J. (2017). Project Night-King: Improving the performance of big data analytics using Near Data Computing Architectures.
2017 (English). Other (Other (popular science, discussion, etc.)) [Artistic work]
Abstract [en]

The goal of Project Night-King is to improve the single-node performance of scale-out big data processing frameworks like Apache Spark using programmable accelerators near DRAM and NVRAM. Using modeling techniques, we estimate a lower bound of 5x performance improvement for Spark MLlib workloads.

Publisher
p. 1
Keywords
Near Data Processing Architecture, Apache Spark, In-Storage Processing, Processing-in-Memory
National subject category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
SRA - Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-216962 (URN)
Note

QC 20171031

Available from: 2017-10-25 Created: 2017-10-25 Last updated: 2024-03-18. Bibliographically approved
Awan, A. J. (2016). Accelerating Apache Spark with Fixed Function Hardware Accelerators Near DRAM and NVRAM.
2016 (English). Other (Refereed)
Publisher
p. 1
Keywords
Apache Spark, Hardware Acceleration
National subject category
Embedded Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-213728 (URN); 10.13140/RG.2.2.17593.26724 (DOI)
Note

QC 20170906

Available from: 2017-09-05 Created: 2017-09-05 Last updated: 2024-03-18. Bibliographically approved
Awan, A. J., Brorsson, M., Vlassov, V. & Ayguade, E. (2016). Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads. Paper presented at The 6th IEEE International Conference on Big Data and Cloud Computing (pp. 59-66). IEEE.
2016 (English). Conference paper, Published paper (Refereed)
Abstract [en]

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing have the same micro-architectural behavior in Spark if the only difference between the two implementations is micro-batching. If the input data rates are small, stream processing workloads are front-end bound. However, the front-end bound stalls are reduced at larger input data rates and instruction retirement is improved. Moreover, Spark workloads using DataFrames have improved instruction retirement over workloads using RDDs.

Place, publisher, year, edition, pages
IEEE, 2016
Keywords
Microarchitectural Performance, Spark Streaming, Workload Characterization
National subject category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-196123 (URN); 10.1109/BDCloud-SocialCom-SustainCom.2016.20 (DOI); 000392516300009 (); 2-s2.0-85000885440 (Scopus ID)
Conference
The 6th IEEE International Conference on Big Data and Cloud Computing
Note

QC 20161130

Available from: 2016-11-11 Created: 2016-11-11 Last updated: 2024-03-15. Bibliographically approved
Awan, A. J., Brorsson, M., Vlassov, V. & Ayguade, E. (2016). Node architecture implications for in-memory data analytics on scale-in clusters. Paper presented at 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (pp. 237-246). IEEE Press.
2016 (English). Conference paper, Published paper (Refereed)
Abstract [en]

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics. Recent studies propose scale-in clusters with in-storage processing devices to process big data analytics with Spark. However, the proposal is based solely on the memory bandwidth characterization of in-memory data analytics and does not shed light on the specification of the host CPU and memory. Through empirical evaluation of in-memory data analytics with Apache Spark on an Ivy Bridge dual socket server, we have found that (i) simultaneous multi-threading is effective up to 6 cores, (ii) data locality on NUMA nodes can improve the performance by 10% on average, (iii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, (iv) DDR3 operating at 1333 MT/s is sufficient, and (v) multiple small executors can provide up to 36% speedup over a single large executor.

Place, publisher, year, edition, pages
IEEE Press, 2016
National subject category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-198161 (URN); 10.1145/3006299.3006319 (DOI); 000408919800026 (); 2-s2.0-85013223047 (Scopus ID)
Conference
3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
Note

QC 20161219

Available from: 2016-12-13 Created: 2016-12-13 Last updated: 2024-03-15. Bibliographically approved
Awan, A. J. (2016). Performance Characterization of In-Memory Data Analytics on a Scale-up Server. (Licentiate dissertation). KTH Royal Institute of Technology
2016 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers.

Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread level load imbalance and work-time inflation. We have also found that workloads are bound by the latency of frequent data accesses to DRAM. As the input data size grows, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization).

For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For GC impact, we match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x, and we recommend using multiple small executors, which can provide up to 36% speedup over a single large executor.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2016. p. 111
Series
TRITA-ICT ; 2016:07
National subject category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-185581 (URN); 978-91-7595-926-9 (ISBN)
Presentation
2016-05-23, Ka-210, Electrum 229, Kista, Stockholm, 09:15 (English)
Opponent
Supervisors
Note

QC 20160425

Available from: 2016-04-25 Created: 2016-04-22 Last updated: 2022-06-22. Bibliographically approved
Awan, A. J., Brorsson, M., Vlassov, V. & Ayguade, E. (2015). How Data Volume Affects Spark Based Data Analytics on a Scale-up Server. In: Big Data Benchmarks, Performance Optimization, and Emerging Hardware: 6th Workshop, BPOE 2015, Kohala, HI, USA, August 31 - September 4, 2015. Revised Selected Papers. Paper presented at 6th International Workshop on Bigdata Benchmarks, Performance Optimization and Emerging Hardware (BpoE), held in conjunction with 41st International Conference on Very Large Data Bases (VLDB), Kohala, HI, USA, August 31 - September 4, 2015 (pp. 81-92). Springer, Vol. 9495.
2015 (English). In: Big Data Benchmarks, Performance Optimization, and Emerging Hardware: 6th Workshop, BPOE 2015, Kohala, HI, USA, August 31 - September 4, 2015. Revised Selected Papers, Springer, 2015, Vol. 9495, pp. 81-92. Conference paper, Published paper (Refereed)
Abstract [en]

The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark based data analytics in a scale-up configuration is not well understood. We present a deep-dive analysis of Spark based applications on a large scale-up server machine. Our analysis reveals that Spark based data analytics are DRAM bound and do not benefit from using more than 12 cores for an executor. As the input data size grows, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x.

Place, publisher, year, edition, pages
Springer, 2015
Series
Lecture Notes in Computer Science
National subject category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-181325 (URN); 10.1007/978-3-319-29006-5_7 (DOI); 2-s2.0-84958073801 (Scopus ID); 978-3-319-29005-8 (ISBN)
Conference
6th International Workshop on Bigdata Benchmarks, Performance Optimization and Emerging Hardware (BpoE), held in conjunction with 41st International Conference on Very Large Data Bases (VLDB), Kohala, HI, USA, August 31 - September 4, 2015
Note

QC 20160224

Available from: 2016-02-01 Created: 2016-02-01 Last updated: 2024-03-15. Bibliographically approved
Javed Awan, A., Brorsson, M., Vlassov, V. & Ayguade, E. (2015). Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server. In: Proceedings - 2015 IEEE 5th International Conference on Big Data and Cloud Computing, BDCloud 2015. Paper presented at Big Data and Cloud Computing (BDCloud), 2015 IEEE Fifth International Conference on, Dalian, China, 26-28 Aug. 2015 (pp. 1-8). IEEE Computer Society, Article ID 7310708.
2015 (English). In: Proceedings - 2015 IEEE 5th International Conference on Big Data and Cloud Computing, BDCloud 2015, IEEE Computer Society, 2015, pp. 1-8, article ID 7310708. Conference paper, Published paper (Refereed)
Abstract [en]

In the last decade, data analytics have rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread level load imbalance. Further, at the micro-architecture level, we observe memory bound latency to be the major cause of work time inflation.

Place, publisher, year, edition, pages
IEEE Computer Society, 2015
Keywords
cloud chambers, cloud computing, data analysis, resource allocation, storage management, Apache Spark framework, Spark workload, data analysis workload, disk-based processing, in-memory data analytics, in-memory processing, memory bound latency, microarchitecture level performance, modern cloud server, performance characterization, single node NUMA machine, thread level load imbalance, work time inflation, workload scalability, Benchmark testing, Big data, Data analysis, Instruction sets, Scalability, Servers, Sparks, Data Analytics, NUMA, Spark Performance, Workload Characterization
National subject category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-179403 (URN); 10.1109/BDCloud.2015.37 (DOI); 000380444200001 (); 2-s2.0-84962757128 (Scopus ID); 978-1-4673-7182-7 (ISBN)
Conference
Big Data and Cloud Computing (BDCloud), 2015 IEEE Fifth International Conference on, Dalian, China, 26-28 Aug. 2015
Note

QC 20160118 QC 20160922

Available from: 2015-12-16 Created: 2015-12-16 Last updated: 2024-03-15. Bibliographically approved
Awan, A. J., Brorsson, M., Vlassov, V. & Ayguade, E. Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing workloads have similar micro-architectural characteristics and are bounded by the latency of frequent data access to DRAM. For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, and (iii) multiple small executors can provide up to 36% speedup over a single large executor.

Keywords
Performance Characterization, Apache Spark, Micro-architecture
National subject category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-185580 (URN)
Note

QC 20160425

Available from: 2016-04-22 Created: 2016-04-22 Last updated: 2023-03-06. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-7510-6286
