MDX on Hadoop: A case study on OLAP for Big Data
KTH, School of Information and Communication Technology (ICT).
2015 (English). Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Online Analytical Processing (OLAP) is a method for analyzing data within business intelligence and data mining, using n-dimensional hypercubes. These cubes store aggregates over multiple dimensions of the data and are traditionally computed from a dimensional relational model in SQL databases, known as a star schema. Multidimensional Expressions (MDX) are a type of query commonly used by BI tools to query OLAP cubes.

This thesis investigates ways to run online, OLAP-like queries against a dimensional relational model hosted in a Hadoop cluster. The evaluation compares Hive-on-Spark and Hive-on-Tez across various formats. The most significant conclusions are that Hive-on-Tez delivers better performance than Hive-on-Spark, and that ORC appears to be the best-performing format. It could not be demonstrated that sub-20-second performance is achievable for all queries with the given setup and dataset, or that the order of the input data significantly affects the performance of the ORC format. Scaling seems fairly linear for a cluster of 3 nodes. Nor could it be demonstrated that Hive indexes or bucketing improve performance.

Place, publisher, year, edition, pages
2015. 55 p.
Series
TRITA-ICT-EX, 2015:98
National Category
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-177468
OAI: oai:DiVA.org:kth-177468
DiVA: diva2:872941
Available from: 2015-11-25. Created: 2015-11-20. Last updated: 2017-06-15. Bibliographically approved.

Open Access in DiVA

fulltext (1942 kB), 50 downloads
File information
File name: FULLTEXT01.pdf
File size: 1942 kB
Checksum (SHA-512): c6c3558cfb70382b5d70925d1770e95b0f61ae1f10c101505b730dceda87b4082b183e1c23b8e94615a808fa8773b0c27f2bfafa4e0121b01db12607de9bafe3
Type: fulltext
Mimetype: application/pdf


Total: 50 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 449 hits