Simplifying workflows for Hops
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
One of the most common questions regarding Big Data is what to do with the vast amounts of data once they have been collected. These amounts exceed the storage and processing capacity of typical current equipment, and also the technical abilities of average users. This last issue in particular presents a series of challenges that must be addressed for workflow engines to be intuitive and simple for non-power users, yet highly customizable for those with greater expertise.
This thesis addresses the problem of processing big data at scale. The proposed solution involves setting up a workflow engine that can integrate with the wide variety of tools and applications needed to build solutions inside the Hadoop ecosystem. It is demonstrated that, with the inclusion of Oozie in the Hops platform, it is possible to set up workflow definitions in which Hops users define how to execute and manage their data workflows.
Place, publisher, year, edition, pages
2016. 44 p.
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-205296
OAI: oai:DiVA.org:kth-205296
DiVA: diva2:1088358
Subject / course
Master of Science - School of Electrical Engineering (EES) - Master of Science - Research on Information and Communication Technologies
Haridi, Seif, Professor
Dowling, Jim, Senior Lecturer