Topology-Aware Placement of Stream Processing Components on Geographically Distributed Virtualized Environments
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Distributed stream processing systems are typically deployed within a single data center in order to achieve high-performance, low-latency computation. The data streams analyzed by such systems are expected to be available in the same data center. Either the data streams are generated within the data center (e.g., logs, transactions, user clicks) or they are aggregated by external systems from various sources and buffered into the data center for processing (e.g., IoT, sensor data, traffic information).

The data center approach to stream processing analytics fits the requirements of the majority of applications that exist today. However, for latency-sensitive applications, such as real-time decision making, which rely on analyzing geographically distributed data streams, a data center approach might not be sufficient. Aggregating data streams incurs high overheads in terms of latency and bandwidth consumption, in addition to the overhead of sending the analysis outcomes back to where an action needs to be taken.

In this thesis, we propose a new stream processing architecture for efficiently analyzing geographically distributed data streams. Our approach utilizes emerging distributed virtualized environments, such as Mobile Edge Computing, to extend stream processing systems outside the data center in order to push critical parts of the analysis closer to the data sources. This will enable real-time applications to respond faster to geographically distributed events. We implement the architecture as a plug-in extension for the Apache Storm stream processing framework.
Identifiers
URN: urn:nbn:se:kth:diva-183378
OAI: oai:DiVA.org:kth-183378
DiVA: diva2:910273
Al-Shishtawy, Ahmad
Peiro Sajjad, Hooman