Big Data Workflows: Locality-Aware Orchestration Using Software Containers
Univ Oslo, Dept Informat, N-0373 Oslo, Norway.
SINTEF AS, Software & Serv Innovat, N-0373 Oslo, Norway.
Norwegian Univ Sci & Technol, Dept Comp Sci, N-2815 Gjovik, Norway.
OsloMet Oslo Metropolitan Univ, Dept Comp Sci, N-0166 Oslo, Norway. ORCID iD: 0000-0001-6034-4137
2021 (English). In: Sensors, E-ISSN 1424-8220, Vol. 21, no 24, article id 8212. Article in journal (Refereed), Published
Abstract [en]

The emergence of the edge computing paradigm has shifted data processing from centralised infrastructures to heterogeneous and geographically distributed infrastructures. Therefore, data processing solutions must consider data locality to reduce the performance penalties from data transfers among remote data centres. Existing big data processing solutions provide limited support for handling data locality and are inefficient in processing small and frequent events specific to the edge environments. This article proposes a novel architecture and a proof-of-concept implementation for software container-centric big data workflow orchestration that puts data locality at the forefront. The proposed solution considers the available data locality information, leverages long-lived containers to execute workflow steps, and handles the interaction with different data sources through containers. We compare the proposed solution with Argo workflows and demonstrate a significant performance improvement in the execution speed for processing the same data units. Finally, we carry out experiments with the proposed solution under different configurations and analyze individual aspects affecting the performance of the overall solution.

Place, publisher, year, edition, pages
MDPI AG, 2021. Vol. 21, no 24, article id 8212
Keywords [en]
big data workflows, orchestration, data locality, software containers
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-311310
DOI: 10.3390/s21248212
ISI: 000778247100006
PubMedID: 34960302
Scopus ID: 2-s2.0-85120809412
OAI: oai:DiVA.org:kth-311310
DiVA, id: diva2:1655968
Note

QC 20220504

Available from: 2022-05-04 Created: 2022-05-04 Last updated: 2022-06-25 Bibliographically approved


Authority records

Matskin, Mihhail; Payberah, Amir H.


By author/editor
Soylu, Ahmet; Matskin, Mihhail; Payberah, Amir H.
By organisation
Software and Computer systems, SCS