Cloud-native RStudio on Kubernetes for Hopsworks
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Hopsworks.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0001-7236-4637
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Hopsworks. ORCID iD: 0000-0002-1672-6899
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-9484-6714
2023 (English). Manuscript (preprint) (Other academic)
Abstract [en]

To fully benefit from cloud computing, services are designed following the “multi-tenant” architectural model, which aims to maximize resource sharing among users. However, multi-tenancy introduces challenges in security, performance isolation, scaling, and customization. RStudio Server is an open-source Integrated Development Environment (IDE) for the R programming language that is accessed through a web browser. We present the design and implementation of a multi-user distributed system on Hopsworks, a data-intensive AI platform, that follows the multi-tenant model and provides RStudio as Software as a Service (SaaS). We use the most popular cloud-native technologies, Docker and Kubernetes, to solve the problems of performance isolation, security, and scaling that arise in a multi-tenant environment. We further enable secure data sharing between RStudio Server instances to provide data privacy and allow collaboration among RStudio users. We integrate our system with Apache Spark, which can scale to handle Big Data processing workloads. We also provide a UI through which users can supply custom configurations and fully control their own RStudio Server instances. Our system was tested on a Google Cloud Platform cluster with four worker nodes, each allocated 30 GB of RAM. The tests on this cluster showed that 44 RStudio Server instances, each with 2 GB of RAM, can run concurrently. Our system can scale out to potentially support hundreds of concurrently running RStudio Server instances when more resources (CPUs and RAM) are added to the cluster or system.
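
The capacity figures in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the `Cluster` class and both helper functions are hypothetical names that do not come from the manuscript, and the upper bound it computes is a naive ceiling that ignores CPU limits, system components, and any other memory consumers on the nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of the test cluster from the abstract."""
    worker_nodes: int
    ram_per_node_gb: int

    @property
    def total_ram_gb(self) -> int:
        # Total RAM across all worker nodes.
        return self.worker_nodes * self.ram_per_node_gb


def ram_committed_gb(servers: int, ram_per_server_gb: int) -> int:
    # RAM claimed by the RStudio Server instances alone.
    return servers * ram_per_server_gb


def upper_bound_servers(cluster: Cluster, ram_per_server_gb: int) -> int:
    # Naive ceiling: pack each node with as many instances as its RAM
    # allows. This is an assumption for illustration, not the
    # manuscript's scheduling model.
    per_node = cluster.ram_per_node_gb // ram_per_server_gb
    return cluster.worker_nodes * per_node


# Figures taken from the abstract: 4 worker nodes with 30 GB each,
# 44 concurrent instances with 2 GB each.
cluster = Cluster(worker_nodes=4, ram_per_node_gb=30)
committed = ram_committed_gb(servers=44, ram_per_server_gb=2)

print(cluster.total_ram_gb)             # 120
print(committed)                        # 88
print(upper_bound_servers(cluster, 2))  # 60
```

Under these assumptions, the 44 observed instances claim 88 GB of the 120 GB total, below the RAM-only ceiling of 60 instances. The abstract does not say what accounts for the difference.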

Place, publisher, year, edition, pages
2023.
Keywords [en]
Multi-tenancy, Cloud-native, Performance Isolation, Security, Scaling, Docker, Kubernetes, SaaS, RStudio, Hopsworks
National Category
Computer Sciences; Software Engineering
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-336693
DOI: 10.48550/arXiv.2307.09132
OAI: oai:DiVA.org:kth-336693
DiVA, id: diva2:1798047
Note

QC 20230918

Available from: 2023-09-18. Created: 2023-09-18. Last updated: 2023-09-18. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Full text in arXiv

Authority records

Chikafa, Gibson; Sheikholeslami, Sina; Niazi, Salman; Dowling, Jim; Vlassov, Vladimir
