IMITA: Imitation Learning for Generalizing Cloud Orchestration
2021 (English)
In: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2021) / [ed] Lefevre, L.; Patterson, S.; Lee, Y. C.; Shen, H.; Ilager, S.; Goudarzi, M.; Toosi, A. N.; Buyya, R., Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 237-246
Conference paper, Published paper (Refereed)
Abstract [en]
Operating large-scale, feature-rich applications is becoming increasingly complex: engineers must deploy highly configurable software releases on distributed cloud stacks while managing ever-shorter production cycles. Although recent proposals attempt to streamline cloud resource orchestration, making such solutions generalize to unseen cloud stacks remains a significant challenge. In other words, application-specific Key Performance Indicators (KPIs) and resource configurations crafted for specific stacks may behave differently on heterogeneous deployments, requiring time-consuming policy adjustments. We introduce IMITA, a system that leverages imitation learning to create models that imitate expert behavior and generalize seamlessly to new cloud stacks. To build a generalized model, IMITA maps expert actions, taken based on the application KPI space, to the space of resource utilization metrics that are universally available in cloud platforms. This mapping enables the model to trigger actions that mimic expert behavior when similar resource utilization footprints occur across deployments. We demonstrate IMITA by learning to scale out Cassandra deployments with diverse configurations and workloads. Our results show that IMITA can replicate expert actions across deployments and extrapolate to unseen environments, producing 50-94% fewer false-positive actions than traditional threshold-based policies while still adhering to Service-Level Objectives (SLOs) and avoiding under-provisioning of resources. Moreover, since collecting data in clouds is costly, IMITA gathers data only for representative configurations to train the imitator model. This approach reduces the size of the collected data to 50%.
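The mapping idea in the abstract can be illustrated with a small sketch. This is not IMITA's implementation: the synthetic latency model, the SLO values, the helper names (`simulate`, `fit_stump`, `imitate`, `accuracy`) and the choice of a one-rule decision stump as the imitator are all assumptions made for illustration. A hypothetical expert decides to scale out from an application KPI (latency versus an SLO), while the imitator only sees resource utilization (CPU, memory) and is then checked on a second synthetic deployment whose KPI scale differs.

```python
import random

def simulate(n, seed, base_ms, slope_ms, slo_ms):
    """Synthetic deployment (assumed): the latency KPI rises with CPU utilization.

    Returns (features, expert_action) pairs, where features = (cpu, mem) and the
    hypothetical expert scales out (1) whenever the latency KPI exceeds the SLO.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        cpu, mem = rng.random(), rng.random()
        latency = base_ms + slope_ms * cpu ** 2 + rng.gauss(0.0, 1.0)
        data.append(((cpu, mem), 1 if latency > slo_ms else 0))
    return data

def fit_stump(data):
    """Learn the (feature index, threshold) rule that best matches expert labels."""
    best = None
    for i in range(len(data[0][0])):
        for threshold in sorted(f[i] for f, _ in data):
            errors = sum((f[i] > threshold) != bool(a) for f, a in data)
            if best is None or errors < best[0]:
                best = (errors, i, threshold)
    return best[1], best[2]

def imitate(model, features):
    """Predict the expert action from resource utilization metrics only."""
    i, threshold = model
    return 1 if features[i] > threshold else 0

def accuracy(model, data):
    """Fraction of samples where the imitator agrees with the expert."""
    return sum(imitate(model, f) == a for f, a in data) / len(data)

# Training deployment: the expert reacts to a 100 ms latency SLO.
train = simulate(400, seed=1, base_ms=40.0, slope_ms=100.0, slo_ms=100.0)
# Unseen deployment: different KPI scale (50 ms SLO), similar resource footprint.
unseen = simulate(400, seed=2, base_ms=20.0, slope_ms=50.0, slo_ms=50.0)

model = fit_stump(train)
print("learned rule: feature", model[0], "threshold", round(model[1], 3))
print("agreement with expert on unseen deployment:", accuracy(model, unseen))
```

In this toy setup the KPI thresholds of the two deployments differ, yet the expert's decisions coincide in resource-utilization space, which is why a model trained on utilization metrics can transfer. Real deployments would need richer features and a stronger learner than a single-threshold rule.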
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021. p. 237-246
Keywords [en]
Cloud Orchestration, Imitation Learning, Generalization
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-304178
DOI: 10.1109/CCGrid51090.2021.00033
ISI: 000703983200024
Scopus ID: 2-s2.0-85114885031
OAI: oai:DiVA.org:kth-304178
DiVA, id: diva2:1609045
Conference
21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), May 10-13, 2021, Electronic Network
Note
Part of proceedings: ISBN 978-1-7281-9586-5, QC 20230117
2021-11-05, 2021-11-05, 2023-01-17; Bibliographically approved