On the performance of the Spotify backend
KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre. (Kommunikationsnät, Communication Networks) ORCID iD: 0000-0002-2680-9065
KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS. Spotify AB.
Spotify AB.
KTH, School of Electrical Engineering (EES), Centres, ACCESS Linnaeus Centre. (Kommunikationsnät, Communication Networks)
2013 (English). In: Journal of Network and Systems Management, ISSN 1064-7570, E-ISSN 1573-7705. Article in journal (Refereed). Published
Abstract [en]

We model and evaluate the performance of a distributed key-value storage system that is part of the Spotify backend. Spotify is an on-demand music streaming service, offering low-latency access to a library of over 20 million tracks and currently serving over 20 million users. We first present a simplified model of the Spotify storage architecture, in order to make its analysis feasible. We then introduce an analytical model for the distribution of the response time, a key metric in the Spotify service. We parameterize and validate the model using measurements from two different testbed configurations and from the operational Spotify infrastructure. We find that the model is accurate within the range of normal load patterns: measurements are within 11% of predictions. In addition, we model the capacity of the Spotify storage system under different object allocation policies and find that measurements on our testbed are within 9% of the model predictions. The model helps us justify the object allocation policy adopted for the Spotify storage system.

Place, publisher, year, edition, pages
Springer-Verlag New York, 2013.
Keyword [en]
Key-value store, distributed object store, object allocation policy, performance modeling, performance measurements, response times
National Category
Communication Systems; Computer Systems
URN: urn:nbn:se:kth:diva-129973; DOI: 10.1007/s10922-013-9292-2; ISI: 000350554700009; ScopusID: 2-s2.0-84921067530; OAI: diva2:653969


Available from: 2013-10-07. Created: 2013-10-07. Last updated: 2016-04-11. Bibliographically approved
In thesis
1. Data-driven Performance Prediction and Resource Allocation for Cloud Services
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Cloud services, which provide online entertainment, enterprise resource management, tax filing, etc., are becoming essential for consumers, businesses, and governments. The key functionalities of such services are provided by backend systems in data centers. This thesis focuses on three fundamental problems related to management of backend systems. We address these problems using data-driven approaches: triggering dynamic allocation by changes in the environment, obtaining configuration parameters from measurements, and learning from observations. 

The first problem relates to resource allocation for large clouds with potentially hundreds of thousands of machines and services. We developed and evaluated a generic gossip protocol for distributed resource allocation. Extensive simulation studies suggest that the quality of the allocation is independent of the system size for the management objectives considered.

The second problem focuses on performance modeling of a distributed key-value store; specifically, we study the Spotify backend for streaming music. We developed analytical models for system capacity under different data allocation policies and for response time distribution. We evaluated the models by comparing model predictions with measurements from our lab testbed and from the Spotify operational environment. We found the prediction error to be below 12% for all investigated scenarios.

The third problem relates to real-time prediction of service metrics, which we address through statistical learning. Service metrics are learned from observing device and network statistics. We performed experiments on a server cluster running video streaming and key-value store services. We showed that feature set reduction significantly improves the prediction accuracy, while simultaneously reducing model computation time. Finally, we designed and implemented a real-time analytics engine, which produces model predictions through online learning.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016. 53 p.
TRITA-EE, ISSN 1653-5146 ; 2016:020
National Category
Communication Systems; Computer Systems; Telecommunications; Computer Engineering; Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
urn:nbn:se:kth:diva-184601 (URN); 978-91-7595-876-7 (ISBN)
Public defence
2016-05-03, F3, Lindstedtsvägen 26, KTH Campus, Stockholm, 14:00 (English)
VINNOVA, 2013-03895

QC 20160411

Available from: 2016-04-11. Created: 2016-04-01. Last updated: 2016-05-30. Bibliographically approved

By author/editor
Yanggratoke, Rerngvit; Kreitz, Gunnar; Stadler, Rolf; Fodor, Viktoria