Block-distributed Gradient Boosted Trees
RISE AI. ORCID iD: 0000-0002-8180-7521
Amazon Web Services.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
2019 (English). In: SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery (ACM), 2019, p. 1025-1028. Conference paper, Published paper (Refereed).
Abstract [en]

The Gradient Boosted Tree (GBT) algorithm is one of the most popular machine learning algorithms used in production, for tasks that include Click-Through Rate (CTR) prediction and learning-to-rank. To deal with the massive datasets available today, many distributed GBT methods have been proposed. However, they all assume a row-distributed dataset, addressing scalability only with respect to the number of data points and not the number of features, and increasing communication cost for high-dimensional data. In order to allow for scalability across both the data point and feature dimensions, and reduce communication cost, we propose block-distributed GBTs. We achieve communication efficiency by making full use of the data sparsity and adapting the Quickscorer algorithm to the block-distributed setting. We evaluate our approach using datasets with millions of features, and demonstrate that we are able to achieve multiple orders of magnitude reduction in communication cost for sparse data, with no loss in accuracy, while providing a more scalable design. As a result, we are able to reduce the training time for high-dimensional data, and allow more cost-effective scale-out without the need for expensive network communication.
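As a concrete illustration of the approach the abstract describes, below is a minimal Python sketch of block-distributed gradient histogram construction. It is an assumption-laden toy, not the authors' implementation: the equal-width binning, the 2x2 block grid, and the block_histograms helper are all invented for illustration. The data matrix is partitioned into blocks along both the row and feature dimensions, each block's worker accumulates gradient histograms from only its non-zero entries, and partial histograms are summed across row-blocks to recover the per-feature histograms a split finder would consume:

    import numpy as np
    from scipy.sparse import random as sparse_random

    def block_histograms(X_block, grad_block, n_bins=16):
        # One gradient histogram per local feature. Only the non-zero
        # entries of the sparse block are touched, so the work (and the
        # data later exchanged) scales with the non-zeros, not rows x features.
        # Real GBT systems also account for the zeros via a "default
        # direction" at each split; that is omitted here for brevity.
        hist = np.zeros((X_block.shape[1], n_bins))
        coo = X_block.tocoo()
        for i, j, v in zip(coo.row, coo.col, coo.data):
            b = min(int(v * n_bins), n_bins - 1)  # toy equal-width bins over [0, 1)
            hist[j, b] += grad_block[i]
        return hist

    rng = np.random.default_rng(0)
    X = sparse_random(8, 6, density=0.3, format="csr", random_state=0)  # toy sparse data
    grad = rng.normal(size=8)  # per-row gradients from the current boosting round

    # A 2x2 block grid: rows split across two workers, features across two.
    row_splits, col_splits = [(0, 4), (4, 8)], [(0, 3), (3, 6)]
    partials = {
        (r, c): block_histograms(X[r0:r1, c0:c1], grad[r0:r1])
        for r, (r0, r1) in enumerate(row_splits)
        for c, (c0, c1) in enumerate(col_splits)
    }

    # Summing partials down each column of blocks yields the full
    # per-feature histograms without any worker ever holding dense data.
    full = np.vstack([
        sum(partials[(r, c)] for r in range(len(row_splits)))
        for c in range(len(col_splits))
    ])
    print(full.shape)  # (6, 16): one histogram row per feature

Because each partial histogram reflects only the non-zeros its block contains, the volume exchanged during the merge shrinks with the sparsity of the input, which is where the communication savings reported in the abstract come from.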

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2019. p. 1025-1028
Keywords [en]
Gradient Boosted Trees, Distributed Systems, Communication Efficiency, Scalability
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-249994
DOI: 10.1145/3331184.3331331
ISI: 000501488900130
Scopus ID: 2-s2.0-85073774997
ISBN: 978-1-4503-6172-9 (print)
OAI: oai:DiVA.org:kth-249994
DiVA id: diva2:1306830
Conference
42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019; Museum of Science and Industry, Paris; France; 21 July 2019 through 25 July 2019
Funder
Swedish Foundation for Strategic Research, BD15-0006
Note

QC 20190426

Available from: 2019-04-25. Created: 2019-04-25. Last updated: 2020-01-22. Bibliographically approved.
In thesis
1. Scalable Machine Learning through Approximation and Distributed Computing
2019 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Machine learning algorithms are now being deployed in practically all areas of our lives. Part of this success can be attributed to the ability to learn complex representations from massive datasets. However, computational speed increases have not kept up with the increase in the sizes of data we want to learn from, leading naturally to algorithms that need to be resource-efficient and parallel. As the proliferation of machine learning continues, the ability for algorithms to adapt to a changing environment and deal with uncertainty becomes increasingly important.

In this thesis we develop scalable machine learning algorithms, with a focus on efficient, online, and distributed computation. We make use of approximations to dramatically reduce the computational cost of exact algorithms, and develop online learning algorithms to deal with a constantly changing environment under a tight computational budget. We design parallel and distributed algorithms to ensure that our methods can scale to massive datasets.

We first propose a scalable algorithm for graph vertex similarity calculation and concept discovery. We demonstrate its applicability to multiple domains, including text, music, and images, and demonstrate its scalability by training on one of the largest text corpora available. Then, motivated by a real-world use case of predicting the session length in media streaming, we propose improvements to several aspects of learning with decision trees. We propose two algorithms to estimate the uncertainty in the predictions of online random forests. We show that our approach can achieve better accuracy than the state of the art while being an order of magnitude faster. We then propose a parallel and distributed online tree boosting algorithm that maintains the correctness guarantees of serial algorithms while providing an order of magnitude speedup on average. Finally, we propose an algorithm that allows for gradient boosted trees training to be distributed across both the data point and feature dimensions. We show that we can achieve communication savings of several orders of magnitude for sparse datasets, compared to existing approaches that can only distribute the computation across the data point dimension and use dense communication.
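To give a flavour of one of these components, the sketch below shows a common way to read an uncertainty estimate out of a random forest: the disagreement among the individual trees' predictions. This is a generic batch illustration using scikit-learn, not the thesis's online algorithms, and the synthetic data and forest settings are arbitrary choices made for the example.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))                        # synthetic features
    y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)  # noisy linear target

    forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

    # Query every tree individually; their spread is a rough per-point
    # uncertainty estimate alongside the usual averaged prediction.
    per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
    mean, std = per_tree.mean(axis=0), per_tree.std(axis=0)
    for m, s in zip(mean, std):
        print(f"prediction {m:+.3f} +/- {s:.3f}")

The online setting studied in the thesis adds the constraint that trees grow incrementally as data streams in, so any such estimate has to be maintained under a tight computational budget rather than recomputed from scratch.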

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2019. p. 120
Series
TRITA-EECS-AVL ; 2019:43
Keywords
Online Learning, Distributed Computing, Graph Similarity, Decision Trees, Gradient Boosting
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-250038
ISBN: 978-91-7873-181-7
Public defence
2019-05-28, Sal B, Kistagången 16, floor 2, Electrum 1, KTH Kista, Kista, 14:00 (English)
Funder
Swedish Foundation for Strategic Research, RIT10-0043; Swedish Foundation for Strategic Research, BD15-0006
Note

QC 20190426

Available from: 2019-04-26. Created: 2019-04-25. Last updated: 2020-01-22. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | Scopus

Authority records

Vasiloudis, Theodore; Boström, Henrik
