Change search
Link to record
Permanent link

Direct link
BETA
Publications (7 of 7) Show all publications
Combes, R., Magureanu, S. & Proutiere, A. (2018). Generic Asymptotically Optimal Algorithms for Multi-Armed Bandits. In: 2018 56TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON): . Paper presented at 56th Annual Allerton Conference on Communication, Control, and Computing llerton), OCT 02-05, 2018, Monticello, IL (pp. 235-241). IEEE
Open this publication in new window or tab >>Generic Asymptotically Optimal Algorithms for Multi-Armed Bandits
2018 (English)In: 2018 56TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), IEEE , 2018, p. 235-241Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2018
Series
Annual Allerton Conference on Communication Control and Computing, ISSN 2474-0195
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-248119 (URN)000461021200021 ()978-1-5386-6596-1 (ISBN)
Conference
56th Annual Allerton Conference on Communication, Control, and Computing llerton), OCT 02-05, 2018, Monticello, IL
Note

QC 20190412

Available from: 2019-04-12 Created: 2019-04-12 Last updated: 2019-09-25Bibliographically approved
Magureanu, S., Isaksson, M., Proutiere, A. & Zhang, B. (2018). Online learning of optimally diverse rankings. In: SIGMETRICS 2018 - Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems: . Paper presented at 2018 ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2018, Beckman Center, Irvine, United States, 18 June 2018 through 22 June 2018 (pp. 47-49). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>Online learning of optimally diverse rankings
2018 (English)In: SIGMETRICS 2018 - Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems, Association for Computing Machinery (ACM), 2018, p. 47-49Conference paper, Published paper (Refereed)
Abstract [en]

Search engines answer users’ queries by listing relevant items (e.g. documents, songs, products, web pages, ...). These engines rely on algorithms that learn to rank items so as to present an ordered list maximizing the probability that it contains relevant item. The main challenge in the design of learning-to-rank algorithms stems from the fact that queries often have different meanings for different users. In absence of any contextual information about the query, one often has to adhere to the diversity principle, i.e., to return a list covering the various possible topics or meanings of the query. To formalize this learning-to-rank problem, we propose a natural model where (i) items are categorized into topics, (ii) users find items relevant only if they match the topic of their query, and (iii) the engine is not aware of the topic of an arriving query, nor of the frequency at which queries related to various topics arrive, nor of the topic-dependent click-through-rates of the items. For this problem, we devise LDR (Learning Diverse Rankings), an algorithm that efficiently learns the optimal list based on users’ feedback only. We show that after T queries, the regret of LDR scales as O((N - L) log(T)) where N is the number of all items. This scaling cannot be improved, i.e., LDR is order optimal.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2018
Keywords
Diversity, Learning to rank, Multi-armed bandits, Online learning
National Category
Other Computer and Information Science
Identifiers
urn:nbn:se:kth:diva-234117 (URN)10.1145/3219617.3219637 (DOI)2-s2.0-85052016103 (Scopus ID)9781450358460 (ISBN)
Conference
2018 ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2018, Beckman Center, Irvine, United States, 18 June 2018 through 22 June 2018
Note

QC 20180903

Available from: 2018-09-03 Created: 2018-09-03 Last updated: 2018-09-03Bibliographically approved
Magureanu, S. (2016). Structured Stochastic Bandits. (Licentiate dissertation). Stockholm: KTH Royal Institute of Technology
Open this publication in new window or tab >>Structured Stochastic Bandits
2016 (English)Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. Particularly, we investigate the case when the expected rewards are a Lipschitz function of the arm, and the learning to rank problem, as viewed from a MAB perspective. For the former, we derive a problem specific lower bound and propose both an asymptotically optimal algorithm (OSLB) and a (pareto)optimal, algorithm (POSLB). For the latter, we construct the regret lower bound and determine its closed form for some particular settings, as well as propose two asymptotically optimal algorithms PIE and PIE-C. For all algorithms mentioned above, we present performance analysis in the form of theoretical regret guarantees as well as numerical evaluation on artificial datasets as well as real-world datasets, in the case of PIE and PIE-C.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016. p. vii, 94
Series
TRITA-EE, ISSN 1653-5146 ; 2016:021
Keywords
Multi-armed bandits, Learning to rank, reinforcement learning, Lipschitz Bandits
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-182816 (URN)978-91-7595-880-4 (ISBN)
Presentation
2016-03-15, Q2, Osquldasväg 10, KTH, Stockholm, 10:00 (English)
Opponent
Supervisors
Funder
EU, European Research Council
Note

QC 20160223

Available from: 2016-02-23 Created: 2016-02-23 Last updated: 2016-02-23Bibliographically approved
Combes, R., Magureanu, S., Proutière, A. & Laroche, C. (2015). Learning to rank: Regret lower bounds and efficient algorithms. Performance Evaluation Review, 43(1), 231-244
Open this publication in new window or tab >>Learning to rank: Regret lower bounds and efficient algorithms
2015 (English)In: Performance Evaluation Review, ISSN 0163-5999, E-ISSN 1557-9484, Vol. 43, no 1, p. 231-244Article in journal (Refereed) Published
Abstract [en]

Algorithms for learning to rank Web documents, display ads, or other types of items constitute a fundamental component of search engines and more generally of online services. In such systems, when a user makes a request or visits a web page, an ordered list of items (e.g. documents or ads) is displayed; the user scans this list in order, and clicks on the first relevant item if any. When the user clicks on an item, the reward collected by the system typically decreases with the position of the item in the displayed list. The main challenge in the design of sequential list selection algorithms stems from the fact that the probabilities with which the user clicks on the various items are unknown and need to be learned. We formulate the design of such algorithms as a stochastic bandit optimization problem. This problem differs from the classical bandit framework: (1) the type of feedback received by the system depends on the actual relevance of the various items in the displayed list (if the user clicks on the last item, we know that none of the previous items in the list are relevant); (2) there are inherent correlations between the average relevance of the items (e.g. the user may be interested in a specific topic only). We assume that items are categorized according to their topic and that users are clustered, so that users of the same cluster are interested in the same topic. We investigate several scenarios depending on the available sideinformation on the user before selecting the displayed list: (a) we first treat the case where the topic the user is interested in is known when she places a request; (b) we then study the case where the user cluster is known but the mapping between user clusters and topics is unknown. For both scenarios, we derive regret lower bounds and devise algorithms that approach these fundamental limits.

Place, publisher, year, edition, pages
ACM Press, 2015
Keywords
Ad-display optimization, Learning, Multi-armed bandits, Search engines
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-187384 (URN)10.1145/2745844.2745852 (DOI)2-s2.0-84955606870 (Scopus ID)
Note

QC 20160527

Available from: 2016-05-27 Created: 2016-05-23 Last updated: 2018-01-10Bibliographically approved
Magureanu, S., Combes, R. & Proutiere, A. (2014). Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms. In: : . Paper presented at COLT,Barcelona, Spain, June 13-15, 2014 (pp. 975-999). , 35
Open this publication in new window or tab >>Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms
2014 (English)Conference paper, Published paper (Refereed)
Abstract [en]

We consider stochastic multi-armed bandit problems where the expected reward is a Lipschitzfunction of the arm, and where the set of arms is either discrete or continuous. For discrete Lipschitzbandits, we derive asymptotic problem specific lower bounds for the regret satisfied by anyalgorithm, and propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitzstructure of the problem. In fact, we prove that OSLB is asymptotically optimal, as its asymptoticregret matches the lower bound. The regret analysis of our algorithms relies on a new concentrationinequality for weighted sums of KL divergences between the empirical distributions of rewards andtheir true distributions. For continuous Lipschitz bandits, we propose to first discretize the actionspace, and then apply OSLB or CKL-UCB, algorithms that provably exploit the structure efficiently.This approach is shown, through numerical experiments, to significantly outperform existing algorithmsthat directly deal with the continuous set of arms. Finally the results and algorithms areextended to contextual bandits with similarities.

Series
Journal of Machine Learning Research, ISSN 1532-4435
Keywords
Multi-armed Bandits
National Category
Probability Theory and Statistics Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-146876 (URN)2-s2.0-84939604123 (Scopus ID)
Conference
COLT,Barcelona, Spain, June 13-15, 2014
Note

QC 20140826

Available from: 2014-08-13 Created: 2014-06-17 Last updated: 2018-01-11Bibliographically approved
Magureanu, S., Dokoohaki, N., Mokarizadeh, S. & Matskin, M. (2012). Design and Analysis of a Gossip-based Decentralized Trust Recommender System. In: 4th ACM Recommender Systems (RecSys) Workshop on Recommender Systems & the Social Web: . Paper presented at Recommender Systems and the Social Web (RSWEB'2012).
Open this publication in new window or tab >>Design and Analysis of a Gossip-based Decentralized Trust Recommender System
2012 (English)In: 4th ACM Recommender Systems (RecSys) Workshop on Recommender Systems & the Social Web, 2012Conference paper, Oral presentation only (Refereed)
Abstract [en]

Information overload has become an increasingly common problem in today’s large scale internet applications. Collaborative filtering(CF) recommendation systems have emerged   as a popular solution to this problem by taking advantage of underlying social networks. Traditional CF recommenders suffer from lack of scalability[18] while decentralized recommendation systems (DHT-based, Gossip-based etc.) have promised to alleviate this problem. Thus, in this paper we propose a decentralized approach to CF recommender sys  tems that takes advantage of the popular P2P T-Man algorithm to create and maintain an overlay network capable of generating predictions based on only local information. We       analyze our approaches performance in terms of prediction accuracy and item-coverage function of neighborhood size as well as number of T-Man rounds. We show our system       achieves better accuracy than previous approaches while implementing a highly scalable, decentralized paradigm. We also show our system is able to generate predictions for a       large fraction of users, which is comparable with the centralized approaches.

Keywords
Trust, Decentralized, Gossip, Recommender Systems
National Category
Information Systems
Identifiers
urn:nbn:se:kth:diva-118489 (URN)
Conference
Recommender Systems and the Social Web (RSWEB'2012)
Note

QC 20130220

Available from: 2013-02-19 Created: 2013-02-19 Last updated: 2018-01-11Bibliographically approved
Magureanu, S., Dokoohaki, N., Mokarizadeh, S. & Matskin, M. (2012). Epidemic trust-based recommender systems. In: Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012: . Paper presented at 2012 ASE/IEEE International Conference on Social Computing, SocialCom 2012 and the 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT 2012; Amsterdam; 3 September 2012 through 5 September 2012 (pp. 461-470). IEEE
Open this publication in new window or tab >>Epidemic trust-based recommender systems
2012 (English)In: Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012, IEEE , 2012, p. 461-470Conference paper, Published paper (Refereed)
Abstract [en]

Collaborative filtering(CF) recommender systems are among the most popular approaches to solving the information overload problem in social networks by generating accurate predictions based on the ratings of similar users. Traditional CF recommenders suffer from lack of scalability while decentralized CF recommenders (DHT-based, Gossip-based etc.) have promised to alleviate this problem. Thus, in this paper we propose a decentralized approach to CF recommender systems that uses the T-Man algorithm to create and maintain an overlay network that in turn would facilitate the generation of recommendations based on local information of a node. We analyse the influence of the number of rounds and neighbors on the accuracy of prediction and item coverage and we propose a new approach to inferring trust values between a user and its neighbors. Our experiment son two datasets show an improvement of prediction accuracy relative to previous approaches while using a highly scalable, decentralized paradigm. We also analyse item coverage and show that our system is able to generate predictions for significant fraction of the users, which is comparable with the centralized approaches.

Place, publisher, year, edition, pages
IEEE, 2012
Keywords
Accuracy, Equations, Measurement, Peer to peer computing, Prediction algorithms, Recommender systems, Social network services, collaborative filtering, recommender systems, social networking (online), trusted computing, T-Man algorithm, collaborative filtering recommender systems, decentralized CF recommenders system, decentralized paradigm, epidemic trust-based recommender systems, prediction accuracy, social networks, trust values
National Category
Information Systems
Identifiers
urn:nbn:se:kth:diva-118468 (URN)10.1109/SocialCom-PASSAT.2012.94 (DOI)2-s2.0-84873681802 (Scopus ID)978-076954848-7 (ISBN)
Conference
2012 ASE/IEEE International Conference on Social Computing, SocialCom 2012 and the 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT 2012; Amsterdam; 3 September 2012 through 5 September 2012
Note

QC 20130226

Available from: 2013-02-19 Created: 2013-02-19 Last updated: 2018-01-11Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0003-2988-8464

Search in DiVA

Show all publications