Constructing decision trees for user behavior prediction in the online consumer market
KTH, School of Computer Science and Communication (CSC).
KTH, School of Computer Science and Communication (CSC).
2016 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

This thesis intends to investigate the usefulness of various aspects of product data for user behavior prediction in the online shopping market. Specifically, a data set from BestBuy was used, containing information regarding what product a user clicked on given their search query.

Decision trees are machine learning algorithms used for making predictions. The decision tree algorithm ID3 was used because of its simplicity and interpretability. It uses information gain to measure how well each attribute helps the tree split the data set into smaller subsets. The approach was to build one decision tree for each product in the data set and to analyze the distribution of the attributes' maximum information gains in the root splits across the various trees. For each of these splits, every possible pivot value (the value on which the split is made) was tried, and the pivot values were recorded to analyze which of them resulted in the most gain.
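The root-split search described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes numeric attributes split by a threshold test (value <= pivot) and a binary label stating whether the product was clicked. The attribute names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, pivot):
    """Entropy reduction from splitting into value <= pivot and value > pivot."""
    left = [y for x, y in zip(values, labels) if x <= pivot]
    right = [y for x, y in zip(values, labels) if x > pivot]
    total = len(labels)
    remainder = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - remainder


def best_pivot(values, labels):
    """Try every distinct attribute value as a pivot; return (pivot, gain)."""
    scored = ((p, information_gain(values, labels, p))
              for p in sorted(set(values)))
    return max(scored, key=lambda pair: pair[1])


def best_root_split(attributes, labels):
    """Return (attribute, pivot, gain) for the attribute with the highest gain.

    `attributes` maps an attribute name to its list of numeric values,
    aligned with `labels`.
    """
    candidates = []
    for name, values in attributes.items():
        pivot, gain = best_pivot(values, labels)
        candidates.append((name, pivot, gain))
    return max(candidates, key=lambda c: c[2])


# Hypothetical toy data for a single product: one entry per query.
attributes = {
    "title_match": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "days_since_review": [5, 40, 12, 30, 8, 25],
}
clicked = [1, 1, 1, 0, 0, 0]  # 1 if the user clicked this product

root = best_root_split(attributes, clicked)
print(root)
```

Running the same search once per product and tallying the winning attribute and pivot would give the kind of distribution the abstract analyzes. In this toy data the hypothetical title_match attribute separates clicks from non-clicks perfectly, so it wins the root split.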

The results show that the two most important aspects are how well the query string matches the product title and how well it matches the product description, followed by the product's novelty. The number of days between the query and the two most recent reviews written before it proved a decent way to identify trends.

The paper also presents how the attributes were used by analyzing the pivot value distributions, with the conclusion that many attributes were used in similar ways for most products, suggesting it might be possible to create a universal tree applicable to all products.

Regarding the usefulness of decision trees, it was found that they are not very efficient for highly volatile databases, such as those found in the online shopping market. The notion of a universal tree, however, suggests that future work might investigate whether their efficiency could be improved using this more flexible approach.

National Category
Computer Science
URN: urn:nbn:se:kth:diva-186497
OAI: diva2:927446
Available from: 2016-05-18
Created: 2016-05-12
Last updated: 2016-05-18
Bibliographically approved
