Cross Site Product Page Classification with Supervised Machine Learning
KTH, School of Computer Science and Communication (CSC).
2016 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Webbsideöverskridande klassificering av produktsidor med övervakad maskininlärning (Swedish)
Abstract [en]

This work outlines a possible technique for identifying web pages that contain product specifications. Using support vector machines, a product page classifier was constructed and tested with various settings. The final classifier achieved a precision of 0.958 and a recall of 0.796 on product pages. These scores suggest that the method could be a valid technique for real-world web classification tasks if additional features and more data were made available.
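The two reported scores can be related with a short sketch. The code below is an illustration only: the function names and the confusion-matrix counts are hypothetical and do not come from the thesis, and the F1 value is derived here from the reported precision and recall rather than stated in the abstract.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    tp: product pages correctly labelled as product pages
    fp: non-product pages wrongly labelled as product pages
    fn: product pages the classifier missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to show how the definitions work.
example_precision, example_recall = precision_recall(tp=80, fp=5, fn=20)

# Scores as reported in the abstract; the F1 value is derived, not reported.
reported_f1 = f1_score(0.958, 0.796)
print(f"derived F1: {reported_f1:.3f}")
```

Read this way, the high precision means that few pages flagged as product pages are wrong, while the lower recall means that roughly one product page in five is missed.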

Place, publisher, year, edition, pages
2016. 46 p.
Keyword [en]
SVM, support vector machine, product page classification
National Category
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-189555
OAI: oai:DiVA.org:kth-189555
DiVA: diva2:946837
External cooperation
Findwise Stockholm AB
Subject / course
Computer Technology, Program- and System Development
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
Available from: 2016-07-06
Created: 2016-07-06
Last updated: 2016-07-06
Bibliographically approved

Open Access in DiVA

fulltext (1204 kB), 85 downloads
File information
File name: FULLTEXT01.pdf
File size: 1204 kB
Checksum (SHA-512):
20ac5b21f4aebb49fd18a8a40083bbfeda5a447d653fcd76d7406539fcafc33e2b364e3687f051de94dcadfb1037ef4ce7e2e00284b246c231031f6163a316e4
Type: fulltext
Mimetype: application/pdf

Total: 85 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 14249 hits