Encoding Sequential Structures using Kernels.
KTH, Skolan för datavetenskap och kommunikation (CSC).
2012 (English). Independent thesis, advanced level (professional degree), 20 credits / 30 HE credits. Student thesis (degree project).
Abstract [en]

Andrea Baisero

Sequential data-types represent a natural model for information in many fields, such as Time-Series Analysis and Computational Biology. Having a very dynamic nature, sequential data still represents a challenge to modern learning methods, which struggle to fully integrate it into their mechanisms. Kernel Methods offer a practical and accessible framework for the integration of structured data into well-established Machine Learning algorithms, with the only requirement being the definition of an adequate kernel function which reflects the task at hand.
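As an illustration of the point that a learning algorithm only needs a kernel function, the following minimal sketch trains a kernel perceptron on strings. It is not taken from the thesis: the perceptron, the toy data, and the names `char_count_kernel`, `train_kernel_perceptron` and `predict` are assumptions chosen for the example. The learner never inspects the strings directly; it only calls the kernel.

```python
from collections import Counter
from typing import Callable, List, Sequence

Kernel = Callable[[str, str], float]


def char_count_kernel(s: str, t: str) -> float:
    """Inner product of character-count vectors (illustrative kernel only)."""
    cs, ct = Counter(s), Counter(t)
    return float(sum(count * ct[ch] for ch, count in cs.items()))


def decision_score(x: str, xs: Sequence[str], ys: Sequence[int],
                   alphas: Sequence[int], kernel: Kernel) -> float:
    """Weighted sum of kernel evaluations against the training examples."""
    return sum(a * y * kernel(xi, x) for a, xi, y in zip(alphas, xs, ys))


def train_kernel_perceptron(xs: Sequence[str], ys: Sequence[int],
                            kernel: Kernel, epochs: int = 10) -> List[int]:
    """Return one mistake counter per training example (dual coefficients)."""
    alphas = [0] * len(xs)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(xs, ys)):
            if y * decision_score(x, xs, ys, alphas, kernel) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return alphas


def predict(x: str, xs: Sequence[str], ys: Sequence[int],
            alphas: Sequence[int], kernel: Kernel) -> int:
    """Classify x as +1 or -1 from the sign of the decision score."""
    return 1 if decision_score(x, xs, ys, alphas, kernel) > 0 else -1


# Hypothetical toy data: strings dominated by 'a' (+1) or by 'b' (-1).
train_x = ["aaab", "aaba", "abaa", "bbba", "bbab", "babb"]
train_y = [1, 1, 1, -1, -1, -1]
alphas = train_kernel_perceptron(train_x, train_y, char_count_kernel)
label_a = predict("aaaa", train_x, train_y, alphas, char_count_kernel)
label_b = predict("bbbb", train_x, train_y, alphas, char_count_kernel)
```

Swapping `char_count_kernel` for any other valid kernel on sequences changes the behaviour of the classifier without touching the training code.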

Mainstream kernel functions for strings, which are a particular instance of sequences, are mostly based on explicitly constructed feature vectors that describe the input by its relationship to a set of references, usually n-grams. More recently, new efforts have been made to develop more sophisticated and dedicated solutions: the current state-of-the-art uses paths to access alignment-based matchings between sequences.
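The n-gram approach described above can be sketched as follows. This is a generic n-gram count kernel of the kind commonly called a spectrum kernel, not one of the kernels proposed in the thesis; the function names and the default `n = 2` are assumptions made for the example. Each string is mapped to a vector of n-gram counts, and the kernel is the inner product of two such vectors.

```python
from collections import Counter
from math import sqrt


def ngram_counts(s: str, n: int) -> Counter:
    """Count every contiguous substring of length n in s."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def spectrum_kernel(s: str, t: str, n: int = 2) -> float:
    """Inner product of the n-gram count vectors of s and t."""
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    if len(cs) > len(ct):  # iterate over the smaller feature map
        cs, ct = ct, cs
    return float(sum(count * ct[gram] for gram, count in cs.items()))


def normalized_spectrum_kernel(s: str, t: str, n: int = 2) -> float:
    """Cosine-normalised variant, so that k(s, s) = 1 for non-trivial s."""
    denom = sqrt(spectrum_kernel(s, s, n) * spectrum_kernel(t, t, n))
    return spectrum_kernel(s, t, n) / denom if denom > 0 else 0.0


# "abab" has bigrams {ab: 2, ba: 1}; "baba" has {ba: 2, ab: 1}.
k_shared = spectrum_kernel("abab", "baba")    # 2*1 + 1*2 = 4.0
k_disjoint = spectrum_kernel("abc", "xyz")    # no shared bigrams: 0.0
k_self = normalized_spectrum_kernel("abab", "abab")  # 1.0
```

Because the feature vector only records which n-grams occur and how often, two strings with the same n-gram counts are indistinguishable to this kernel regardless of where the n-grams appear.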

This Master's project aims to broaden the limited variety of existing kernels for sequences. Inspired by modern methods, and taking into account the issues they suffer from, the project has resulted in the development of three novel kernel functions.

A set of experiments provides a qualitative evaluation of the proposed methods. For this purpose, univariate artificial data sequences are generated with various levels of noise to simulate real-world scenarios. The experiments cover classification, evaluated through cross-validation, and regression, along with visual representations of the feature space induced by the kernels. The results show substantial improvements in classification, generalisation and robustness to various levels of noise.

Place, publisher, year, edition, pages
2012.
Series
Trita-CSC-E, ISSN 1653-5715 ; 2012:094
National subject category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-130915
OAI: oai:DiVA.org:kth-130915
DiVA, id: diva2:654361
Educational program
Degree of Master of Science in Engineering - Information Technology
Uppsök
Technology
Available from: 2013-10-07 Created: 2013-10-07 Last updated: 2022-06-23

Open Access in DiVA

No full text in DiVA

Other links

http://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2012/rapporter12/baisero_andrea_12094.pdf