Encoding Sequential Structures using Kernels.
KTH, School of Computer Science and Communication (CSC).
2012 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Andrea Baisero

Sequential data types are a natural model for information in many fields, such as Time-Series Analysis and Computational Biology. Because of its highly dynamic nature, sequential data remains a challenge for modern learning methods, which struggle to integrate it fully into their mechanisms. Kernel Methods offer a practical and accessible framework for integrating structured data into well-established Machine Learning algorithms; the only requirement is the definition of an adequate kernel function that reflects the task at hand.

Mainstream kernel functions for strings, which are a particular kind of sequence, are mostly based on explicitly constructed feature vectors that describe the input in relation to a set of references, usually n-grams. More recently, efforts have been made to develop more sophisticated and dedicated solutions: the current state-of-the-art uses paths to access alignment-based matchings between sequences.
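
As a purely illustrative sketch of the explicit n-gram feature approach described above (and not one of the kernels developed in the thesis), a simple n-gram count kernel could look as follows. The function names `ngram_counts` and `spectrum_kernel` and the default `n=2` are assumptions made for this example only.

```python
from collections import Counter


def ngram_counts(s, n):
    """Count every contiguous substring of length n in s."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def spectrum_kernel(s, t, n=2):
    """Inner product of the n-gram count vectors of s and t.

    Each string is described by how often each n-gram (the set of
    references) occurs in it; the kernel value is the dot product
    of those two explicit feature vectors.
    """
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    # Only n-grams present in both strings contribute to the sum.
    return sum(count * ct[gram] for gram, count in cs.items())
```

For instance, `spectrum_kernel("abab", "baba")` compares the count vectors {ab: 2, ba: 1} and {ba: 2, ab: 1} and returns 4, while two strings with no 2-gram in common yield 0.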

This Master's project aims to broaden the limited variety of existing kernels for sequences. Inspired by modern methods, and taking into account the issues they suffer from, the project has resulted in the development of three novel kernel functions.

A set of experiments is designed to evaluate the proposed methods qualitatively. For this purpose, univariate artificial data sequences are created with various levels of noise to simulate real-world scenarios. The experiments include classification tasks, evaluated through appropriate cross-validation, and regression tasks, along with visual representations of the feature space induced by the kernels. The results show substantial improvements in classification, generalisation and robustness to various levels of noise.

Place, publisher, year, edition, pages
Trita-CSC-E, ISSN 1653-5715 ; 2012:094
National Category
Computer Science
URN: urn:nbn:se:kth:diva-130915
OAI: diva2:654361
Educational program
Master of Science in Engineering - Information and Communication Technology
Available from: 2013-10-07 Created: 2013-10-07

Open Access in DiVA

No full text
