A comparative study of structured prediction methods for sequence labeling
KTH, School of Computer Science and Communication (CSC).
2016 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]

Some machine learning tasks have a complex output rather than a single real number or class label. Such outputs are composed of elements with interdependencies and structural properties. Methods that take the form of the output into account are known as structured prediction techniques. This study evaluates and compares these techniques on sequence labeling tasks, using natural language processing tasks as benchmarks.

The principal problem evaluated is part-of-speech tagging. Datasets in different languages (English, Spanish, Portuguese and Dutch) and from different environments (newspapers, Twitter and chats) are used for a general analysis. Shallow parsing and named entity recognition are also examined. The algorithms studied are the structured perceptron, conditional random fields, structured support vector machines and trigram hidden Markov models. They are also compared with other approaches to these problems.
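
As an illustration of the first of these algorithms, a structured perceptron for tagging can be sketched as follows. This is a minimal sketch and not the implementation used in the thesis: the feature templates (word-tag emissions and tag-bigram transitions), the toy training data and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def features(words, tag_seq):
    """Count emission and tag-bigram features for one labeled sentence (assumed templates)."""
    feats = defaultdict(int)
    prev = "<s>"  # hypothetical start-of-sentence symbol
    for word, tag in zip(words, tag_seq):
        feats[("emit", word, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats


def viterbi(words, tags, weights):
    """Return the highest-scoring tag sequence under the current weights."""
    # best[i][t] = (score of the best path ending in tag t at position i, previous tag)
    best = [{t: (weights[("emit", words[0], t)] + weights[("trans", "<s>", t)], None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            prev = max(tags, key=lambda p: best[i - 1][p][0] + weights[("trans", p, t)])
            score = (best[i - 1][prev][0] + weights[("trans", prev, t)]
                     + weights[("emit", words[i], t)])
            column[t] = (score, prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


def train(data, tags, epochs=5):
    """Perceptron updates: reward gold features, penalize predicted ones on mistakes."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, weights)
            if pred != gold:
                for feat, count in features(words, gold).items():
                    weights[feat] += count
                for feat, count in features(words, pred).items():
                    weights[feat] -= count
    return weights


# Toy data, invented for this sketch.
tags = ["DET", "NOUN", "VERB"]
data = [
    (["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
    (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
]
weights = train(data, tags)
print(viterbi(["the", "cat", "barks"], tags, weights))
```

The whole tag sequence is predicted jointly by Viterbi decoding, and the weights change only when the predicted sequence differs from the gold one. That joint prediction is what distinguishes a structured method from tagging each word independently.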

The results show that, in general, the structured perceptron performs best for sequence labeling under the conditions evaluated. However, with few training examples, structured support vector machines can achieve similar or superior accuracy. Moreover, the results for conditional random fields are close to those of these two methods. The relative results of the algorithms are similar across datasets, but the absolute accuracies depend on the specific characteristics of each dataset.

National Category
Computer Science
URN: urn:nbn:se:kth:diva-186385
OAI: diva2:927145
2016-05-02, E32, Stockholm, 11:15 (English)
Available from: 2016-05-18 Created: 2016-05-11 Last updated: 2016-05-18 Bibliographically approved

