Natural Emerging Patterns: A Reformulation For Classification
KTH, School of Computer Science and Communication (CSC).
2014 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Emerging Patterns (EPs) are itemsets (characteristics) whose supports change significantly from one dataset to another. They have long been proposed as a way to capture multi-attribute contrasts between data classes, or trends over time. A study carried out in this work shows that Emerging Patterns, as formulated to date, have several deficiencies and limitations when applied to classification problems. Several approaches based on this earlier, deficient formulation have been proposed in the literature and, despite these limitations, have shown very high predictive power. These approaches range from classifiers built directly on Emerging Patterns to instance-weighting schemes for weighted Support Vector Machines. In this work, a new formulation of Emerging Patterns, aimed entirely at classification problems, is proposed. A new classifier and a new instance-weighting scheme have also been created on top of the novel formulation, in order to demonstrate its advantages over the previous formulation when handling classification problems. An empirical study carried out on benchmark datasets from the UCI Machine Learning Repository shows that the proposed classifier is superior to other state-of-the-art classification methods such as C4.5 and Naive Bayes. It is also shown to be superior, in terms of overall predictive accuracy, to all of the most representative EP-based classifiers built on the previous formulation, on almost all of the datasets used. The instance-weighting scheme has also been compared empirically with previous related work, outperforming it in most cases.

In addition to these empirical studies of the predictive power of the new formulation, a second set of studies has been carried out to show other interesting features of the novel formulation. These include, for instance, the number of patterns required to classify reasonably well, and the robustness of the classifier built on the new version of Emerging Patterns. Alongside overall predictive accuracy, such features can be decisive when selecting an appropriate classifier for a specific classification problem, because in typical situations there are constraints such as limited computational power or a limit on the time available to classify a test instance. The increase in overall predictive accuracy, together with the other results, can be regarded as clear evidence of the advantages of the novel formulation when handling classification problems. Finally, outlines of possible future work based on the proposed formulation of Emerging Patterns are given and described in some detail. If successful, this future work could be of great importance and could provide new tools for handling classification problems.
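
The definition that opens the abstract (itemsets whose supports change significantly from one dataset to another) can be made concrete with a small sketch. It assumes the classical growth-rate criterion commonly used for Emerging Patterns, in which an itemset counts as emerging when the ratio of its supports in two datasets reaches a chosen threshold. It does not reproduce the new formulation proposed in the thesis, which the abstract does not spell out. The function names, the threshold `min_growth`, and the toy data are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(itemset: Itemset, dataset: List[Itemset]) -> float:
    """Fraction of instances in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for instance in dataset if itemset <= instance) / len(dataset)


def growth_rate(itemset: Itemset, source: List[Itemset], target: List[Itemset]) -> float:
    """Ratio of the itemset's support in `target` to its support in `source`.

    Classical convention (an assumption here, not taken from the thesis):
    0 if the itemset occurs in neither dataset, infinity if it occurs
    only in `target`.
    """
    s_source = support(itemset, source)
    s_target = support(itemset, target)
    if s_source == 0.0:
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_source


def is_emerging_pattern(itemset: Itemset, source: List[Itemset],
                        target: List[Itemset], min_growth: float = 2.0) -> bool:
    """True if support grows from `source` to `target` by at least `min_growth`."""
    return growth_rate(itemset, source, target) >= min_growth


# Toy data: each instance is a set of attribute values (hypothetical).
class_a = [
    frozenset({"red", "round"}),
    frozenset({"red", "long"}),
    frozenset({"green", "round"}),
    frozenset({"green", "long"}),
]
class_b = [
    frozenset({"red", "round"}),
    frozenset({"red", "round"}),
    frozenset({"red", "round"}),
    frozenset({"green", "long"}),
]

pattern = frozenset({"red", "round"})
rate = growth_rate(pattern, class_a, class_b)  # support 0.25 in A, 0.75 in B
```

On this toy data the pattern `{red, round}` has support 0.25 in `class_a` and 0.75 in `class_b`, so its growth rate from A to B is 3 and it qualifies as an emerging pattern for B under the default threshold.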

National Category
Computer Science
URN: urn:nbn:se:kth:diva-156191
OAI: diva2:765632
Educational program
Bachelor of Science in Engineering - Electrical Engineering
Available from: 2014-11-24. Created: 2014-11-24. Last updated: 2014-11-24. Bibliographically approved.

Open Access in DiVA

fulltext (1609 kB), 123 downloads
File information
File name: FULLTEXT01.pdf
File size: 1609 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
