Illustrations of Data Analysis Using the Mapper Algorithm and Persistent Homology
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics (Div.).
2016 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Dataanalys illustrerad genom mapper-algoritmen och persistent homologi (Swedish)
Abstract [en]

The Mapper algorithm and persistent homology are topological data analysis tools for analyzing point cloud data. In addition, the data analysis toolchain adopted in this thesis includes a classification method, used to distinguish between two class labels.

This thesis has two major goals. The first is to present persistent homology and the Mapper algorithm as two techniques by which shapes, mostly point clouds sampled from shapes of known topology, can be identified and visualized, even when noise is present. We provide illustrative examples in the form of barcodes, persistence diagrams and topological network models for several point clouds.
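
As a rough illustration of what a barcode records, the sketch below computes only the 0-dimensional persistence intervals (connected components) of a point cloud, using a union-find over pairwise edges sorted by length. It is not the implementation used in the thesis. The function name `h0_barcode`, the sample points, and the choice to record a death at the merging edge length are assumptions made for illustration; loops and higher-dimensional features would need a full persistent homology library.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return 0-dimensional persistence intervals (birth, death) of a point cloud.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another component. One component never dies.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, shortening paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one of them dies at this scale.
            parent[ri] = rj
            bars.append((0.0, length))
    # The last surviving component persists forever.
    bars.append((0.0, math.inf))
    return bars


# Hypothetical sample: two well-separated clusters of points.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 0.0), (5.1, 0.0)]
bars = h0_barcode(cloud)
```

In this sample, the short bars correspond to points merging within a cluster, while the one long finite bar reflects the gap between the two clusters. Long bars are the ones usually read as signal rather than noise.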

The second goal is to propose an approach for extracting useful insights from point cloud data, based on Mapper and a classification technique known as penalized logistic regression. We apply it to two real-world datasets, one with a continuous response and one with a categorical response. We show that it is advantageous to apply a topological mapping tool such as the Mapper algorithm to a dataset as a pre-processing and organizing step before using a classification technique.
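
To make the role of Mapper as an organizing step more concrete, here is a minimal sketch of the usual Mapper construction: cover the range of a filter function with overlapping intervals, cluster the points falling in each interval, and connect clusters that share points. The names `mapper_graph` and `_cluster`, the parameter values, and the simple distance-threshold clustering are all assumptions for illustration, not the configuration used in the thesis.

```python
import math


def mapper_graph(points, filter_values, n_intervals=4, overlap=0.3, eps=0.5):
    """Build a Mapper-style graph: nodes are clusters, edges mark shared points."""
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    nodes = []  # each node is a frozenset of point indices
    for k in range(n_intervals):
        # Interval k of the cover, widened on both sides so neighbours overlap.
        a = lo + k * length - overlap * length
        b = lo + (k + 1) * length + overlap * length
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(_cluster(points, members, eps))
    # Two nodes are joined when their clusters have a point in common.
    edges = {
        (u, v)
        for u in range(len(nodes))
        for v in range(u + 1, len(nodes))
        if nodes[u] & nodes[v]
    }
    return nodes, edges


def _cluster(points, members, eps):
    """Connected components of the graph linking points at distance <= eps."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


# Hypothetical example: 40 points on the unit circle, filtered by x-coordinate.
circle = [
    (math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40))
    for i in range(40)
]
nodes, edges = mapper_graph(
    circle, [p[0] for p in circle], n_intervals=4, overlap=0.3, eps=0.3
)
```

With these settings the resulting graph is a single loop of six nodes, mirroring the circle the points were sampled from. Other choices of filter, cover, or clustering threshold can give a different graph for the same data.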

Finally, we show that the Mapper algorithm not only allows point cloud data to be visualized but also detects possible flare-like shapes present in the data. The detected flares are given class labels, and the classification task is then to distinguish one flare from another in order to discover relationships between variables that generalize to previously unseen data.
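
The classification step on labelled flares can be sketched as follows. The abstract does not say which penalty the thesis uses, so this example assumes an L2 (ridge) penalty fitted by plain gradient descent. The function names, the toy data, and the hyperparameters `lam`, `lr` and `n_iter` are hypothetical.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_penalized_logistic(X, y, lam=0.1, lr=0.5, n_iter=2000):
    """Fit logistic regression with an L2 penalty on the weights.

    Minimises mean log-loss + (lam / 2) * ||w||^2 by gradient descent.
    The intercept b is not penalised.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = _sigmoid(z) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # The penalty adds lam * w to the gradient, shrinking the weights.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Assign label 1 when the linear score is positive, else 0."""
    return [1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0 for xi in X]


# Hypothetical data: two flares labelled 0 and 1, described by two variables.
X = [(0.0, 0.2), (0.3, 0.1), (0.5, 0.4), (2.0, 1.9), (2.3, 2.2), (2.5, 1.8)]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_penalized_logistic(X, y, lam=0.1)
labels = predict(X, w, b)
```

The fitted weights indicate which variables separate the two flares. Increasing `lam` shrinks them toward zero, trading fit on the training data for stability on unseen data.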

Abstract [sv] (translated from Swedish)

The Mapper algorithm and persistent homology are applied as tools within topological data analysis. In addition, a classification method is used to distinguish between two classes defined in a topological network model.

The aim is above all to achieve two main goals. The first is to present persistent homology and the Mapper algorithm as tools for identifying and visualizing point clouds sampled from objects of known topology, even in cases where random errors occur. This is illustrated with several examples of point clouds and the corresponding persistence diagrams and topological network models.

The second goal is to propose a method, based on the Mapper algorithm and a classification technique, for extracting knowledge from two datasets, one of which has a categorical response variable and the other a continuous one.

Finally, it is shown that the Mapper algorithm makes it possible to convert high-dimensional data into a two-dimensional graph that is easy to visualize and that can then be analyzed with statistical learning methods.

Place, publisher, year, edition, pages
TRITA-MAT-E, 2016:03
National Category
URN: urn:nbn:se:kth:diva-181787
OAI: diva2:900997
Subject / course
Educational program
Master of Science - Applied and Computational Mathematics
Available from: 2016-02-05
Created: 2016-02-03
Last updated: 2016-02-05
Bibliographically approved

Open Access in DiVA

fulltext (3473 kB), 833 downloads
File information
File name: FULLTEXT01.pdf
File size: 3473 kB
Checksum SHA-512
Type: fulltext
Mimetype: application/pdf
