GraphDCA - a Framework for Node Distribution Comparison in Real and Synthetic Graphs
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). (Robotics, Perception and Learning, RPL) ORCID iD: 0000-0001-6920-5109
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematical Statistics. ORCID iD: 0000-0002-0067-4908
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Chemistry, Organic chemistry. ORCID iD: 0000-0002-9001-7708
(English) Manuscript (preprint) (Other academic)
Abstract [en]

We argue that, when comparing two graphs, the distribution of node structural features is more informative than the global graph statistics that are often used in practice, especially to evaluate graph generative models. We therefore present GraphDCA, a framework for evaluating the similarity between graphs based on the alignment of their respective node representation sets. The sets are compared using a recently proposed method for comparing representation spaces, called Delaunay Component Analysis (DCA), which we extend to graph data. To evaluate our framework, we generate a benchmark dataset of graphs exhibiting different structural patterns and show, using three node structure feature extractors, that GraphDCA recognizes graphs with both similar and dissimilar local structure. We then apply our framework to three publicly available real-world graph datasets and demonstrate, using gradual edge perturbations, that GraphDCA satisfactorily captures the gradually decreasing similarity, unlike global statistics. Finally, we use GraphDCA to evaluate two state-of-the-art graph generative models, NetGAN and CELL, and conclude that further improvements are needed for these models to adequately reproduce local structural features.

Keywords [en]
Representation Learning, Machine Learning, Graph Generative Models, Node Embeddings
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-312720
OAI: oai:DiVA.org:kth-312720
DiVA, id: diva2:1659732
Note

QC 20220614

Available from: 2022-05-20 Created: 2022-05-20 Last updated: 2022-06-25 Bibliographically approved
In thesis
1. Learning and Evaluating the Geometric Structure of Representation Spaces
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Efficient representations of observed input data have been shown to significantly improve the performance of subsequent learning tasks in numerous domains. To obtain such representations automatically, we need to design both i) models that identify useful patterns in the input data and encode them into structured low-dimensional representations, and ii) evaluation measures that accurately assess the quality of the resulting representations. In this thesis, we present work that addresses both of these requirements, focusing primarily on requirement ii), since the evaluation of representations has been largely unexplored in machine learning research. We begin with an overview of representation learning techniques and the different structures that can be imposed on representation spaces, thus first addressing i). In this regard, we present a representation learning model that identifies useful patterns from multimodal data, and describe an approach that promotes a structure on the representation space that is favourable for performing a robotics task. We then thoroughly study the problem of assessing the quality of learned representations and give an overview of the pitfalls of current practices. With this, we motivate evaluation based on analyzing geometric properties of representations and present two novel evaluation algorithms that constitute the core of this thesis. Finally, we present an application of the proposed evaluation algorithms to comparing large input graphs.

Abstract [sv]

Efficient representations of observed input data have been shown to significantly increase performance on training problems in a number of domains. To obtain such representations automatically, we need both i) models that can identify useful patterns in the input data and encode them into structured low-dimensional representations, and ii) evaluation measures that reliably measure the quality of these representations. In this thesis we present work that addresses both of these requirements, with the focus on ii), since the evaluation of representations has been a largely unexplored topic in the machine learning literature. We begin with an overview of representation learning techniques and the types of structure that can be imposed on the representation space, which belongs to i). In this regard, we present a representation learning model that identifies useful patterns from multimodal data, and describe a method that promotes a structure on the representation space that is well suited to a robotics task. We then thoroughly study the problem of determining the quality of these learned representations and give an overview of common pitfalls of current methods. With this, we motivate evaluation based on the geometric properties of the representations and present two new evaluation algorithms, which make up the main part of the thesis. Finally, we present a practical application of the algorithms to comparing large input graphs.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022. p. 54
Series
TRITA-EECS-AVL ; 2022:33
Keywords
Representation Learning, Machine Learning, Generative Models
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-312723 (URN)
978-91-8040-228-6 (ISBN)
Public defence
2022-06-13, https://kth-se.zoom.us/j/65953366981, F3, Lindstedtsvägen 26, Stockholm, 15:00 (English)
Opponent
Supervisors
Note

QC 20220523

Available from: 2022-05-23 Created: 2022-05-20 Last updated: 2022-06-25 Bibliographically approved

Open Access in DiVA

GraphDCA (928 kB), 83 downloads
File information
File name: FULLTEXT01.pdf
File size: 928 kB
Checksum (SHA-512): 519fbe0070d19989e6920edd713e0edc89d5a628adbe7e83db98cb71fd94cecdec74a8411155d723bc4c0c2df295b5925097f39828f2f3570d56219f179afd44
Type: fulltext
Mimetype: application/pdf

Authority records

Ceylan, Ciwan; Hultin, Hanna; Kravchenko, Oleksandr; Varava, Anastasiia; Kragic, Danica

Search in DiVA

By author/editor
Poklukar, Petra; Ceylan, Ciwan; Hultin, Hanna; Kravchenko, Oleksandr; Varava, Anastasiia; Kragic, Danica
By organisation
Computational Science and Technology (CST); Robotics, Perception and Learning, RPL; Mathematical Statistics; Organic chemistry
Computer Sciences

Total: 83 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
