Data-Driven Self-Supervised Graph Representation Learning
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-5392-6531
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0001-7898-0879
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0003-4516-7317
2023 (English). In: ECAI 2023: 26th European Conference on Artificial Intelligence, including 12th Conference on Prestigious Applications of Intelligent Systems, PAIS 2023 - Proceedings, IOS Press, 2023, p. 629-636. Conference paper, Published paper (Refereed)
Abstract [en]

Self-supervised graph representation learning (SSGRL) is a representation learning paradigm used to reduce or avoid manual labeling. An essential part of SSGRL is graph data augmentation. Existing methods usually rely on heuristics commonly identified through trial and error and are effective only within some application domains. Also, it is not clear why one heuristic is better than another. Moreover, recent studies have argued against some techniques (e.g., dropout, which can change the properties of molecular graphs or destroy relevant signals for graph-based document classification tasks). In this study, we propose a novel data-driven SSGRL approach that automatically learns a suitable graph augmentation from the signal encoded in the graph (i.e., the nodes' predictive features and topological information). We propose two complementary approaches that produce learnable feature and topological augmentations. The former learns a multi-view augmentation of node features, and the latter learns a high-order view of the topology. Moreover, the augmentations are jointly learned with the representation. Our approach is general in that it can be applied to both homogeneous and heterogeneous graphs. We perform extensive experiments on node classification (using nine homogeneous and heterogeneous datasets) and graph property prediction (using another eight datasets). The results show that the proposed method matches or outperforms the state-of-the-art (SOTA) SSGRL baselines and performs similarly to semi-supervised methods. The anonymised source code is available at https://github.com/AhmedESamy/dsgrl/

Place, publisher, year, edition, pages
IOS Press, 2023. p. 629-636
National Category
Computer Sciences; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-339683
DOI: 10.3233/FAIA230325
Scopus ID: 2-s2.0-85175858097
OAI: oai:DiVA.org:kth-339683
DiVA, id: diva2:1812477
Conference
26th European Conference on Artificial Intelligence, ECAI 2023, Krakow, Poland, Sep 30 2023 - Oct 4 2023
Note

Part of ISBN 9781643684369

QC 20231116

Available from: 2023-11-16. Created: 2023-11-16. Last updated: 2025-09-29. Bibliographically approved
In thesis
1. Representation Learning on Graphs: Investigating and Overcoming Common Challenges
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Graph Representation Learning (GRL) has emerged as a crucial area for modeling and understanding the structure of graph-structured data across diverse applications. This thesis advances GRL by addressing key challenges in both homogeneous and heterogeneous graphs, including modeling complex heterogeneous relational structures, designing generalizable augmentations for self-supervised learning, improving inductive link prediction in cold-start scenarios, and mitigating over-squashing in message-passing architectures.

Heterogeneous graphs present modeling difficulties due to the presence of multiple node and edge types. To address this, we propose a flexible random walk framework that removes the need for predefined domain knowledge such as meta-paths, enabling more effective and scalable modeling of complex relational structures.

In the self-supervised learning setting, current GRL methods often rely on manually designed graph augmentations that limit generalizability. This thesis introduces augmentation techniques that are task- and domain-agnostic, improving performance across varied graph types and structures.

Inductive link prediction remains challenging for GNNs, particularly in cold-start scenarios where target nodes lack topological context. We propose methods that support efficient and accurate inference without requiring access to neighborhood information of unseen nodes, addressing both scalability and generalization.

While GNNs are effective at capturing local structure, they often suffer from over-squashing, which restricts information propagation across long-range dependencies. To overcome this, we present strategies that improve the aggregation process, enabling GNNs to better preserve and prioritize critical signals from distant parts of the graph.

Through extensive experiments on benchmark datasets, the proposed methods demonstrate consistent improvements in node classification, link prediction, and graph property prediction tasks. Our approaches outperform strong baselines in settings involving heterogeneity, inductive generalization, and large-diameter graphs. Some methods significantly reduce inference cost, while others enhance model expressiveness and robustness by improving structural generalization. Collectively, these contributions show that principled and general-purpose solutions can effectively address long-standing challenges in graph representation learning.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025. p. ix, 70
Series
TRITA-EECS-AVL ; 2025:92
Keywords
Graph Machine Learning, Representation Learning
National Category
Artificial Intelligence
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-370617
ISBN: 978-91-8106-426-1
Public defence
2025-11-07, F3, Lindstedtsvägen 26 & 28, KTH Campus, Stockholm, 09:00 (English)
Note

QC 20250929

Available from: 2025-09-29. Created: 2025-09-29. Last updated: 2025-09-29. Bibliographically approved

Authority records

Samy, Ahmed E.; Kefato, Zekarias Tilahun; Girdzijauskas, Sarunas
