kth.sePublikationer KTH
Ändra sökning
Länk till posten
Permanent länk

Direktlänk
Ceylan, Ciwan
Publikationer (9 of 9) Visa alla publikationer
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs. Transactions on Machine Learning Research, 2025-June
Öppna denna publikation i ny flik eller fönster >>Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs
2025 (Engelska)Ingår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-JuneArtikel i tidskrift (Refereegranskat) Published
Abstract [en]

Unsupervised learning of node embeddings for directed graphs (digraphs) requires careful handling to ensure unbiased modelling. This paper addresses two key challenges: (1) the obstruction of information propagation in random walk and message-passing methods due to local sinks, and (2) the representation of multiple multi-step directed neighbourhoods, arising from the distinction between in-and out-neighbours. These challenges are interconnected—local sinks can be mitigated by treating the graph as undirected, but this comes at the cost of discarding all directional information. We make two main contributions to unsupervised embedding learning for digraphs. First, we introduce ReachNEs (Reachability Node Embeddings), a general framework for analysing embedding models and diagnosing local sink behaviour on digraphs. ReachNEs defines the reachability filter, a matrix polynomial over normalized adjacency matrices that captures multi-step, direction-sensitive proximity. It unifies the analysis of message-passing and random walk models, making its insights applicable across a wide range of embedding methods. Second, we propose DirSwitch, a novel embedding model that resolves both local sink bias and neighbourhood multiplicity via switching random walks. These walks use directed edges for local steps, preserving directional structure, then switch to undirected edges for long-range transitions, enabling escape from local sinks and improving information dispersal. Empirical results on node classification benchmarks demonstrate that DirSwitch consistently outperforms state-of-the-art unsupervised digraph proximity embedding methods, and also serves as a flexible digraph extension for self-supervised graph neural networks.

Ort, förlag, år, upplaga, sidor
Transactions on Machine Learning Research, 2025
Nationell ämneskategori
Datavetenskap (datalogi) Sannolikhetsteori och statistik
Identifikatorer
urn:nbn:se:kth:diva-368840 (URN)2-s2.0-105009417431 (Scopus ID)
Anmärkning

QC 20250902

Tillgänglig från: 2025-09-02 Skapad: 2025-09-02 Senast uppdaterad: 2025-09-02Bibliografiskt granskad
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs. Transactions on Machine Learning Research
Öppna denna publikation i ny flik eller fönster >>Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs
2025 (Engelska)Ingår i: Transactions on Machine Learning Research, E-ISSN 2835-8856Artikel i tidskrift (Refereegranskat) Epub ahead of print
Nyckelord
directed graphs, node embeddings, unsupervised learning, random walks
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-366840 (URN)
Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Anmärkning

QC 20250806

Tillgänglig från: 2025-07-18 Skapad: 2025-07-18 Senast uppdaterad: 2026-01-08Bibliografiskt granskad
Ceylan, C., Ghoorchian, K. & Kragic, D. (2025). Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation. Transactions on Machine Learning Research, 5
Öppna denna publikation i ny flik eller fönster >>Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation
2025 (Engelska)Ingår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 5Artikel i tidskrift (Refereegranskat) Published
Abstract [en]

Linear message-passing models have emerged as compelling alternatives to non-linear graph neural networks for unsupervised node embedding learning, due to their scalability and competitive performance on downstream tasks. However, we identify a fundamental flaw in recently proposed linear models that combine embedding aggregation with concatenation during each message-passing iteration: rank deficiency. A rank-deficient embedding matrix contains column vectors which take arbitrary values, leading to ill-conditioning that degrades downstream task accuracy, particularly in unsupervised tasks such as graph alignment. We deduce that repeated embedding aggregation and concatenation introduces linearly dependent features, causing rank deficiency. To address this, we propose ACC (Aggregate, Compress, Concatenate), a novel model that avoids redundant feature computation by applying aggregation to the messages from the previous iteration, rather than the embeddings. Consequently, ACC generates full-rank embeddings, significantly improving graph alignment accuracy from 10% to 60% compared to rank-deficient embeddings, while also being faster to compute. Additionally, ACC employs directed message-passing and achieves node classification accuracies comparable to state-of-the-art self-supervised graph neural networks on directed graph benchmarks, while also being over 70 times faster on graphs with over 1 million edges.

Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-365164 (URN)
Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Anmärkning

QC 20250630

Tillgänglig från: 2025-06-19 Skapad: 2025-06-19 Senast uppdaterad: 2025-06-30Bibliografiskt granskad
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation. Transactions on Machine Learning Research, 2025-June
Öppna denna publikation i ny flik eller fönster >>Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation
2025 (Engelska)Ingår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-JuneArtikel i tidskrift (Refereegranskat) Published
Abstract [en]

Linear message-passing models have emerged as compelling alternatives to non-linear graph neural networks for unsupervised node embedding learning, due to their scalability and competitive performance on downstream tasks. However, we identify a fundamental flaw in recently proposed linear models that combine embedding aggregation with concatenation during each message-passing iteration: rank deficiency. A rank-deficient embedding matrix contains column vectors which take arbitrary values, leading to ill-conditioning that degrades downstream task accuracy, particularly in unsupervised tasks such as graph alignment. We deduce that repeated embedding aggregation and concatenation introduces linearly dependent features, causing rank deficiency. To address this, we propose ACC (Aggregate, Compress, Concatenate), a novel model that avoids redundant feature computation by applying aggregation to the messages from the previous iteration, rather than the embeddings. Consequently, ACC generates full-rank embeddings, significantly improving graph alignment accuracy from 10% to 60% compared to rank-deficient embeddings, while also being faster to compute. Additionally, ACC employs directed message-passing and achieves node classification accuracies comparable to state-of-the-art self-supervised graph neural networks on directed graph benchmarks, while also being over 70 times faster on graphs with over 1 million edges.

Ort, förlag, år, upplaga, sidor
Transactions on Machine Learning Research, 2025
Nationell ämneskategori
Datavetenskap (datalogi) Kommunikationssystem
Identifikatorer
urn:nbn:se:kth:diva-366571 (URN)2-s2.0-105007994859 (Scopus ID)
Anmärkning

QC 20250710

Tillgänglig från: 2025-07-10 Skapad: 2025-07-10 Senast uppdaterad: 2025-07-10Bibliografiskt granskad
Ceylan, C., Ghoorchian, K. & Kragic, D. (2024). Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition. In: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management: . Paper presented at 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, October 21-25, 2024 (pp. 3658-3662). Association for Computing Machinery (ACM)
Öppna denna publikation i ny flik eller fönster >>Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition
2024 (Engelska)Ingår i: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Association for Computing Machinery (ACM) , 2024, s. 3658-3662Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Unsupervised feature selection (UFS) methods have garnered significant attention for their capability to eliminate redundant features without relying on class label information. However, their scalability to large datasets remains a challenge, rendering common UFS methods impractical for such applications. To address this issue, we introduce QMR-FS, a greedy forward filtering approach that selects linearly independent features up to a specified relative tolerance, ensuring that any excluded features can be reconstructed from the retained set within this tolerance. This is achieved through the QMR matrix decomposition, which builds upon the well-known QR decomposition. QMR-FS benefits from linear complexity relative to the number of instances and boasts exceptional performance due to its ability to leverage parallelized computation on both CPU and GPU. Despite its greedy nature, QMR-FS achieves comparable classification and clustering accuracies across multiple datasets when compared to other UFS methods, while achieving runtimes approximately 10 times faster than recently proposed scalable UFS methods for datasets ranging from 100 million to 1 billion elements.

Ort, förlag, år, upplaga, sidor
Association for Computing Machinery (ACM), 2024
Nyckelord
feature selection, linear independence, scalability, unsupervised learning
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-357143 (URN)10.1145/3627673.3679994 (DOI)001349579603081 ()2-s2.0-85210013171 (Scopus ID)
Konferens
33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, October 21-25, 2024
Anmärkning

Part of ISBN 9798400704369

QC 20241205

Tillgänglig från: 2024-12-04 Skapad: 2024-12-04 Senast uppdaterad: 2025-12-08Bibliografiskt granskad
Ceylan, C., Franzen, S. & Pokorny, F. T. (2021). Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks. In: Meila, M Zhang, T (Ed.), International Conference On Machine Learning, Vol 139: . Paper presented at International Conference on Machine Learning (ICML), JUL 18-24, 2021, ELECTR NETWORK. JMLR-JOURNAL MACHINE LEARNING RESEARCH, 139
Öppna denna publikation i ny flik eller fönster >>Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
2021 (Engelska)Ingår i: International Conference On Machine Learning, Vol 139 / [ed] Meila, M Zhang, T, JMLR-JOURNAL MACHINE LEARNING RESEARCH , 2021, Vol. 139Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Banks are required to analyse large transaction datasets as a part of the fight against financial crime. Today, this analysis is either performed manually by domain experts or using expensive feature engineering. Gradient flow analysis allows for basic representation learning as node potentials can be inferred directly from network transaction data. However, the gradient model has a fundamental limitation: it cannot represent all types of of network flows. Furthermore, standard methods for learning the gradient flow are not appropriate for flow signals that span multiple orders of magnitude and contain outliers, i.e. transaction data. In this work, the gradient model is extended to a gated version and we prove that it, unlike the gradient model, is a universal approximator for flows on graphs. To tackle the mentioned challenges of transaction data, we propose a multi-scale and outlier robust loss function based on the Student-t log-likelihood. Ethereum transaction data is used for evaluation and the gradient models outperform MLP models using hand-engineered and node2vec features in terms of relative error. These results extend to 60 synthetic datasets, with experiments also showing that the gated gradient model learns qualitative information about the underlying synthetic generative flow distributions.

Ort, förlag, år, upplaga, sidor
JMLR-JOURNAL MACHINE LEARNING RESEARCH, 2021
Serie
Proceedings of Machine Learning Research, ISSN 2640-3498
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-303379 (URN)000683104601036 ()
Konferens
International Conference on Machine Learning (ICML), JUL 18-24, 2021, ELECTR NETWORK
Anmärkning

QC 20211015

Tillgänglig från: 2021-10-15 Skapad: 2021-10-15 Senast uppdaterad: 2022-06-25Bibliografiskt granskad
Ceylan, C., Franzén, S. & Pokorny, F. T. (2021). Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks. In: Proceedings of the 38th International Conference on Machine Learning, ICML 2021: . Paper presented at 38th International Conference on Machine Learning, ICML 2021, Virtual, Online, NA, Jul 18 2021 - Jul 24 2021 (pp. 1395-1406). ML Research Press
Öppna denna publikation i ny flik eller fönster >>Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
2021 (Engelska)Ingår i: Proceedings of the 38th International Conference on Machine Learning, ICML 2021, ML Research Press , 2021, s. 1395-1406Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Banks are required to analyse large transaction datasets as a part of the fight against financial crime. Today, this analysis is either performed manually by domain experts or using expensive feature engineering. Gradient flow analysis allows for basic representation learning as node potentials can be inferred directly from network transaction data. However, the gradient model has a fundamental limitation: it cannot represent all types of of network flows. Furthermore, standard methods for learning the gradient flow are not appropriate for flow signals that span multiple orders of magnitude and contain outliers, i.e. transaction data. In this work, the gradient model is extended to a gated version and we prove that it, unlike the gradient model, is a universal approximator for flows on graphs. To tackle the mentioned challenges of transaction data, we propose a multi-scale and outlier robust loss function based on the Student-t log-likelihood. Ethereum transaction data is used for evaluation and the gradient models outperform MLP models using hand-engineered and node2vec features in terms of relative error. These results extend to 60 synthetic datasets, with experiments also showing that the gated gradient model learns qualitative information about the underlying synthetic generative flow distributions.

Ort, förlag, år, upplaga, sidor
ML Research Press, 2021
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-347968 (URN)2-s2.0-85134915157 (Scopus ID)
Konferens
38th International Conference on Machine Learning, ICML 2021, Virtual, Online, NA, Jul 18 2021 - Jul 24 2021
Anmärkning

Part of ISBN [9781713845065]

QC 20240703

Tillgänglig från: 2024-07-03 Skapad: 2024-07-03 Senast uppdaterad: 2024-07-03Bibliografiskt granskad
Ceylan, C. & Gutmann, M. U. (2018). Conditional Noise-Contrastive Estimation of Unnormalised Models. In: Dy, J Krause, A (Ed.), 35th International Conference on Machine Learning, ICML 2018: . Paper presented at 35th International Conference on Machine Learning (ICML), JUL 10-15, 2018, Stockholm, Sweden (pp. 1334-1442). International Machine Learning Society (IMLS), 80
Öppna denna publikation i ny flik eller fönster >>Conditional Noise-Contrastive Estimation of Unnormalised Models
2018 (Engelska)Ingår i: 35th International Conference on Machine Learning, ICML 2018 / [ed] Dy, J Krause, A, International Machine Learning Society (IMLS) , 2018, Vol. 80, s. 1334-1442Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Many parametric statistical models are not properly normalised and only specified up to an intractable partition function, which renders parameter estimation difficult. Examples of unnormalised models are Gibbs distributions, Markov random fields, and neural network models in unsupervised deep learning. In previous work, the estimation principle called noise-contrastive estimation (NCE) was introduced where unnormalised models are estimated by learning to distinguish between data and auxiliary noise. An open question is how to best choose the auxiliary noise distribution. We here propose a new method that addresses this issue. The proposed method shares with NCE the idea of formulating density estimation as a supervised learning problem but in contrast to NCE, the proposed method leverages the observed data when generating noise samples. The noise can thus be generated in a semi-automated manner. We first present the underlying theory of the new method, show that score matching emerges as a limiting case, validate the method on continuous and discrete valued synthetic data, and show that we can expect an improved performance compared to NCE when the data lie in a lower-dimensional manifold. Then we demonstrate its applicability in unsupervised deep learning by estimating a four-layer neural image model.

Ort, förlag, år, upplaga, sidor
International Machine Learning Society (IMLS), 2018
Serie
Proceedings of Machine Learning Research, ISSN 2640-3498
Nationell ämneskategori
Annan teknik
Identifikatorer
urn:nbn:se:kth:diva-318709 (URN)000683379200075 ()2-s2.0-85057220926 (Scopus ID)
Konferens
35th International Conference on Machine Learning (ICML), JUL 10-15, 2018, Stockholm, Sweden
Anmärkning

QC 20220922

Part of books: ISBN 978-151086796-3

Tillgänglig från: 2022-09-22 Skapad: 2022-09-22 Senast uppdaterad: 2023-09-22Bibliografiskt granskad
Poklukar, P., Ceylan, C., Hultin, H., Kravchenko, O., Varava, A. & Kragic, D.GraphDCA - a Framework for Node Distribution Comparison in Real and Synthetic Graphs.
Öppna denna publikation i ny flik eller fönster >>GraphDCA - a Framework for Node Distribution Comparison in Real and Synthetic Graphs
Visa övriga...
(Engelska)Manuskript (preprint) (Övrigt vetenskapligt)
Abstract [en]

We argue that when comparing two graphs, the distribution of node structural features is more informative than global graph statistics which are often used in practice, especially to evaluate graph generative models. Thus, we present GraphDCA - a framework for evaluating similarity between graphs based on the alignment of their respective node representation sets. The sets are compared using a recently proposed method for comparing representation spaces, called Delaunay Component Analysis (DCA), which we extend to graph data. To evaluate our framework, we generate a benchmark dataset of graphs exhibiting different structural patterns and show, using three node structure feature extractors, that GraphDCA recognizes graphs with both similar and dissimilar local structure. We then apply our framework to evaluate three publicly available real-world graph datasets and demonstrate, using gradual edge perturbations, that GraphDCA satisfyingly captures gradually decreasing similarity, unlike global statistics. Finally, we use GraphDCA to evaluate two state-of-the-art graph generative models, NetGAN and CELL, and conclude that further improvements are needed for these models to adequately reproduce local structural features.

Nyckelord
Representation Learning, Machine Learning, Graph Generative Models, Node Embeddings
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:kth:diva-312720 (URN)
Anmärkning

QC 20220614

Tillgänglig från: 2022-05-20 Skapad: 2022-05-20 Senast uppdaterad: 2022-06-25Bibliografiskt granskad
Organisationer

Sök vidare i DiVA

Visa alla publikationer