Endre søk
Link to record
Permanent link

Direct link
Ceylan, Ciwan
Publikasjoner (9 av 9) Visa alla publikasjoner
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs. Transactions on Machine Learning Research, 2025-June
Åpne denne publikasjonen i ny fane eller vindu >>Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs
2025 (engelsk)Inngår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-JuneArtikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

Unsupervised learning of node embeddings for directed graphs (digraphs) requires careful handling to ensure unbiased modelling. This paper addresses two key challenges: (1) the obstruction of information propagation in random walk and message-passing methods due to local sinks, and (2) the representation of multiple multi-step directed neighbourhoods, arising from the distinction between in-and out-neighbours. These challenges are interconnected—local sinks can be mitigated by treating the graph as undirected, but this comes at the cost of discarding all directional information. We make two main contributions to unsupervised embedding learning for digraphs. First, we introduce ReachNEs (Reachability Node Embeddings), a general framework for analysing embedding models and diagnosing local sink behaviour on digraphs. ReachNEs defines the reachability filter, a matrix polynomial over normalized adjacency matrices that captures multi-step, direction-sensitive proximity. It unifies the analysis of message-passing and random walk models, making its insights applicable across a wide range of embedding methods. Second, we propose DirSwitch, a novel embedding model that resolves both local sink bias and neighbourhood multiplicity via switching random walks. These walks use directed edges for local steps, preserving directional structure, then switch to undirected edges for long-range transitions, enabling escape from local sinks and improving information dispersal. Empirical results on node classification benchmarks demonstrate that DirSwitch consistently outperforms state-of-the-art unsupervised digraph proximity embedding methods, and also serves as a flexible digraph extension for self-supervised graph neural networks.

sted, utgiver, år, opplag, sider
Transactions on Machine Learning Research, 2025
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-368840 (URN)2-s2.0-105009417431 (Scopus ID)
Merknad

QC 20250902

Tilgjengelig fra: 2025-09-02 Laget: 2025-09-02 Sist oppdatert: 2025-09-02bibliografisk kontrollert
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs. Transactions on Machine Learning Research
Åpne denne publikasjonen i ny fane eller vindu >>Disobeying Directions: Switching Random Walk Filters for Unsupervised Node Embedding Learning on Directed Graphs
2025 (engelsk)Inngår i: Transactions on Machine Learning Research, E-ISSN 2835-8856Artikkel i tidsskrift (Fagfellevurdert) Epub ahead of print
Emneord
directed graphs, node embeddings, unsupervised learning, random walks
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-366840 (URN)
Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Merknad

QC 20250806

Tilgjengelig fra: 2025-07-18 Laget: 2025-07-18 Sist oppdatert: 2026-01-08bibliografisk kontrollert
Ceylan, C., Ghoorchian, K. & Kragic, D. (2025). Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation. Transactions on Machine Learning Research, 5
Åpne denne publikasjonen i ny fane eller vindu >>Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation
2025 (engelsk)Inngår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 5Artikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

Linear message-passing models have emerged as compelling alternatives to non-linear graph neural networks for unsupervised node embedding learning, due to their scalability and competitive performance on downstream tasks. However, we identify a fundamental flaw in recently proposed linear models that combine embedding aggregation with concatenation during each message-passing iteration: rank deficiency. A rank-deficient embedding matrix contains column vectors which take arbitrary values, leading to ill-conditioning that degrades downstream task accuracy, particularly in unsupervised tasks such as graph alignment. We deduce that repeated embedding aggregation and concatenation introduces linearly dependent features, causing rank deficiency. To address this, we propose ACC (Aggregate, Compress, Concatenate), a novel model that avoids redundant feature computation by applying aggregation to the messages from the previous iteration, rather than the embeddings. Consequently, ACC generates full-rank embeddings, significantly improving graph alignment accuracy from 10% to 60% compared to rank-deficient embeddings, while also being faster to compute. Additionally, ACC employs directed message-passing and achieves node classification accuracies comparable to state-of-the-art self-supervised graph neural networks on directed graph benchmarks, while also being over 70 times faster on graphs with over 1 million edges.

HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-365164 (URN)
Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Merknad

QC 20250630

Tilgjengelig fra: 2025-06-19 Laget: 2025-06-19 Sist oppdatert: 2025-06-30bibliografisk kontrollert
Ceylan, C., Ghoorchian, K. & Kragic Jensfelt, D. (2025). Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation. Transactions on Machine Learning Research, 2025-June
Åpne denne publikasjonen i ny fane eller vindu >>Full-Rank Unsupervised Node Embeddings for Directed Graphs via Message Aggregation
2025 (engelsk)Inngår i: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2025-JuneArtikkel i tidsskrift (Fagfellevurdert) Published
Abstract [en]

Linear message-passing models have emerged as compelling alternatives to non-linear graph neural networks for unsupervised node embedding learning, due to their scalability and competitive performance on downstream tasks. However, we identify a fundamental flaw in recently proposed linear models that combine embedding aggregation with concatenation during each message-passing iteration: rank deficiency. A rank-deficient embedding matrix contains column vectors which take arbitrary values, leading to ill-conditioning that degrades downstream task accuracy, particularly in unsupervised tasks such as graph alignment. We deduce that repeated embedding aggregation and concatenation introduces linearly dependent features, causing rank deficiency. To address this, we propose ACC (Aggregate, Compress, Concatenate), a novel model that avoids redundant feature computation by applying aggregation to the messages from the previous iteration, rather than the embeddings. Consequently, ACC generates full-rank embeddings, significantly improving graph alignment accuracy from 10% to 60% compared to rank-deficient embeddings, while also being faster to compute. Additionally, ACC employs directed message-passing and achieves node classification accuracies comparable to state-of-the-art self-supervised graph neural networks on directed graph benchmarks, while also being over 70 times faster on graphs with over 1 million edges.

sted, utgiver, år, opplag, sider
Transactions on Machine Learning Research, 2025
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-366571 (URN)2-s2.0-105007994859 (Scopus ID)
Merknad

QC 20250710

Tilgjengelig fra: 2025-07-10 Laget: 2025-07-10 Sist oppdatert: 2025-07-10bibliografisk kontrollert
Ceylan, C., Ghoorchian, K. & Kragic, D. (2024). Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition. In: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management: . Paper presented at 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, October 21-25, 2024 (pp. 3658-3662). Association for Computing Machinery (ACM)
Åpne denne publikasjonen i ny fane eller vindu >>Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition
2024 (engelsk)Inngår i: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Association for Computing Machinery (ACM) , 2024, s. 3658-3662Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Unsupervised feature selection (UFS) methods have garnered significant attention for their capability to eliminate redundant features without relying on class label information. However, their scalability to large datasets remains a challenge, rendering common UFS methods impractical for such applications. To address this issue, we introduce QMR-FS, a greedy forward filtering approach that selects linearly independent features up to a specified relative tolerance, ensuring that any excluded features can be reconstructed from the retained set within this tolerance. This is achieved through the QMR matrix decomposition, which builds upon the well-known QR decomposition. QMR-FS benefits from linear complexity relative to the number of instances and boasts exceptional performance due to its ability to leverage parallelized computation on both CPU and GPU. Despite its greedy nature, QMR-FS achieves comparable classification and clustering accuracies across multiple datasets when compared to other UFS methods, while achieving runtimes approximately 10 times faster than recently proposed scalable UFS methods for datasets ranging from 100 million to 1 billion elements.

sted, utgiver, år, opplag, sider
Association for Computing Machinery (ACM), 2024
Emneord
feature selection, linear independence, scalability, unsupervised learning
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-357143 (URN)10.1145/3627673.3679994 (DOI)001349579603081 ()2-s2.0-85210013171 (Scopus ID)
Konferanse
33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, October 21-25, 2024
Merknad

Part of ISBN 9798400704369

QC 20241205

Tilgjengelig fra: 2024-12-04 Laget: 2024-12-04 Sist oppdatert: 2025-12-08bibliografisk kontrollert
Ceylan, C., Franzen, S. & Pokorny, F. T. (2021). Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks. In: Meila, M Zhang, T (Ed.), International Conference On Machine Learning, Vol 139: . Paper presented at International Conference on Machine Learning (ICML), JUL 18-24, 2021, ELECTR NETWORK. JMLR-JOURNAL MACHINE LEARNING RESEARCH, 139
Åpne denne publikasjonen i ny fane eller vindu >>Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
2021 (engelsk)Inngår i: International Conference On Machine Learning, Vol 139 / [ed] Meila, M Zhang, T, JMLR-JOURNAL MACHINE LEARNING RESEARCH , 2021, Vol. 139Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Banks are required to analyse large transaction datasets as a part of the fight against financial crime. Today, this analysis is either performed manually by domain experts or using expensive feature engineering. Gradient flow analysis allows for basic representation learning as node potentials can be inferred directly from network transaction data. However, the gradient model has a fundamental limitation: it cannot represent all types of of network flows. Furthermore, standard methods for learning the gradient flow are not appropriate for flow signals that span multiple orders of magnitude and contain outliers, i.e. transaction data. In this work, the gradient model is extended to a gated version and we prove that it, unlike the gradient model, is a universal approximator for flows on graphs. To tackle the mentioned challenges of transaction data, we propose a multi-scale and outlier robust loss function based on the Student-t log-likelihood. Ethereum transaction data is used for evaluation and the gradient models outperform MLP models using hand-engineered and node2vec features in terms of relative error. These results extend to 60 synthetic datasets, with experiments also showing that the gated gradient model learns qualitative information about the underlying synthetic generative flow distributions.

sted, utgiver, år, opplag, sider
JMLR-JOURNAL MACHINE LEARNING RESEARCH, 2021
Serie
Proceedings of Machine Learning Research, ISSN 2640-3498
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-303379 (URN)000683104601036 ()
Konferanse
International Conference on Machine Learning (ICML), JUL 18-24, 2021, ELECTR NETWORK
Merknad

QC 20211015

Tilgjengelig fra: 2021-10-15 Laget: 2021-10-15 Sist oppdatert: 2022-06-25bibliografisk kontrollert
Ceylan, C., Franzén, S. & Pokorny, F. T. (2021). Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks. In: Proceedings of the 38th International Conference on Machine Learning, ICML 2021: . Paper presented at 38th International Conference on Machine Learning, ICML 2021, Virtual, Online, NA, Jul 18 2021 - Jul 24 2021 (pp. 1395-1406). ML Research Press
Åpne denne publikasjonen i ny fane eller vindu >>Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
2021 (engelsk)Inngår i: Proceedings of the 38th International Conference on Machine Learning, ICML 2021, ML Research Press , 2021, s. 1395-1406Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Banks are required to analyse large transaction datasets as a part of the fight against financial crime. Today, this analysis is either performed manually by domain experts or using expensive feature engineering. Gradient flow analysis allows for basic representation learning as node potentials can be inferred directly from network transaction data. However, the gradient model has a fundamental limitation: it cannot represent all types of of network flows. Furthermore, standard methods for learning the gradient flow are not appropriate for flow signals that span multiple orders of magnitude and contain outliers, i.e. transaction data. In this work, the gradient model is extended to a gated version and we prove that it, unlike the gradient model, is a universal approximator for flows on graphs. To tackle the mentioned challenges of transaction data, we propose a multi-scale and outlier robust loss function based on the Student-t log-likelihood. Ethereum transaction data is used for evaluation and the gradient models outperform MLP models using hand-engineered and node2vec features in terms of relative error. These results extend to 60 synthetic datasets, with experiments also showing that the gated gradient model learns qualitative information about the underlying synthetic generative flow distributions.

sted, utgiver, år, opplag, sider
ML Research Press, 2021
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-347968 (URN)2-s2.0-85134915157 (Scopus ID)
Konferanse
38th International Conference on Machine Learning, ICML 2021, Virtual, Online, NA, Jul 18 2021 - Jul 24 2021
Merknad

Part of ISBN [9781713845065]

QC 20240703

Tilgjengelig fra: 2024-07-03 Laget: 2024-07-03 Sist oppdatert: 2024-07-03bibliografisk kontrollert
Ceylan, C. & Gutmann, M. U. (2018). Conditional Noise-Contrastive Estimation of Unnormalised Models. In: Dy, J Krause, A (Ed.), 35th International Conference on Machine Learning, ICML 2018: . Paper presented at 35th International Conference on Machine Learning (ICML), JUL 10-15, 2018, Stockholm, Sweden (pp. 1334-1442). International Machine Learning Society (IMLS), 80
Åpne denne publikasjonen i ny fane eller vindu >>Conditional Noise-Contrastive Estimation of Unnormalised Models
2018 (engelsk)Inngår i: 35th International Conference on Machine Learning, ICML 2018 / [ed] Dy, J Krause, A, International Machine Learning Society (IMLS) , 2018, Vol. 80, s. 1334-1442Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Many parametric statistical models are not properly normalised and only specified up to an intractable partition function, which renders parameter estimation difficult. Examples of unnormalised models are Gibbs distributions, Markov random fields, and neural network models in unsupervised deep learning. In previous work, the estimation principle called noise-contrastive estimation (NCE) was introduced where unnormalised models are estimated by learning to distinguish between data and auxiliary noise. An open question is how to best choose the auxiliary noise distribution. We here propose a new method that addresses this issue. The proposed method shares with NCE the idea of formulating density estimation as a supervised learning problem but in contrast to NCE, the proposed method leverages the observed data when generating noise samples. The noise can thus be generated in a semi-automated manner. We first present the underlying theory of the new method, show that score matching emerges as a limiting case, validate the method on continuous and discrete valued synthetic data, and show that we can expect an improved performance compared to NCE when the data lie in a lower-dimensional manifold. Then we demonstrate its applicability in unsupervised deep learning by estimating a four-layer neural image model.

sted, utgiver, år, opplag, sider
International Machine Learning Society (IMLS), 2018
Serie
Proceedings of Machine Learning Research, ISSN 2640-3498
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-318709 (URN)000683379200075 ()2-s2.0-85057220926 (Scopus ID)
Konferanse
35th International Conference on Machine Learning (ICML), JUL 10-15, 2018, Stockholm, Sweden
Merknad

QC 20220922

Part of books: ISBN 978-151086796-3

Tilgjengelig fra: 2022-09-22 Laget: 2022-09-22 Sist oppdatert: 2023-09-22bibliografisk kontrollert
Poklukar, P., Ceylan, C., Hultin, H., Kravchenko, O., Varava, A. & Kragic, D.GraphDCA - a Framework for Node Distribution Comparison in Real and Synthetic Graphs.
Åpne denne publikasjonen i ny fane eller vindu >>GraphDCA - a Framework for Node Distribution Comparison in Real and Synthetic Graphs
Vise andre…
(engelsk)Manuskript (preprint) (Annet vitenskapelig)
Abstract [en]

We argue that when comparing two graphs, the distribution of node structural features is more informative than global graph statistics which are often used in practice, especially to evaluate graph generative models. Thus, we present GraphDCA - a framework for evaluating similarity between graphs based on the alignment of their respective node representation sets. The sets are compared using a recently proposed method for comparing representation spaces, called Delaunay Component Analysis (DCA), which we extend to graph data. To evaluate our framework, we generate a benchmark dataset of graphs exhibiting different structural patterns and show, using three node structure feature extractors, that GraphDCA recognizes graphs with both similar and dissimilar local structure. We then apply our framework to evaluate three publicly available real-world graph datasets and demonstrate, using gradual edge perturbations, that GraphDCA satisfyingly captures gradually decreasing similarity, unlike global statistics. Finally, we use GraphDCA to evaluate two state-of-the-art graph generative models, NetGAN and CELL, and conclude that further improvements are needed for these models to adequately reproduce local structural features.

Emneord
Representation Learning, Machine Learning, Graph Generative Models, Node Embeddings
HSV kategori
Identifikatorer
urn:nbn:se:kth:diva-312720 (URN)
Merknad

QC 20220614

Tilgjengelig fra: 2022-05-20 Laget: 2022-05-20 Sist oppdatert: 2022-06-25bibliografisk kontrollert
Organisasjoner