Publications (3 of 3)
Pozzoli, S. & Girdzijauskas, S. (2023). On Learning Embeddings at the Intersection of Communities and Roles. In: Proceedings - 2023 10th International Conference on Social Networks Analysis, Management and Security, SNAMS 2023. Paper presented at 10th International Conference on Social Networks Analysis, Management and Security, SNAMS 2023, Abu Dhabi, United Arab Emirates, Nov 21 2023 - Nov 24 2023. Institute of Electrical and Electronics Engineers (IEEE).
2023 (English). In: Proceedings - 2023 10th International Conference on Social Networks Analysis, Management and Security, SNAMS 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Graph Neural Networks (GNNs) have established themselves as the state of the art of encoding the nodes of a graph into a low-dimensional space by extracting features from the connectivity structure of the graph as well as the features of the nodes. However, since the embedding of a node is updated according to the information aggregated from the immediate neighborhood, a GNN tends to capture the community memberships of the nodes better than the other side of the coin: the role memberships, which quantify how much nodes carry out specific functions from a structural point of view. In this paper, we present RC-GNNs, a category of GNNs designed to learn embeddings from the community and the role memberships as well as the features of the nodes. RC-GNNs learn from different versions of the same graph, in which the nodes are connected according to either the community or the role memberships. Results show that, compared with models such as k-hop GNNs, k-GNNs, and MixHop, RC-GNNs are up to 4% more accurate in classifying the nodes of CiteSeer, Cora, and PubMed and up to 3% in classifying the graphs of MUTAG, PROTEINS, and Synthie.
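
The neighborhood aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the RC-GNN model or any specific GNN layer: it assumes plain mean aggregation with no learned weights, and the function name `mean_aggregate` and the toy graph are hypothetical.

```python
def mean_aggregate(adj_list, features):
    """One round of neighborhood averaging: each node's new vector is the
    mean of its own features and those of its immediate neighbors.
    Illustrative only; a real GNN layer also applies learned weights."""
    updated = []
    for node, neighbors in enumerate(adj_list):
        group = [node] + list(neighbors)  # the node itself plus its neighbors
        dim = len(features[node])
        updated.append(
            [sum(features[v][d] for v in group) / len(group) for d in range(dim)]
        )
    return updated


# Toy path graph 0 - 1 - 2 with one feature per node (assumed data).
adj_list = [[1], [0, 2], [1]]
features = [[0.0], [3.0], [6.0]]
smoothed = mean_aggregate(adj_list, features)
```

Because each update only mixes a node with its direct neighbors, connected nodes drift toward similar vectors, which is the community-oriented bias the abstract describes.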

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Community Detection, Graph Classification, Graph Clustering, Graph Representation Learning, Node Classification, Role Discovery, Semi-Supervised Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-343178 (URN); 10.1109/SNAMS60348.2023.10375479 (DOI); 2-s2.0-85183474186 (Scopus ID)
Conference
10th International Conference on Social Networks Analysis, Management and Security, SNAMS 2023, Abu Dhabi, United Arab Emirates, Nov 21 2023 - Nov 24 2023
Note

Part of ISBN: 979-8-3503-1890-6

QC 20240209

Available from: 2024-02-08. Created: 2024-02-08. Last updated: 2024-02-09. Bibliographically approved.
Pozzoli, S. & Girdzijauskas, S. (2022). Not Only Degree Matters: Diffusion-Driven Role Recognition. In: Proceedings of the 2022 Workshop on Open Challenges in Online Social Networks, OASIS 2022 - Held in conjunction with the 33rd ACM Conference on Hypertext and Social Media, HT 2022. Paper presented at 2022 Workshop on Open Challenges in Online Social Networks, OASIS 2022, held in conjunction with the 33rd ACM Conference on Hypertext and Social Media, HT 2022, Virtual, Online (pp. 16-24). Association for Computing Machinery (ACM).
2022 (English). In: Proceedings of the 2022 Workshop on Open Challenges in Online Social Networks, OASIS 2022 - Held in conjunction with the 33rd ACM Conference on Hypertext and Social Media, HT 2022, Association for Computing Machinery (ACM), 2022, p. 16-24. Conference paper, Published paper (Refereed)
Abstract [en]

Graphs are a data structure that lends itself to representing a wide range of entities connected by relationships. Insights into such entities are learned by graph clustering models that group nodes by either communities or roles. While community detection methods divide vertices into clusters with more significant internal than external connectivity, role discovery algorithms divide nodes by maximizing the similarity in the connectivity structure. Even though both are clusters of vertices, communities and roles excel at different tasks, such as link prediction and anomaly detection, respectively. Many role discovery algorithms explicitly or implicitly regard the degree as the most discriminating node feature. Methods that depend on how many neighbors a node has work very well for graphs in which the intra-role patterns of connectivity are equivalent. However, in this research paper, we show that structurally similar nodes with different degrees can be mislabeled by existing models since the connectivity structure is similar yet not equivalent. To address this, we present Diffusion-Driven Role Recognition (D2-R2), an unsupervised learning model designed to account for structurally similar nodes differing in degree, which is important for, e.g., social networks. Firstly, we compute a diffusion matrix in such a way as to explore the neighborhoods of the vertices without emphasizing differences in degree. From this, we extract the diffusion patterns that summarize the connectivity structure of the nodes. Then, we compute the distance between them via Dynamic Time Warping (DTW) and assign a given number of roles by running k-means. Tests on both synthetic graphs and non-synthetic networks show that D2-R2 outperforms methods such as RolX, struc2vec, and GraphWave by up to 21.2% in accuracy and 35.3% in F1 score for graphs in which there are differences in degree between structurally similar nodes. 
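
The abstract names Dynamic Time Warping as the distance between diffusion patterns. The following is a textbook DTW sketch with an absolute-difference cost, not the D2-R2 implementation; the function name `dtw_distance` and the example sequences (stand-ins for diffusion patterns) are assumptions.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two numeric sequences.
    cost[i][j] is the cheapest alignment of a[:i] with b[:j]."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # local mismatch
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical patterns: same shape, different length.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Sequences that share a shape but differ in length align at zero cost, which is one reason a warping distance can suit patterns of unequal length. The resulting pairwise distances could then feed a clustering step such as the k-means stage the abstract mentions.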

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
Graph Clustering, Graph Signal Processing, Role Discovery, Unsupervised Learning
National Category
Computer and Information Sciences; Other Computer and Information Science
Research subject
Computer Science; Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-315786 (URN); 10.1145/3524010.3539497 (DOI); 2-s2.0-85134323554 (Scopus ID)
Conference
2022 Workshop on Open Challenges in Online Social Networks, OASIS 2022, held in conjunction with the 33rd ACM Conference on Hypertext and Social Media, HT 2022, Virtual, Online
Note

QC 20220817

Part of proceedings: ISBN 978-145039279-2

Available from: 2022-07-20. Created: 2022-07-20. Last updated: 2022-08-17. Bibliographically approved.
Pozzoli, S., Soliman, A., Bahri, L., Branca, R. M., Girdzijauskas, S. & Brambilla, M. (2020). Domain expertise–agnostic feature selection for the analysis of breast cancer data. Artificial Intelligence in Medicine, 108, Article ID 101928.
2020 (English). In: Artificial Intelligence in Medicine, ISSN 0933-3657, E-ISSN 1873-2860, Vol. 108, article id 101928. Article in journal (Refereed), Published
Abstract [en]

Progress in proteomics has enabled biologists to accurately measure the amount of protein in a tumor. This work is based on a breast cancer data set, the result of the proteomics analysis of a cohort of tumors carried out at Karolinska Institutet. While evidence suggests that an anomaly in the protein content is related to the cancerous nature of tumors, the proteins that could be markers of cancer types and subtypes and the underlying interactions are not completely known. This work sheds light on the potential of the application of unsupervised learning in the analysis of the aforementioned data sets, namely in the detection of distinctive proteins for the identification of the cancer subtypes, in the absence of domain expertise. In the analyzed data set, the number of samples, or tumors, is significantly lower than the number of features, or proteins; consequently, the input data can be thought of as high-dimensional data. The use of high-dimensional data has already become widespread, and a great deal of effort has been put into high-dimensional data analysis by means of feature selection, but it is still largely based on prior specialist knowledge, which in this case is not complete. There is a growing need for unsupervised feature selection, which raises the issue of how to generate promising subsets of features among all the possible combinations, as well as how to evaluate the quality of these subsets in the absence of specialist knowledge. We hereby propose a new wrapper method for the generation and evaluation of subsets of features via spectral clustering and modularity, respectively. We conduct experiments to test the effectiveness of the new method in the analysis of the breast cancer data, in a domain expertise–agnostic context. Furthermore, we show that we can successfully augment our method by incorporating an external source of data on known protein complexes.
Our approach reveals a large number of subsets of features that are better at clustering the samples than the state-of-the-art classification in terms of modularity and shows a potential to be useful for future proteomics research.
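
The abstract uses modularity to score how well a clustering separates the samples. As a rough illustration, the sketch below computes Newman's modularity for an undirected graph given as a symmetric adjacency matrix. How the paper builds the sample graph is not stated in the abstract, so the toy graph, the labels, and the function name `modularity` are all assumptions.

```python
def modularity(adj, labels):
    """Newman modularity Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) over
    pairs (i, j) that share a cluster label. adj is a symmetric matrix."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the number of edges
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Toy graph (assumed): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one cluster per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one cluster
```

Splitting the graph along its natural structure scores higher than lumping all nodes together, which is the property that lets modularity rank candidate clusterings without labeled data.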

Place, publisher, year, edition, pages
Elsevier BV, 2020
Keywords
Breast cancer, Clustering, Clustering performance evaluation, Dimensionality reduction, Feature selection, Proteomics, Unsupervised learning
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-282662 (URN); 10.1016/j.artmed.2020.101928 (DOI); 000574951400008; 32972658 (PubMedID); 2-s2.0-85088878526 (Scopus ID)
Note

QC 20201102

Available from: 2020-09-30. Created: 2020-09-30. Last updated: 2024-03-18. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-6899-6209