kth.sePublications KTH
Change search
Link to record
Permanent link

Direct link
Publications (7 of 7) Show all publications
Tu, S., Stankovic, A., Neumann, S. & Gionis, A. (2025). Optirefine: densest subgraphs and maximum cuts with k refinements. Data mining and knowledge discovery, 39(6), Article ID 82.
Open this publication in new window or tab >>Optirefine: densest subgraphs and maximum cuts with k refinements
2025 (English)In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 39, no 6, article id 82Article in journal (Refereed) Published
Abstract [en]

Data-analysis tasks often involve an iterative process, which requires refining previous solutions. For instance, when analyzing social networks, we may obtain initial communities based on noisy metadata, and we want to improve them by adding influential nodes and removing non-important ones, without making too many changes. However, classic optimization algorithms, which typically find solutions from scratch, potentially return communities that are very dissimilar to the initial one. To mitigate these issues, we introduce the OptiRefine framework. The framework optimizes initial solutions by making a small number of refinements, thereby ensuring that the new solution remains close to the initial solution and simultaneously achieving a near-optimal solution for the optimization problem. We apply the OptiRefine framework to two classic graph-optimization problems: densest subgraph and maximum cut. For the densest-subgraph problem, we optimize a given subgraph’s density by adding or removing k nodes. We show that this novel problem is a generalization of k-densest subgraph, and provide constant-factor approximation algorithms for refinements. We also study a version of maximum cut in which the goal is to improve a given cut. We provide connections to the maximum cut with cardinality constraints and provide an optimal approximation algorithm in most parameter regimes under the Unique Games Conjecture for refinements. We evaluate our theoretical methods and scalable heuristics on synthetic and real-world data and show that they are highly effective in practice.

Place, publisher, year, edition, pages
Springer Nature, 2025
Keywords
Densest subgraph, Discrete optimization, Maximum cut, Semidefinite programming, Sum-of-squares
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-370712 (URN)10.1007/s10618-025-01142-2 (DOI)001568937400001 ()2-s2.0-105015555730 (Scopus ID)
Note

Not duplicate with DiVA 1958027

QC 20250930

Available from: 2025-09-30 Created: 2025-09-30 Last updated: 2025-09-30Bibliographically approved
Wang, H., Tu, S. & Gionis, A. (2025). Sequential Diversification with Provable Guarantees. In: Proceedings Of The Eighteenth Acm International Conference On Web Search And Data Mining, WSDM 2025: . Paper presented at 18th International Conference on Web Search and Data Mining-WSDM, MAR 10-14, 2025, Hannover, GERMANY (pp. 345-353). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>Sequential Diversification with Provable Guarantees
2025 (English)In: Proceedings Of The Eighteenth Acm International Conference On Web Search And Data Mining, WSDM 2025, Association for Computing Machinery (ACM) , 2025, p. 345-353Conference paper, Published paper (Refereed)
Abstract [en]

Diversification is a useful tool for exploring large collections of information items. It has been used to reduce redundancy and cover multiple perspectives in information-search settings. Diversification finds applications in many different domains, including presenting search results of information-retrieval systems and selecting suggestions for recommender systems. Interestingly, existing measures of diversity are defined over sets of items, rather than evaluating sequences of items. This design choice comes in contrast with commonly-used relevance measures, which are distinctly defined over sequences of items, taking into account the ranking of items. The importance of employing sequential measures is that information items are almost always presented in a sequential manner, and during their information-exploration activity users tend to prioritize items with higher ranking. In this paper, we study the problem of maximizing sequential diversity. This is a new measure of diversity, which accounts for the ranking of the items, and incorporates item relevance and user behavior. The overarching framework can be instantiated with different diversity measures, and here we consider the measures of sum diversity and coverage diversity. The problem was recently proposed by Coppolillo et al. [11], where they introduce empirical methods that work well in practice. Our paper is a theoretical treatment of the problem: we establish the problem hardness and present algorithms with constant approximation guarantees for both diversity measures we consider. Experimentally, we demonstrate that our methods are competitive against strong baselines.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2025
Keywords
Diversification, Ranking algorithms, Approximation algorithms
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-366040 (URN)10.1145/3701551.3703531 (DOI)001476971200037 ()2-s2.0-105001669833 (Scopus ID)
Conference
18th International Conference on Web Search and Data Mining-WSDM, MAR 10-14, 2025, Hannover, GERMANY
Note

Part of ISBN 979-8-4007-1329-3

QC 20250703

Available from: 2025-07-03 Created: 2025-07-03 Last updated: 2025-08-22Bibliographically approved
Out, C., Tu, S., Neumann, S. & Zehmakan, A. N. (2024). The Impact of External Sources on the Friedkin-Johnsen Model. In: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management: . Paper presented at 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, Oct 21 2024 - Oct 25 2024 (pp. 1815-1824). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>The Impact of External Sources on the Friedkin-Johnsen Model
2024 (English)In: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Association for Computing Machinery (ACM) , 2024, p. 1815-1824Conference paper, Published paper (Refereed)
Abstract [en]

To obtain a foundational understanding of timeline algorithms and viral content in shaping public opinions, computer scientists started to study augmented versions of opinion formation models from sociology. In this paper, we generalize the popular Friedkin - Johnsen model to include the effects of external media sources on opinion formation. Our goal is to mathematically analyze the influence of biased media, arising from factors such as manipulated news reporting or the phenomenon of false balance. Within our framework, we examine the scenario of two opposing media sources, which do not adapt their opinions like ordinary nodes, and analyze the conditions and the number of periods required for radicalizing the opinions in the network. When both media sources possess equal influence, we theoretically characterize the final opinion configuration. In the special case where there is only a single media source present, we prove that media sources which do not adapt their opinions are significantly more powerful than those which do. Lastly, we conduct the experiments on real-world and synthetic datasets, showing that our theoretical guarantees closely align with experimental simulations.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
false balance, Friedkin - Johnsen model, opinion formation, social networks
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-357145 (URN)10.1145/3627673.3679780 (DOI)2-s2.0-85209989330 (Scopus ID)
Conference
33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, Oct 21 2024 - Oct 25 2024
Note

QC 20241206

Part of ISBN 9798400704369

Available from: 2024-12-04 Created: 2024-12-04 Last updated: 2025-05-14Bibliographically approved
Tu, S., Neumann, S. & Gionis, A. (2023). Adversaries with Limited Information in the Friedkin-Johnsen Model. In: KDD 2023: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Paper presented at 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023, Long Beach, United States of America, Aug 6 2023 - Aug 10 2023 (pp. 2201-2210). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>Adversaries with Limited Information in the Friedkin-Johnsen Model
2023 (English)In: KDD 2023: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery (ACM) , 2023, p. 2201-2210Conference paper, Published paper (Refereed)
Abstract [en]

In recent years, online social networks have been the target of adversaries who seek to introduce discord into societies, to undermine democracies and to destabilize communities. Often the goal is not to favor a certain side of a conflict but to increase disagreement and polarization. To get a mathematical understanding of such attacks, researchers use opinion-formation models from sociology, such as the Friedkin - Johnsen model, and formally study how much discord the adversary can produce when altering the opinions for only a small set of users. In this line of work, it is commonly assumed that the adversary has full knowledge about the network topology and the opinions of all users. However, the latter assumption is often unrealistic in practice, where user opinions are not available or simply difficult to estimate accurately. To address this concern, we raise the following question: Can an attacker sow discord in a social network, even when only the network topology is known? We answer this question affirmatively. We present approximation algorithms for detecting a small set of users who are highly influential for the disagreement and polarization in the network. We show that when the adversary radicalizes these users and if the initial disagreement/polarization in the network is not very high, then our method gives a constant-factor approximation on the setting when the user opinions are known. To find the set of influential users, we provide a novel approximation algorithm for a variant of MaxCut in graphs with positive and negative edge weights. We experimentally evaluate our methods, which have access only to the network topology, and we find that they have similar performance as methods that have access to the network topology and all user opinions. We further present an NP-hardness proof, which was left as an open question by Chen and Racz [IEEE Transactions on Network Science and Engineering, 2021].

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
graph algorithms, max-cut, opinion dynamics, polarization, social network analysis
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-337892 (URN)10.1145/3580305.3599255 (DOI)001118896302023 ()2-s2.0-85171327908 (Scopus ID)
Conference
29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023, Long Beach, United States of America, Aug 6 2023 - Aug 10 2023
Note

Part of ISBN 9798400701030

QC 20231010

Available from: 2023-10-10 Created: 2023-10-10 Last updated: 2025-05-14Bibliographically approved
Tu, S. & Neumann, S. (2022). A Viral Marketing-Based Model For Opinion Dynamics in Online Social Networks. In: Proceedings of the ACM Web Conference 2022 WWW'2022: . Paper presented at 31st ACM World Wide Web Conference, WWW 2022, Virtual/Online, 25-29 April 2022 (pp. 1570-1578). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>A Viral Marketing-Based Model For Opinion Dynamics in Online Social Networks
2022 (English)In: Proceedings of the ACM Web Conference 2022 WWW'2022, Association for Computing Machinery (ACM) , 2022, p. 1570-1578Conference paper, Published paper (Refereed)
Abstract [en]

Online social networks provide a medium for citizens to form opinions on different societal issues, and a forum for public discussion. They also expose users to viral content, such as breaking news articles. In this paper, we study the interplay between these two aspects: opinion formation and information cascades in online social networks. We present a new model that allows us to quantify how users change their opinion as they are exposed to viral content. Our model is a combination of the popular Friedkin-Johnsen model for opinion dynamics and the independent cascade model for information propagation. We present algorithms for simulating our model, and we provide approximation algorithms for optimizing certain network indices, such as the sum of user opinions or the disagreement-controversy index; our approach can be used to obtain insights into how much viral content can increase these indices in online social networks. Finally, we evaluate our model on real-world datasets. We show experimentally that marketing campaigns and polarizing contents have vastly different effects on the network: while the former have only limited effect on the polarization in the network, the latter can increase the polarization up to 59% even when only 0.5% of the users start sharing a polarizing content. We believe that this finding sheds some light into the growing segregation in today's online media.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
online social networks, opinion dynamics, information spread
National Category
Human Computer Interaction Sociology (Excluding Social Work, Social Anthropology, Demography and Criminology)
Identifiers
urn:nbn:se:kth:diva-320994 (URN)10.1145/3485447.3512203 (DOI)000852713001061 ()2-s2.0-85129796448 (Scopus ID)
Conference
31st ACM World Wide Web Conference, WWW 2022, Virtual/Online, 25-29 April 2022
Note

Part of proceedings: ISBN 978-1-4503-9096-5

QC 20221104

Available from: 2022-11-04 Created: 2022-11-04 Last updated: 2025-05-14Bibliographically approved
Tu, S., Aslay, C. & Gionis, A. (2020). Co-exposure maximization in online social networks. In: Advances in Neural Information Processing Systems: . Paper presented at 34th Conference on Neural Information Processing Systems, NeurIPS 2020, 6 December 2020 through 12 December 2020. Neural information processing systems foundation
Open this publication in new window or tab >>Co-exposure maximization in online social networks
2020 (English)In: Advances in Neural Information Processing Systems, Neural information processing systems foundation , 2020Conference paper, Published paper (Refereed)
Abstract [en]

Social media has created new ways for citizens to stay informed on societal matters and participate in political discourse. However, with its algorithmically-curated and virally-propagating content, social media has contributed further to the polarization of opinions by reinforcing users’ existing viewpoints. An emerging line of research seeks to understand how content-recommendation algorithms can be re-designed to mitigate societal polarization amplified by social-media interactions. In this paper, we study the problem of allocating seed users to opposing campaigns: by drawing on the equal-time rule of political campaigning on traditional media, our goal is to allocate seed users to campaigners with the aim to maximize the expected number of users who are co-exposed to both campaigns. We show that the problem of maximizing co-exposure is NP-hard and its objective function is neither submodular nor supermodular. However, by exploiting a connection to a submodular function that acts as a lower bound to the objective, we are able to devise a greedy algorithm with provable approximation guarantee. We further provide a scalable instantiation of our approximation algorithm by introducing a novel extension to the notion of random reverse-reachable sets for efficiently estimating the expected co-exposure. We experimentally demonstrate the quality of our proposal on real-world social networks.

Place, publisher, year, edition, pages
Neural information processing systems foundation, 2020
Keywords
Approximation algorithms, NP-hard, Polarization, Content recommendations, Greedy algorithms, Objective functions, On-line social networks, Political campaigning, Political discourse, Reachable set, Submodular functions, Social networking (online)
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-301002 (URN)2-s2.0-85108453782 (Scopus ID)
Conference
34th Conference on Neural Information Processing Systems, NeurIPS 2020, 6 December 2020 through 12 December 2020
Note

QC 20210906

Available from: 2021-09-06 Created: 2021-09-06 Last updated: 2025-05-14Bibliographically approved
Tu, S., Stankovic, A., Neumann, S. & Gionis, A.OptiRefine: Densest subgraphs and maximum cuts with k refinements.
Open this publication in new window or tab >>OptiRefine: Densest subgraphs and maximum cuts with k refinements
(English)Manuscript (preprint) (Other academic)
Abstract [en]

 Data-analysis tasks often involve an iterative process, which requires refining previous solutions. For instance, when analyzing social networks we may obtain initial communities based on noisy metadata, and we want to improve them by adding influential nodes and removing non-important ones, without making too many changes. However, classic optimization algorithms, which typically find solutions from scratch, potentially return communities that are very dissimilar to the initial one. To mitigate these issues, we introduce the OptiRefine framework. The framework optimizes initial solutions by making a small number of refinements, thereby ensuring that the new solution remains close to the initial solution and simultaneously achieving a near-optimal solution for the optimization problem. We apply the OptiRefine framework to two classic graph optimization problems: densest subgraph and maximum cut. For the densest-subgraph problem,we optimize a given subgraph’s density by adding or removing k nodes. We show that this novel problem is a generalization of k-densest subgraph, and provide constant-factor approximationalgorithms for k = Ω(n) refinements. We also study a version of maximum cut in which the goal is to improve a given cut. We provide connections to maximum cut with cardinality constraints and provide an optimal approximation algorithm in most parameter regimes under the UniqueGames Conjecture for k = Ω(n) refinements. We evaluate our theoretical methods and scalable heuristics on synthetic and real-world data and show that they are highly effective in practice. 

National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-363328 (URN)
Note

Under submission

QC 20250513

Available from: 2025-05-13 Created: 2025-05-13 Last updated: 2025-05-16Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0002-5976-1993

Search in DiVA

Show all publications