Outsmarting Willful-Thinking Opponents: Bayesian Belief Revision for Adversarial Reasoning in Large Language Models
2026 (English). In: Social Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Proceedings, Springer Nature, 2026, Vol. 16324, p. 559-578. Conference paper, Published paper (Refereed)
Abstract [en]
In adversarial contexts, success often hinges on understanding not just what the opponent knows, but what they believe and how they revise those beliefs. This study investigates how large language models can be made more resilient and strategically capable by modeling the opponent’s reasoning using Bayesian belief revision. By formalizing negotiations as Bayesian games of incomplete information, it is shown that models equipped with belief revision are better able to counter deceptive or willful-thinking adversaries. The findings underscore the role of second-order reasoning in adversarial settings, with implications for social manipulation in the context of, for example, online communication and intelligence gathering.
Place, publisher, year, edition, pages
Springer Nature, 2026. Vol. 16324, p. 559-578
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Adversarial modeling, Bayesian belief revision, Behavioral learning, Game theory, Social manipulation
National Category
Philosophy
Identifiers
URN: urn:nbn:se:kth:diva-377811
DOI: 10.1007/978-3-032-14107-1_44
Scopus ID: 2-s2.0-105029897517
OAI: oai:DiVA.org:kth-377811
DiVA, id: diva2:2045306
Conference
17th International Conference on Social Networks Analysis and Mining, ASONAM 2025, Niagara Falls, Canada, Aug 25 2025 - Aug 28 2025
Note
Part of ISBN 9783032141064
QC 20260312
2026-03-12; 2026-03-12; 2026-03-12. Bibliographically approved