Outsmarting Willful-Thinking Opponents: Bayesian Belief Revision for Adversarial Reasoning in Large Language Models
FOI Swedish Defence Research Agency, SE-164 90, Stockholm, Sweden.
KTH, School of Electrical Engineering and Computer Science (EECS), Theoretical Computer Science. FOI Swedish Defence Research Agency, SE-164 90, Stockholm, Sweden. ORCID iD: 0000-0002-2677-9759
KTH, School of Electrical Engineering and Computer Science (EECS), Theoretical Computer Science. FOI Swedish Defence Research Agency, SE-164 90, Stockholm, Sweden.
FOI Swedish Defence Research Agency, SE-164 90, Stockholm, Sweden.
2026 (English). In: Social Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Proceedings, Springer Nature, 2026, Vol. 16324, p. 559-578. Conference paper, Published paper (Refereed)
Abstract [en]

In adversarial contexts, success often hinges on understanding not just what the opponent knows, but what they believe and how they revise those beliefs. This study investigates how large language models can be made more resilient and strategically capable by modeling the opponent’s reasoning using Bayesian belief revision. By formalizing negotiations as Bayesian games of incomplete information, it is shown that models equipped with belief revision are better able to counter deceptive or willful-thinking adversaries. The findings underscore the role of second-order reasoning in adversarial settings, with implications for social manipulation in the context of, for example, online communication and intelligence gathering.
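
As a rough illustration of the belief-revision step the abstract refers to, the following minimal sketch applies Bayes' rule to a belief over opponent types after one observed move. It is not the authors' implementation: the type names, the observation labels, the likelihood values, and the function name revise_belief are all hypothetical and chosen only for illustration.

```python
def revise_belief(prior, likelihood, observation):
    """Return the posterior over opponent types after one observation.

    prior:       dict mapping type -> P(type)
    likelihood:  dict mapping type -> dict mapping observation -> P(observation | type)
    observation: the move that was actually observed
    """
    # Bayes' rule: P(type | obs) is proportional to P(type) * P(obs | type).
    unnormalized = {t: prior[t] * likelihood[t][observation] for t in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under every type")
    return {t: p / total for t, p in unnormalized.items()}


# Hypothetical two-type negotiation example (all numbers are assumptions).
prior = {"honest": 0.5, "deceptive": 0.5}
likelihood = {
    "honest": {"high_offer": 0.2, "low_offer": 0.8},
    "deceptive": {"high_offer": 0.6, "low_offer": 0.4},
}

posterior = revise_belief(prior, likelihood, "high_offer")
print(posterior)  # belief shifts toward "deceptive" (about 0.25 vs 0.75)
```

In a repeated negotiation the posterior from one round would serve as the prior for the next, so the belief about the opponent's type is revised move by move.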

Place, publisher, year, edition, pages
Springer Nature, 2026. Vol. 16324, p. 559-578
Series
Lecture Notes in Computer Science, ISSN 0302-9743
Keywords [en]
Adversarial modeling, Bayesian belief revision, Behavioral learning, Game theory, Social manipulation
National Category
Philosophy
Identifiers
URN: urn:nbn:se:kth:diva-377811
DOI: 10.1007/978-3-032-14107-1_44
Scopus ID: 2-s2.0-105029897517
OAI: oai:DiVA.org:kth-377811
DiVA, id: diva2:2045306
Conference
17th International Conference on Social Networks Analysis and Mining, ASONAM 2025, Niagara Falls, Canada, Aug 25 2025 - Aug 28 2025
Note

Part of ISBN 9783032141064

QC 20260312

Available from: 2026-03-12 Created: 2026-03-12 Last updated: 2026-03-12 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Brynielsson, Joel; Cohen, Mika; Lavebrink, Samuel; Vangeli, Marius
