kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Learning Formal Mathematics From Intrinsic Motivation
Department of Computer Science, Stanford University, USA.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0001-8457-4105
Department of Computer Science, Stanford University, USA; Department of Education, Stanford University, USA.
Department of Computer Science, Stanford University, USA; Department of Psychology, Stanford University, USA.
2024 (English)In: Advances in Neural Information Processing Systems 37 - 38th Conference on Neural Information Processing Systems, NeurIPS 2024, Neural information processing systems foundation , 2024Conference paper, Published paper (Refereed)
Abstract [en]

How did humanity coax mathematics from the æther? We explore the Platonic view that mathematics can be discovered from its axioms-a game of conjecture and proof. We describe MINIMO (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model. Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof search. Our agent targets generating hard but provable conjectures-a moving target, since its own theorem proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs.

Place, publisher, year, edition, pages
Neural information processing systems foundation , 2024.
National Category
Other Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-361996Scopus ID: 2-s2.0-105000486976OAI: oai:DiVA.org:kth-361996DiVA, id: diva2:1949669
Conference
38th Conference on Neural Information Processing Systems, NeurIPS 2024, Vancouver, Canada, December 9-15, 2024
Note

Part of ISBN 9798331314385

QC 20250408

Available from: 2025-04-03 Created: 2025-04-03 Last updated: 2025-04-08Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Scopusfulltext

Authority records

Broman, David

Search in DiVA

By author/editor
Broman, David
By organisation
Software and Computer systems, SCS
Other Mathematics

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 17 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf