The edge of chaos: Quantum field theory and deep neural networks
Max Planck Inst Phys Komplexer Syst & Wurzburg D, Cluster Excellence Ctqmat, Nothnitzer Str 38, D-01187 Dresden, Germany; Leiden Univ, Inst Lorentz, POB 9506, NL-2300 RA Leiden, Netherlands.
Nordita SU Stockholm Univ, Hannes Alfvens Vag 12, S-10691 Stockholm, Sweden.
2022 (English). In: SciPost Physics, E-ISSN 2542-4653, Vol. 12, no 3, article id 081. Article in journal (Refereed). Published.
Abstract [en]

We explicitly construct the quantum field theory corresponding to a general class of deep neural networks encompassing both recurrent and feedforward architectures. We first consider the mean-field theory (MFT) obtained as the leading saddlepoint in the action, and derive the condition for criticality via the largest Lyapunov exponent. We then compute the loop corrections to the correlation function in a perturbative expansion in the ratio of depth T to width N, and find a precise analogy with the well-studied O(N) vector model, in which the variance of the weight initializations plays the role of the 't Hooft coupling. In particular, we compute both the O(1) corrections quantifying fluctuations from typicality in the ensemble of networks, and the subleading O(T/N) corrections due to finite-width effects. These provide corrections to the correlation length that controls the depth to which information can propagate through the network, and thereby sets the scale at which such networks are trainable by gradient descent. Our analysis provides a first-principles approach to the rapidly emerging NN-QFT correspondence, and opens several interesting avenues to the study of criticality in deep neural networks.

Place, publisher, year, edition, pages
Stichting SciPost, 2022. Vol. 12, no 3, article id 081
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-311517
DOI: 10.21468/SciPostPhys.12.3.081
ISI: 000782238100009
Scopus ID: 2-s2.0-85127649952
OAI: oai:DiVA.org:kth-311517
DiVA, id: diva2:1655674
Note

QC 20220503

Available from: 2022-05-03. Created: 2022-05-03. Last updated: 2022-06-25. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
