Towards quantifying information flows: Relative entropy in deep neural networks and the renormalization group
Julius Maximilians Univ Wurzburg, Inst Theoret Phys & Astrophys, D-97074 Wurzburg, Germany; Julius Maximilians Univ Wurzburg, Wurzburg Dresden Cluster Excellence Ct Qmat, D-97074 Wurzburg, Germany.
Max Planck Inst Phys Komplexer Syst, Nothnitzer Str 38, D-01187 Dresden, Germany; Wurzburg Dresden Cluster Excellence Ct Qmat, Nothnitzer Str 38, D-01187 Dresden, Germany.
Nordita SU.
2022 (English). In: SciPost Physics, E-ISSN 2542-4653, Vol. 12, no 1, article id 041. Article in journal (Refereed), Published.
Abstract [en]

We investigate the analogy between the renormalization group (RG) and deep neural networks, wherein subsequent layers of neurons are analogous to successive steps along the RG. In particular, we quantify the flow of information by explicitly computing the relative entropy or Kullback-Leibler divergence in both the one- and two-dimensional Ising models under decimation RG, as well as in a feedforward neural network as a function of depth. We observe qualitatively identical behavior characterized by the monotonic increase to a parameter-dependent asymptotic value. On the quantum field theory side, the monotonic increase confirms the connection between the relative entropy and the c-theorem. For the neural networks, the asymptotic behavior may have implications for various information maximization methods in machine learning, as well as for disentangling compactness and generalizability. Furthermore, while both the two-dimensional Ising model and the random neural networks we consider exhibit non-trivial critical points, the relative entropy appears insensitive to the phase structure of either system. In this sense, more refined probes are required in order to fully elucidate the flow of information in these models.
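
As a rough illustration of the kind of quantity the abstract describes, the sketch below tracks a Kullback-Leibler divergence along the decimation flow of the zero-field one-dimensional Ising chain. It relies on the standard decimation relation tanh K' = tanh^2 K. Everything else is an assumption made for illustration and is not taken from the paper: the fixed small periodic chain, the choice to compare the initial Boltzmann distribution with the distribution at the flowed coupling on that same chain, and the function names.

```python
import itertools
import math


def decimate(K):
    """One decimation step of the zero-field 1D Ising chain: tanh K' = tanh^2 K."""
    return 0.5 * math.log(math.cosh(2.0 * K))


def boltzmann(K, n_spins):
    """Exact Boltzmann distribution of a periodic chain (K = J / k_B T)."""
    weights = []
    for s in itertools.product((-1, 1), repeat=n_spins):
        bond_sum = sum(s[i] * s[(i + 1) % n_spins] for i in range(n_spins))
        weights.append(math.exp(K * bond_sum))
    Z = sum(weights)
    return [w / Z for w in weights]


def kl_divergence(p, q):
    """Relative entropy D(p || q) in nats for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def relative_entropy_flow(K0, n_spins=6, steps=5):
    """D(p_{K0} || p_{K_n}) for n = 1..steps, with K_n the n-times decimated coupling.

    Assumption: both distributions live on the same small chain; only the
    coupling is flowed. This is an illustrative choice, not the paper's setup.
    """
    p0 = boltzmann(K0, n_spins)
    K = K0
    flow = []
    for _ in range(steps):
        K = decimate(K)
        flow.append(kl_divergence(p0, boltzmann(K, n_spins)))
    return flow


if __name__ == "__main__":
    for n, d in enumerate(relative_entropy_flow(1.0), start=1):
        print(n, d)
```

In this toy setting the coupling flows towards zero, so the flowed distribution approaches the uniform one and the divergence rises towards D(p_{K0} || uniform), a value that depends on the initial coupling K0.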

Place, publisher, year, edition, pages
Stichting SciPost, 2022. Vol. 12, no 1, article id 041
National Category
Computer Sciences; Information Systems
Identifiers
URN: urn:nbn:se:kth:diva-314890
DOI: 10.21468/SciPostPhys.12.1.041
ISI: 000807448000036
Scopus ID: 2-s2.0-85124977025
OAI: oai:DiVA.org:kth-314890
DiVA, id: diva2:1676756
Note

QC 20220627

Available from: 2022-06-27. Created: 2022-06-27. Last updated: 2022-07-04. Bibliographically approved.
