Geometry of Linear Convolutional Networks
Kohn, Kathlén (KTH, School of Engineering Sciences (SCI), Mathematics (Dept.)). ORCID iD: 0000-0002-4627-8812
Merkh, Thomas (UCLA, Dept. of Mathematics, Los Angeles, CA 90095, USA)
Montúfar, Guido (UCLA, Dept. of Mathematics and Dept. of Statistics, Los Angeles, CA 90095, USA; Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany)
Trager, Matthew (Amazon, New York, NY 10001, USA)
2022 (English). In: SIAM Journal on Applied Algebra and Geometry, ISSN 2470-6566, Vol. 6, no 3, p. 368-406. Article in journal (Refereed), Published.
Abstract [en]

We study the family of functions that are represented by a linear convolutional network (LCN). These functions form a semi-algebraic subset of the set of linear maps from input space to output space. In contrast, the families of functions represented by fully connected linear networks form algebraic sets. We observe that the functions represented by LCNs can be identified with polynomials that admit certain factorizations, and we use this perspective to describe the impact of the network's architecture on the geometry of the resulting function space. We further study the optimization of an objective function over an LCN, analyzing critical points in function space and in parameter space and describing dynamical invariants for gradient descent. Overall, our theory predicts that the optimized parameters of an LCN will often correspond to repeated filters across layers, or filters that can be decomposed as repeated filters. We also conduct numerical and symbolic experiments that illustrate our results and present an in-depth analysis of the landscape for small architectures.
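
To make the polynomial picture concrete, here is a minimal numerical sketch (not code from the paper): stacking 1D convolutional layers with no activations composes into a single convolution whose end-to-end filter is the convolution of the layer filters, i.e. the product of the corresponding filter polynomials. The filter sizes, the full (zero-padded) boundary convention, and the use of NumPy's np.convolve are illustrative assumptions, not the paper's setup.

    import numpy as np

    rng = np.random.default_rng(0)

    # Layer filters: w1 has 3 taps (a degree-2 polynomial), w2 has 2 taps (degree-1).
    w1 = rng.standard_normal(3)
    w2 = rng.standard_normal(2)

    x = rng.standard_normal(10)  # input signal

    # Applying the two layers in sequence (full convolution, zero padding)...
    deep = np.convolve(np.convolve(x, w1), w2)

    # ...equals a single "shallow" convolution whose filter is w1 * w2, where *
    # is polynomial multiplication (= convolution of the coefficient vectors).
    w = np.convolve(w1, w2)
    shallow = np.convolve(x, w)

    assert np.allclose(deep, shallow)

    # The end-to-end filters an LCN can realize are thus the polynomials that
    # factor into real polynomials of the layers' degrees. Over the reals this
    # is a genuine restriction (e.g. x^2 + 1 admits no real linear factors),
    # which is why the function space is semi-algebraic rather than algebraic.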

Place, publisher, year, edition, pages
Society for Industrial & Applied Mathematics (SIAM), 2022. Vol. 6, no 3, p. 368-406.
Keywords [en]
function space description of neural networks, linear network, Toeplitz matrix, circulant matrix, algebraic statistics, Euclidean distance degree, semi-algebraic set, gradient flow, discriminant, critical points
National Category
Computer graphics and computer vision; Computer Sciences; Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-317033
DOI: 10.1137/21M1441183
ISI: 000838964100002
Scopus ID: 2-s2.0-85139568287
OAI: oai:DiVA.org:kth-317033
DiVA, id: diva2:1693218
Note

QC 20220906

Available from: 2022-09-06. Created: 2022-09-06. Last updated: 2025-02-01. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Kohn, Kathlén
