Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). (Computational Brain Science Lab) ORCID iD: 0000-0003-0011-6444
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). (Computational Brain Science Lab) ORCID iD: 0000-0002-9081-2170
2021 (English). In: ICPR 2020: International Conference on Pattern Recognition, Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 1181-1188. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to handle large scale variations is crucial for many real-world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We therefore present a theoretical analysis of the invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, even when trained on single-scale training data, and also give improvements in the small-sample regime.
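The foveated scale channel idea described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation. The helper names (`make_square`, `center_crop`, `downsample`, `foveated_channels`, `shared_net`, `scale_channel_score`), the 8x8 base resolution, the four channels, and the template-matching scorer that stands in for a trained CNN are all assumptions made for illustration. Each channel crops a central window twice as wide as the previous one and block-averages it back to the same resolution. The same scorer (shared weights) is applied to every channel, and the channel outputs are pooled.

```python
def make_square(n, side):
    """n x n image (list of rows) with a centred square of ones of the given side."""
    lo = (n - side) // 2
    hi = lo + side
    return [[1.0 if lo <= r < hi and lo <= c < hi else 0.0 for c in range(n)]
            for r in range(n)]


def center_crop(img, size):
    """Central size x size window of a square image."""
    start = (len(img) - size) // 2
    return [row[start:start + size] for row in img[start:start + size]]


def downsample(img, f):
    """Block-mean downsampling by an integer factor f."""
    n = len(img) // f
    return [[sum(img[r * f + i][c * f + j] for i in range(f) for j in range(f)) / (f * f)
             for c in range(n)]
            for r in range(n)]


def foveated_channels(img, base=8, n_channels=4):
    """Channel k sees a central window of side base * 2**k, reduced to base x base."""
    return [downsample(center_crop(img, base * 2 ** k), 2 ** k)
            for k in range(n_channels)]


# Hypothetical stand-in for a network trained at a single scale:
# it scores a patch by its negative squared distance to one fixed template.
TEMPLATE = make_square(8, 4)


def shared_net(patch):
    total = sum((p - t) ** 2
                for prow, trow in zip(patch, TEMPLATE)
                for p, t in zip(prow, trow))
    return 0.0 - total


def scale_channel_score(img, pool="max"):
    """Apply the shared scorer to every channel, then pool over channels."""
    scores = [shared_net(ch) for ch in foveated_channels(img)]
    return max(scores) if pool == "max" else sum(scores) / len(scores)


small = make_square(64, 4)  # object at the "training" scale
big = make_square(64, 8)    # same object at twice the size

# Covariance: doubling the object size shifts the channel contents by one channel.
print(foveated_channels(big)[1:] == foveated_channels(small)[:-1])  # prints True

# Max pooling over channels gives the same response at both scales.
print(scale_channel_score(small, "max"), scale_channel_score(big, "max"))  # prints 0.0 0.0
```

In this toy setting the doubled object appears in channel k+1 exactly as the original object appears in channel k, so max pooling over the shared scorer's outputs returns the same value at both scales. The channels at either end of the range have no counterpart, which is one reason a finite set of channels gives invariance only over a limited scale range.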

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021. p. 1181-1188
Keywords [en]
deep learning, convolutional neural networks, invariant neural networks, scale invariance
National Category
Computer graphics and computer vision
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-288539
DOI: 10.1109/ICPR48806.2021.9413276
ISI: 000678409201038
Scopus ID: 2-s2.0-85103171938
OAI: oai:DiVA.org:kth-288539
DiVA, id: diva2:1515273
Conference
ICPR 2020: 25th International Conference on Pattern Recognition, Milan, Italy, January 10-15, 2021
Funder
Swedish Research Council, 2018-03586
Note

Part of proceedings: ISBN 978-1-7281-8808-9, Not duplicate with diva 1423788, QC 20220517

Available from: 2021-01-08. Created: 2021-01-08. Last updated: 2025-02-07. Bibliographically approved

Open Access in DiVA

File name: FULLTEXT01.pdf
File size: 219 kB
Type: fulltext
Mimetype: application/pdf
Checksum (SHA-512): 01cccfc8a023e657b8b5d506e5c5d0ae703e48fff51d8a3345c516d4742f2c07afac9401d839836524ac3368a80d11eb06b03a958e414567965234b0df813e0c

Other links

Publisher's full text; Scopus; arXiv:2004.01536 (extended version); ICPR 2020 home page

Authority records

Jansson, Ylva; Lindeberg, Tony
