Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). (Computational Brain Science Lab) ORCID iD: 0000-0003-0011-6444
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). (Computational Brain Science Lab) ORCID iD: 0000-0002-9081-2170
2021 (English). In: ICPR 2020: International Conference on Pattern Recognition, Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 1181-1188. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to handle large scale variations is crucial for many real-world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We therefore present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, even when trained on single-scale training data, and also give improvements in the small-sample regime.
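The mechanism described in the abstract (shared weights across scale channels, followed by max or average pooling over the channel outputs) can be illustrated with a small NumPy sketch. This is not the authors' FovMax or FovAvg implementation: the function names, the single-filter "network", the block-average downsampling and the default sizes are all assumptions chosen to keep the example short.

```python
import numpy as np

def foveated_crop(image, size, factor):
    """Central crop of side size*factor, block-averaged down to side `size`.

    Larger factors see a larger part of the image at lower resolution
    (an assumed, simplified stand-in for a foveated scale channel).
    """
    h, w = image.shape
    side = size * factor
    top, left = (h - side) // 2, (w - side) // 2
    crop = image[top:top + side, left:left + side]
    return crop.reshape(size, factor, size, factor).mean(axis=(1, 3))

def shared_channel(x, kernel, readout):
    """One scale channel: valid 2-D correlation, ReLU, global max, linear readout.

    The same `kernel` and `readout` are used in every channel (weight sharing).
    """
    k = kernel.shape[0]
    windows = np.lib.stride_tricks.sliding_window_view(x, (k, k))
    feature_map = np.maximum(np.einsum("ijkl,kl->ij", windows, kernel), 0.0)
    return readout * feature_map.max()

def scale_channel_network(image, kernel, readout, size=16,
                          factors=(1, 2, 4), pool="max"):
    """Run all scale channels with shared weights and pool over their outputs."""
    outputs = np.stack([
        shared_channel(foveated_crop(image, size, f), kernel, readout)
        for f in factors
    ])
    return outputs.max(axis=0) if pool == "max" else outputs.mean(axis=0)

# Hypothetical usage with random data and random (untrained) weights.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
kernel = rng.standard_normal((3, 3))
readout = rng.standard_normal(10)   # 10 output classes, an arbitrary choice
logits = scale_channel_network(image, kernel, readout, pool="max")
```

In this toy setting, enlarging the image by a factor of 2 with pixel replication makes the channel with factor 2 see what the channel with factor 1 saw before. Rescaling the input therefore shifts the response between channels, which is the property that pooling over channels relies on.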

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021. p. 1181-1188
Keywords [en]
deep learning, convolutional neural networks, invariant neural networks, scale invariance
National Category
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-288539
DOI: 10.1109/ICPR48806.2021.9413276
ISI: 000678409201038
Scopus ID: 2-s2.0-85103171938
OAI: oai:DiVA.org:kth-288539
DiVA, id: diva2:1515273
Conference
ICPR 2020: 25th International Conference on Pattern Recognition, Milan, Italy, January 10-15, 2021
Funder
Swedish Research Council, 2018-03586
Note

Part of proceedings: ISBN 978-1-7281-8808-9, Not duplicate with diva 1423788, QC 20220517

Available from: 2021-01-08 Created: 2021-01-08 Last updated: 2025-02-07. Bibliographically approved

Open Access in DiVA

fulltext (219 kB), 427 downloads
File information
File name: FULLTEXT01.pdf. File size: 219 kB. Checksum: SHA-512
01cccfc8a023e657b8b5d506e5c5d0ae703e48fff51d8a3345c516d4742f2c07afac9401d839836524ac3368a80d11eb06b03a958e414567965234b0df813e0c
Type: fulltext. Mimetype: application/pdf
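
The SHA-512 checksum listed in the file information can be used to check that a downloaded copy of FULLTEXT01.pdf is intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the assumption that the file has been saved locally are illustrative and not part of the record.

```python
import hashlib

# Checksum copied from the file information in this record.
EXPECTED_SHA512 = (
    "01cccfc8a023e657b8b5d506e5c5d0ae703e48fff51d8a3345c516d4742f2c07"
    "afac9401d839836524ac3368a80d11eb06b03a958e414567965234b0df813e0c"
)

def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_record(path, expected=EXPECTED_SHA512):
    """True if the file at `path` has the expected SHA-512 checksum."""
    return sha512_of_file(path) == expected.lower()
```

For a locally saved copy, `matches_record("FULLTEXT01.pdf")` returns `True` only if the file is byte-for-byte identical to the one the checksum was computed from.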

Other links

Publisher's full text; Scopus; arXiv:2004.01536 extended version; ICPR 2020 home page

Person

Jansson, Ylva; Lindeberg, Tony

Total: 428 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 2160 hits