Lindeberg, Tony, Professor
ORCID iD: orcid.org/0000-0002-9081-2170
Biography [eng]

Tony Lindeberg is a Professor of Computer Science at KTH Royal Institute of Technology in Stockholm, Sweden. He received his MSc degree in 1987, his PhD degree in 1991, became docent in 1996, and was appointed professor in 2000. He was a Research Fellow at the Royal Swedish Academy of Sciences between 2000 and 2010.

His research interests in computer vision relate to scale-space representation, image features, object recognition, video analysis and computational modelling of biological vision. He has developed theories and methodologies for continuous and discrete scale-space representation, visual and auditory receptive fields, hierarchical and deep networks, detection of salient image structures, automatic scale selection, scale-covariant and scale-invariant image features, affine-covariant and affine-invariant features, affine and Galilean normalization, temporal, spatio-temporal and spectro-temporal scale-space concepts as well as spatial and spatio-temporal image descriptors for image-based recognition.

He also works on computational modelling of hearing and has previously worked on topics in medical image analysis, brain activation and gesture recognition. He is the author of the book Scale-Space Theory in Computer Vision.

Biography [swe]

Tony Lindeberg is a Professor of Computer Science at KTH Royal Institute of Technology in Stockholm. He received his MSc degree in Engineering Physics in 1987 and his PhD degree in Computer Science in 1991, became docent in 1996, and was appointed professor in 2000. He was a Research Fellow at the Royal Swedish Academy of Sciences between 2000 and 2010.

His research interests in computer vision include scale-space representation, feature detection, object recognition, video analysis and computational modelling of biological vision. He has developed theories and methodologies for continuous and discrete scale-space representations, visual and auditory receptive fields, hierarchical and deep networks, detection of salient features, automatic scale selection, scale-covariant and scale-invariant features, affine-covariant and affine-invariant features, affine and Galilean normalization, temporal, spatio-temporal and spectro-temporal scale-space concepts, as well as spatial and spatio-temporal image descriptors for image-based recognition.

He also works on computational modelling of hearing and has previously worked in medical image analysis, brain activation analysis and gesture recognition. He is the author of the book "Scale-Space Theory in Computer Vision".

Publications (10 of 159)
Lindeberg, T. (2023). A time-causal and time-recursive analogue of the Gabor transform.
2023 (English). Report (Other academic)
Abstract [en]

This paper presents a time-causal analogue of the Gabor filter, as well as a both time-causal and time-recursive analogue of the Gabor transform, where the proposed time-causal representations obey both temporal scale covariance and a cascade property with a simplifying kernel over temporal scales. The motivation behind these constructions is to enable theoretically well-founded time-frequency analysis over multiple temporal scales for real-time situations, or for physical or biological modelling situations, when the future cannot be accessed, and the non-causal access to future in Gabor filtering is therefore not viable for a time-frequency analysis of the system.

We develop the theory for these representations, obtained by replacing the Gaussian kernel in Gabor filtering with a time-causal kernel, referred to as the time-causal limit kernel, which guarantees simplification properties from finer to coarser levels of scales in a time-causal situation, similar to what the Gaussian kernel can be shown to guarantee over a non-causal temporal domain. In these ways, the proposed time-frequency representations guarantee well-founded treatment over multiple scales, in situations when the characteristic scales in the signals, or physical or biological phenomena, to be analyzed may vary substantially, and additionally all steps in the time-frequency analysis have to be fully time-causal.

Publisher
p. 9
Keywords
time-frequency analysis, Gabor filter, Gabor transform, Time-causal, time-recursive, temporal scale, scale covariance, harmonic analysis, signal processing
National Category
Signal Processing Mathematical Analysis
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-334893 (URN); 10.48550/arXiv.2308.14512 (DOI)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2022-02969
Note

QC 20231123

Available from: 2023-08-29 Created: 2023-08-29 Last updated: 2024-03-18 Bibliographically approved
Lindeberg, T. (2023). A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time. Biological Cybernetics, 117(1-2), 21-59
2023 (English). In: Biological Cybernetics, ISSN 0340-1200, E-ISSN 1432-0770, Vol. 117, no 1-2, p. 21-59. Article in journal (Refereed), Published
Abstract [en]

This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves.

For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past.

We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner.

We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time.

We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent non-zero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.
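
The discrete implementation mentioned in the abstract above, a set of recursive filters coupled in cascade, can be illustrated with a minimal sketch. It is not the author's code. The function names, the geometric distribution of the temporal scale levels and the parameter values are assumptions made for illustration. Each first-order recursive filter with time constant mu has a causal impulse response with mean mu and variance mu^2 + mu, so the time constants are chosen such that the accumulated variance reaches each assumed scale level.

```python
import math

def scale_levels(tau_max, c, K):
    # Assumed choice: K temporal scale levels (variances) in a geometric
    # series with ratio c^2, the coarsest level being tau_max.
    return [tau_max * c ** (2 * (k - K)) for k in range(1, K + 1)]

def time_constants(levels):
    # A discrete first-order recursive filter with time constant mu adds
    # variance mu^2 + mu, so solve mu^2 + mu = delta_tau for each step.
    mus, prev = [], 0.0
    for tau in levels:
        delta = tau - prev
        mus.append((math.sqrt(1.0 + 4.0 * delta) - 1.0) / 2.0)
        prev = tau
    return mus

def smooth_cascade(signal, mus):
    # state[k] is the current output of filter k. The update uses only the
    # new sample and the previous state (time-causal and time-recursive).
    state = [0.0] * len(mus)
    outputs = []
    for x in signal:
        inp = x
        for k, mu in enumerate(mus):
            state[k] += (inp - state[k]) / (1.0 + mu)
            inp = state[k]
        outputs.append(list(state))  # one smoothed value per scale level
    return outputs
```

Feeding a unit impulse through `smooth_cascade` returns, for each scale level, a causal kernel that sums to one and whose variance equals the corresponding entry of `scale_levels`. The only memory carried between time steps is the list `state`, which holds the multi-scale representation itself.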

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Time, Temporal, Scale, Time-causal, Time-recursive, Scale covariance, Scale space, Wavelet analysis, Time-frequency analysis, Signal, The present, Delay, Memory, Perceptual agent, Theoretical neuroscience, Theoretical biology
National Category
Bioinformatics (Computational Biology) Mathematics Signal Processing
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-322099 (URN); 10.1007/s00422-022-00953-6 (DOI); 000917358000001 (); 36689001 (PubMedID); 2-s2.0-85146732894 (Scopus ID)
Projects
Scale-space theory for covariant and invariant visual perception
Funder
Swedish Research Council, 2018-0358
Note

Not duplicate with DiVA which is a report

QC 20221202

Available from: 2022-12-01 Created: 2022-12-01 Last updated: 2023-05-09 Bibliographically approved
Lindeberg, T. (2023). Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields.
2023 (English). Report (Other academic)
Abstract [en]

The property of covariance, also referred to as equivariance, means that an image operator is well-behaved under image transformations, in the sense that the result of applying the image operator to a transformed input image gives essentially a similar result as applying the same image transformation to the output of applying the image operator to the original image. This paper presents a theory of geometric covariance properties in vision, developed for a generalized Gaussian derivative model of receptive fields in the primary visual cortex and the lateral geniculate nucleus, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy.

It is shown how the studied generalized Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. These covariance properties imply that a vision system, based on image and video measurements in terms of the receptive fields according to the generalized Gaussian derivative model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer.

We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations. Specifically, we formulate experimentally testable biological hypotheses as well as needs for measuring population statistics of receptive field characteristics, originating from predictions from the presented theory, concerning the extent to which the shapes of the biological receptive fields in the primary visual cortex span the variabilities of spatial and spatio-temporal image structures induced by natural image transformations, based on geometric covariance properties.
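
As a concrete instance of the covariance property defined in the abstract above, the standard scale covariance relation for Gaussian smoothing can be written out as follows. The notation (f, L, g, S, s) is assumed here for illustration and is not taken from the abstract. It covers only the spatial scaling case, not the affine, Galilean or temporal cases.

```latex
% Spatial scaling of the image domain by a factor S > 0
f'(x') = f(x), \qquad x' = S\,x

% Gaussian scale-space representations; s denotes the variance of the kernel g
L(\cdot;\, s) = g(\cdot;\, s) * f, \qquad L'(\cdot;\, s') = g(\cdot;\, s') * f'

% Scale covariance: smoothing the rescaled image at the matching scale
% gives the rescaled version of the smoothed original image
L'(x';\, s') = L(x;\, s) \quad \text{for} \quad s' = S^2 s
```

In words, smoothing commutes with rescaling provided that the scale parameter is transformed along with the image coordinates.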

Publisher
p. 38
Keywords
receptive field, image transformations, scale covariance, affine covariance, Galilean covariance, primary visual cortex, lateral geniculate nucleus, vision, theoretical neuroscience, theoretical biology
National Category
Bioinformatics (Computational Biology) Neurosciences Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-324883 (URN)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2022-02969
Note

QC 20230328

Available from: 2023-03-20 Created: 2023-03-20 Last updated: 2023-11-22 Bibliographically approved
Lindeberg, T. (2023). Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields. Frontiers in Computational Neuroscience, 17, 1189949-1-1189949-23
2023 (English). In: Frontiers in Computational Neuroscience, E-ISSN 1662-5188, Vol. 17, p. 1189949-1-1189949-23. Article in journal (Refereed), Published
Abstract [en]

The property of covariance, also referred to as equivariance, means that an image operator is well-behaved under image transformations, in the sense that the result of applying the image operator to a transformed input image gives essentially a similar result as applying the same image transformation to the output of applying the image operator to the original image. This paper presents a theory of geometric covariance properties in vision, developed for a generalized Gaussian derivative model of receptive fields in the primary visual cortex and the lateral geniculate nucleus, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy.

It is shown how the studied generalized Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. These covariance properties imply that a vision system, based on image and video measurements in terms of the receptive fields according to the generalized Gaussian derivative model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer.

We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations. Specifically, we formulate experimentally testable biological hypotheses as well as needs for measuring population statistics of receptive field characteristics, originating from predictions from the presented theory, concerning the extent to which the shapes of the biological receptive fields in the primary visual cortex span the variabilities of spatial and spatio-temporal image structures induced by natural image transformations, based on geometric covariance properties.

Place, publisher, year, edition, pages
Frontiers Media SA, 2023
Keywords
receptive field, image transformations, scale covariance, affine covariance, Galilean covariance, primary visual cortex, vision, theoretical neuroscience
National Category
Bioinformatics (Computational Biology) Neurosciences Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-327330 (URN); 10.3389/fncom.2023.1189949 (DOI); 001016501200001 (); 37398936 (PubMedID)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2018-03586, 2022-02969
Note

Not duplicate with DiVA 1744469 which is a report.

QC 20230529

Available from: 2023-05-24 Created: 2023-05-24 Last updated: 2024-01-17 Bibliographically approved
Lindeberg, T. (2023). Discrete approximations of Gaussian smoothing and Gaussian derivatives.
2023 (English). Report (Other academic)
Abstract [en]

This paper develops an in-depth treatment concerning the problem of approximating the Gaussian smoothing and Gaussian derivative computations in scale-space theory for application on discrete data. With close connections to previous axiomatic treatments of continuous and discrete scale-space theory, we consider three main ways of discretizing these scale-space operations in terms of explicit discrete convolutions, based on either (i) sampling the Gaussian kernels and the Gaussian derivative kernels, (ii) locally integrating the Gaussian kernels and the Gaussian derivative kernels over each pixel support region, to aim at suppressing some of the severe artefacts of sampled Gaussian kernels and sampled Gaussian derivatives at very fine scales, and (iii) basing the scale-space analysis on the discrete analogue of the Gaussian kernel, and then computing derivative approximations by applying small-support central difference operators to the spatially smoothed image data.

We study the properties of these three main discretization methods both theoretically and experimentally, and characterize their performance by quantitative measures, including the results they give rise to with respect to the task of scale selection, investigated for four different use cases, and with emphasis on the behaviour at fine scales. The results show that the sampled Gaussian kernels and the sampled Gaussian derivatives as well as the integrated Gaussian kernels and the integrated Gaussian derivatives perform very poorly at very fine scales. At very fine scales, the discrete analogue of the Gaussian kernel with its corresponding discrete derivative approximations performs substantially better. The sampled Gaussian kernel and the sampled Gaussian derivatives do, on the other hand, lead to numerically very good approximations of the corresponding continuous results when the scale parameter is sufficiently large; in the experiments presented in the paper, this holds when the scale parameter is greater than a value of about 1, in units of the grid spacing. Below a standard deviation of about 0.75, the derivative estimates obtained from convolutions with the sampled Gaussian derivative kernels are, however, not numerically accurate or consistent, while the discrete analogue of the Gaussian kernel, with its associated central difference operators applied to the spatially smoothed image data, is then a much better choice.
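
The three smoothing kernels compared in the abstract above can be sketched in a few lines, to show why they behave differently at fine scales. This is an illustrative sketch, not the code used in the paper. The function names, the truncation radius and the power-series evaluation of the modified Bessel function are assumptions. The discrete analogue of the Gaussian kernel is taken here as T(n; t) = exp(-t) I_n(t) with t = sigma^2, where I_n is the modified Bessel function of integer order.

```python
import math

def sampled_gaussian(sigma, radius):
    # (i) Sample the continuous Gaussian at the integer grid points.
    norm = math.sqrt(2.0 * math.pi) * sigma
    return [math.exp(-n * n / (2.0 * sigma ** 2)) / norm
            for n in range(-radius, radius + 1)]

def integrated_gaussian(sigma, radius):
    # (ii) Integrate the continuous Gaussian over each pixel support
    # region [n - 1/2, n + 1/2], using the Gaussian distribution function.
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    return [cdf(n + 0.5) - cdf(n - 0.5) for n in range(-radius, radius + 1)]

def bessel_i(n, t, terms=40):
    # Modified Bessel function of integer order, by its power series.
    # Adequate for the moderate values of t used in this sketch.
    n = abs(n)
    return sum((t / 2.0) ** (2 * m + n)
               / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))

def discrete_gaussian(sigma, radius):
    # (iii) Discrete analogue of the Gaussian kernel, with t = sigma^2.
    t = sigma ** 2
    return [math.exp(-t) * bessel_i(n, t) for n in range(-radius, radius + 1)]

def second_moment(kernel, radius):
    # Second moment around the origin (all three kernels are symmetric).
    return sum(n * n * v for n, v in zip(range(-radius, radius + 1), kernel))
```

At a fine scale such as sigma = 0.3, the sampled kernel no longer sums to one, whereas the discrete analogue both sums to one and has second moment equal to sigma^2. Derivative approximations in the third approach would then be obtained by applying central differences, for example the first-order operator with coefficients (-1/2, 0, 1/2), to the smoothed data.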

Publisher
p. 38
Keywords
Discrete, Continuous, Gaussian kernel, Gaussian derivative, Directional derivative, Scale-normalized derivative, Steerable filter, Filter bank, Scale-space properties, Scale space
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-339838 (URN)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2022-02969
Note

QC 20231121

Available from: 2023-11-21 Created: 2023-11-21 Last updated: 2024-03-18 Bibliographically approved
Lindeberg, T. (2023). Joint covariance property under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields.
2023 (English). Report (Other academic)
Abstract [en]

The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations and for formulating invariant visual operations at higher levels. This paper defines and proves a joint covariance property under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations, which makes it possible to characterize how different types of image transformations interact with each other. Specifically, the derived relations show how the receptive field parameters need to be transformed, in order to match the output from spatio-temporal receptive fields with the underlying spatio-temporal image transformations.

Publisher
p. 7
Keywords
Covariance, Receptive field, Scaling, Affine, Galilean, Spatial, Temporal, Spatio-temporal, Image transformations, Vision
National Category
Computer Vision and Robotics (Autonomous Systems) Bioinformatics (Computational Biology)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-339789 (URN)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2018-03586, 2022-02969
Note

QC 20231120

Available from: 2023-11-20 Created: 2023-11-20 Last updated: 2023-11-27 Bibliographically approved
Lindeberg, T. (2023). Orientation selectivity of affine Gaussian derivative based receptive fields.
2023 (English). Report (Other academic)
Abstract [en]

This paper presents a theoretical analysis of the orientation selectivity of simple and complex cells that can be well modelled by the generalized Gaussian derivative model for visual receptive fields, with the purely spatial component of the receptive fields determined by oriented affine Gaussian derivatives for different orders of spatial differentiation.

A detailed mathematical analysis is presented for the three different cases of either: (i) purely spatial receptive fields, (ii) space-time separable spatio-temporal receptive fields and (iii) velocity-adapted spatio-temporal receptive fields. Closed-form theoretical expressions for the orientation selectivity curves for idealized models of simple and complex cells are derived for all these main cases, and it is shown that the degree of orientation selectivity of the receptive fields increases with a scale parameter ratio $\kappa$, defined as the ratio between the scale parameters in the directions perpendicular to vs. parallel with the preferred orientation of the receptive field. It is also shown that the degree of orientation selectivity increases with the order of spatial differentiation in the underlying affine Gaussian derivative operators over the spatial domain.

We describe biological implications of the derived theoretical results, demonstrating that the predictions from the presented theory are consistent with previously established biological results concerning broad vs. sharp orientation tuning of visual neurons in the primary visual cortex. We also demonstrate that the above theoretical predictions, in combination with these biological results, are consistent with a previously formulated biological hypothesis, stating that the biological receptive field shapes should span the degrees of freedom in affine image transformations, to support affine covariance over the population of receptive fields in the primary visual cortex.

Based on the results from the theoretical analysis in the paper, combined with existing results for biological experiments, we formulate a set of testable predictions that could be used to, with neurophysiological experiments, judge if the receptive fields in the primary visual cortex of higher mammals could be regarded as spanning a variability over the eccentricity or the elongation of the receptive fields, and, if so, then also characterize if such a variability would, in a structured way, be related to the pinwheel structure in the visual cortex.

For comparison, we also present a corresponding theoretical orientation selectivity analysis for purely spatial receptive fields according to an affine Gabor model. The results from that analysis are consistent with the results obtained from the affine Gaussian derivative model, in the respect that the orientation selectivity becomes more narrow when making the receptive fields wider in the direction perpendicular to the preferred orientation of the receptive field. The affine Gabor model does, however, comprise one more degree of freedom in its parameter space, compared to the affine Gaussian derivative model, where a variability within that additional dimension of the parameter space does also strongly influence the orientation selectivity of the receptive fields. In this respect, the affine Gaussian derivative model leads to more specific predictions concerning relationships between the orientation selectivity and the elongation of the receptive fields, compared to the affine Gabor model.

Publisher
p. 21
Keywords
Receptive field, Orientation selectivity, Affine covariance, Gaussian derivative, Quasi quadrature, Simple cell, Complex cell, Vision, Theoretical neuroscience
National Category
Bioinformatics (Computational Biology)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-326136 (URN)
Projects
Covariant and invariant deep networks
Funder
Swedish Research Council, 2022-02969
Note

QC 20230425

Available from: 2023-04-25 Created: 2023-04-25 Last updated: 2023-12-11 Bibliographically approved
Lindeberg, T. (2022). A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time.
2022 (English). Report (Other academic)
Abstract [en]

This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves.

For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past.

We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner.

We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time.

We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent non-zero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.

Publisher
p. 22
Keywords
Time, Temporal, Scale, Time-causal, Time-recursive, Scale covariance, Scale space, Wavelet analysis, Time-frequency analysis, Signal, The present, Delay, Memory, Perceptual agent, Theoretical neuroscience, Theoretical biology
National Category
Bioinformatics (Computational Biology) Mathematics Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-309081 (URN)
Projects
Scale-space theory for covariant and invariant visual perception
Funder
Swedish Research Council, 2018-03586
Note

QC 20220926

Available from: 2022-02-21 Created: 2022-02-21 Last updated: 2023-01-25 Bibliographically approved
Maki, A., Kragic, D., Kjellström, H., Azizpour, H., Sullivan, J., Björkman, M., . . . Sundblad, Y. (2022). In Memoriam: Jan-Olof Eklundh. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(9), 4488-4489
2022 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 44, no 9, p. 4488-4489. Article in journal (Refereed), Published
Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2022
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-316696 (URN); 10.1109/TPAMI.2022.3183266 (DOI); 000836666600005 ()
Note

QC 20220905

Available from: 2022-09-05 Created: 2022-09-05 Last updated: 2022-09-05 Bibliographically approved
Lindeberg, T. (2022). Scale-covariant and scale-invariant Gaussian derivative networks. Journal of Mathematical Imaging and Vision, 64(3), 223-242
2022 (English). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 64, no 3, p. 223-242. Article in journal (Refereed), Published
Abstract [en]

This paper presents a hybrid approach between scale-space theory and deep learning, where a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade. By sharing the learnt parameters between multiple scale channels, and by using the transformation properties of the scale-space primitives under scaling transformations, the resulting network becomes provably scale covariant. By in addition performing max pooling over the multiple scale channels, or other permutation-invariant pooling over scales, a resulting network architecture for image classification also becomes provably scale invariant.

We investigate the performance of such networks on the MNIST Large Scale dataset, which contains rescaled images from the original MNIST dataset over a factor of 4 concerning training data and over a factor of 16 concerning testing data. It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not spanned by the training data.
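
The mechanism described in the abstract above, scale covariance across scale channels followed by max pooling over scales, can be illustrated with a one-dimensional toy example. It is not the network from the paper. It uses the closed-form amplitude of a scale-normalized first-order Gaussian derivative response to a sine wave with angular frequency omega, which is sigma * omega * exp(-(sigma * omega)^2 / 2). The function names, the channel spacing and the choice of test signal are assumptions made for illustration.

```python
import math

def response_amplitude(sigma, omega):
    # Amplitude of the scale-normalized first-order Gaussian derivative
    # response to sin(omega * x): smoothing attenuates the amplitude by
    # exp(-(sigma * omega)^2 / 2), differentiation multiplies it by omega,
    # and scale normalization multiplies it by sigma.
    p = sigma * omega
    return p * math.exp(-p * p / 2.0)

def scale_channels(omega, sigma0=1.0, ratio=math.sqrt(2.0), n=16):
    # One response per scale channel, with geometrically spaced scale
    # levels and the same operator (shared parameters) in every channel.
    return [response_amplitude(sigma0 * ratio ** i, omega) for i in range(n)]

def pooled(omega, **kwargs):
    # Max pooling over the scale channels.
    return max(scale_channels(omega, **kwargs))
```

Rescaling the input by a factor of 2 halves omega. With a channel ratio of sqrt(2), the channel responses are then shifted by two channels (scale covariance). The pooled value stays the same as long as the peak remains within the range of scale channels (scale invariance).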

Place, publisher, year, edition, pages
Springer Nature, 2022
Keywords
Scale covariance, Scale invariance, Scale generalisation, Scale selection, Gaussian derivative, Scale space, Deep learning
National Category
Computer Vision and Robotics (Autonomous Systems) Mathematics
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-303875 (URN); 10.1007/s10851-021-01057-9 (DOI); 000733684800001 (); 2-s2.0-85121607127 (Scopus ID)
Projects
Scale-space theory for covariant and invariant visual perception
Funder
Swedish Research Council, 2018-03586
Note

QC 20211021

Available from: 2021-10-21 Created: 2021-10-21 Last updated: 2022-06-25 Bibliographically approved