Lindeberg, Tony, Professor (ORCID iD: orcid.org/0000-0002-9081-2170)
Biography [eng]

Tony Lindeberg is a Professor of Computer Science at KTH Royal Institute of Technology in Stockholm, Sweden. He received his MSc degree in 1987, his PhD degree in 1991, became docent in 1996, and was appointed professor in 2000. He was a Research Fellow at the Royal Swedish Academy of Sciences between 2000 and 2010.

His research interests in computer vision relate to scale-space representation, image features, object recognition, video analysis and computational modelling of biological vision. He has developed theories and methodologies for continuous and discrete scale-space representation, visual and auditory receptive fields, detection of salient image structures, automatic scale selection, scale-invariant image features, affine invariant features, affine and Galilean normalization, temporal, spatio-temporal and spectro-temporal scale-space concepts as well as spatial and spatio-temporal image descriptors for image-based recognition.

He also works on computational modelling of hearing and has previously worked on topics in medical image analysis and gesture recognition. He is the author of the book Scale-Space Theory in Computer Vision.

Biography [swe]

Tony Lindeberg is a professor of computer science at KTH Royal Institute of Technology (Kungliga Tekniska Högskolan) in Stockholm. He received his MSc degree in engineering physics in 1987 and his PhD degree in computer science in 1991, became docent in 1996, and was appointed professor in 2000. He was a Research Fellow at the Royal Swedish Academy of Sciences between 2000 and 2010.

His research interests in computer vision include scale-space representation, feature detection, object recognition, video analysis and computational modelling of biological vision. He has developed theories and methodologies for continuous and discrete scale-space representations, visual and auditory receptive fields, detection of salient features, automatic scale selection, scale-invariant features, affine invariant features, affine and Galilean normalization, temporal, spatio-temporal and spectro-temporal scale-space concepts as well as spatial and spatio-temporal image descriptors for image-based recognition.

He also works on computational modelling of hearing and has previously worked in medical image analysis and on gesture recognition. He is the author of the book "Scale-Space Theory in Computer Vision".

Publications (10 of 123)
Jansson, Y. & Lindeberg, T. (2018). Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields. Journal of Mathematical Imaging and Vision, 60(9), 1369-1398
2018 (English). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 60, no. 9, pp. 1369-1398. Journal article (Refereed). Published
Abstract [en]

This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.
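
As a rough illustration of what a binary joint histogram of receptive field responses could look like, the sketch below binarizes each response by its sign and counts how often each joint sign pattern occurs in a region. The sign-based binarization rule, the function name `binary_joint_histogram` and the input layout are assumptions made for this example; the abstract does not specify them, so this should not be read as the descriptor actually used in the paper.

```python
def binary_joint_histogram(responses):
    """Normalized joint histogram over sign patterns of filter responses.

    `responses` is a list of tuples; each tuple holds the N receptive field
    responses measured at one space-time point (hypothetical input layout).
    Each response is reduced to one bit (1 if positive, else 0), giving
    2**N possible joint patterns.
    """
    num_filters = len(responses[0])
    counts = [0] * (2 ** num_filters)
    for point in responses:
        index = 0
        for value in point:
            # Append one bit per filter response, most significant bit first.
            index = (index << 1) | (1 if value > 0 else 0)
        counts[index] += 1
    total = len(responses)
    return [count / total for count in counts]


# Four sample points with two filter responses each.
samples = [(1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (0.5, 2.0)]
histogram = binary_joint_histogram(samples)
```

With N receptive fields the histogram has 2**N bins, which keeps the descriptor compact compared to a joint histogram with many quantization levels per dimension.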

Place, publisher, year, edition, pages
Springer, 2018
Keywords
Dynamic texture, Receptive field, Spatio-temporal, Time-causal, Time-recursive, Video descriptor, Receptive field histogram, Scale space
National subject category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-231094 (URN); 10.1007/s10851-018-0826-9 (DOI); 000447385200002 (); 2-s2.0-85048764772 (Scopus ID)
Projects
Scale-space theory for invariant and covariant visual receptive fields; Time-causal receptive fields for computer vision and modelling of biological vision
Research funder
Vetenskapsrådet (Swedish Research Council), 2014-4083; Stiftelsen Olle Engkvist Byggmästare, 2015/465
Note

QC 20180625

Available from: 2018-06-21. Created: 2018-06-21. Last updated: 2019-01-15. Bibliographically approved
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elovsson, A., . . . Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483
2018 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no. 3, pp. 1467-1483. Journal article (Refereed). Published
Abstract [en]

Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories (phonation, supraglottal myoelastic vibrations, and turbulence) from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2018
Keywords
vocal articulation, sound imitations, signal processing, auditory receptive fields, turbulence, phonation, supraglottal myoelastic vibration, partial least-square regression, support vector classification, ensemble learning
National subject category
Signal Processing; Computer and Information Sciences
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-234295 (URN); 10.1121/1.5052438 (DOI); 000457802200049 (); 2-s2.0-85053873907 (Scopus ID)
Research funder
EU, FP7, Seventh Framework Programme, 618067
Note

QC 20181003

Available from: 2018-09-06. Created: 2018-09-06. Last updated: 2019-02-22. Bibliographically approved
Ekeberg, Ö., Fransén, E., Hellgren Kotaleski, J., Herman, P., Kumar, A., Lansner, A. & Lindeberg, T. (2016). Computational Brain Science at CST, CSC, KTH. KTH Royal Institute of Technology
2016 (English). Other, Policy document (Other academic)
Abstract [en]

Mission and Vision - Computational Brain Science Lab at CST, CSC, KTH

The scientific mission of the Computational Brain Science Lab at CSC is to be at the forefront of mathematical modelling, quantitative analysis and mechanistic understanding of brain function. We perform research on (i) computational modelling of biological brain function and on (ii) developing theory, algorithms and software for building computer systems that can perform brain-like functions. Our research answers scientific questions and develops methods in these fields. We integrate results from our science-driven brain research into our work on brain-like algorithms and likewise use theoretical results about artificial brain-like functions as hypotheses for biological brain research.

Our research on biological brain function includes sensory perception (vision, hearing, olfaction, pain), cognition (action selection, memory, learning) and motor control at different levels of biological detail (molecular, cellular, network) and mathematical/functional description. Methods development for investigating biological brain function and its dynamics as well as dysfunction comprises biomechanical simulation engines for locomotion and voice, machine learning methods for analysing functional brain images, craniofacial morphology and neuronal multi-scale simulations. Projects are conducted in close collaborations with Karolinska Institutet and Karolinska Hospital in Sweden as well as other laboratories in Europe, U.S., Japan and India.

Our research on brain-like computing concerns methods development for perceptual systems that extract information from sensory signals (images, video and audio), analysis of functional brain images and EEG data, learning for autonomous agents as well as development of computational architectures (both software and hardware) for neural information processing. Our brain-inspired approach to computing also applies more generically to other computer science problems such as pattern recognition, data analysis and intelligent systems. Recent industrial collaborations include analysis of patient brain data with MentisCura and the startup company 13 Lab bought by Facebook.

Our long-term vision is to contribute to (i) deeper understanding of the computational mechanisms underlying biological brain function and (ii) better theories, methods and algorithms for perceptual and intelligent systems that perform artificial brain-like functions by (iii) performing interdisciplinary and cross-fertilizing research on both biological and artificial brain-like functions.

On the one hand, biological brains provide existence proofs for guiding our research on artificial perceptual and intelligent systems. On the other hand, applying Richard Feynman’s famous statement "What I cannot create I do not understand" to brain science implies that we can only claim to fully understand the computational mechanisms underlying biological brain function if we can build and implement corresponding computational mechanisms on a computerized system that performs similar brain-like functions.

Place, publisher, year, pages
KTH Royal Institute of Technology, 2016. p. 1
National subject category
Computer and Information Sciences; Neurosciences
Identifiers
urn:nbn:se:kth:diva-180669 (URN)
Note

QC 20160121

Available from: 2016-01-19. Created: 2016-01-19. Last updated: 2018-01-10. Bibliographically approved
Lindeberg, T. (2016). Time-causal and time-recursive receptive fields for invariance and covariance under natural image transformations. Paper presented at the First European Machine Vision Forum, Heidelberg, Germany, September 8-9, 2016.
2016 (English). Conference paper, Oral presentation with published abstract (Other academic)
Abstract [en]

Due to the huge variability of image information under natural image transformations, the receptive field responses of the local image operations that serve as input to higher level visual processes will in general be strongly dependent on the geometric and illumination conditions in the image formation process. To obtain robustness of a vision system, it is natural to require the receptive field families underlying the image operators to be either invariant or covariant under the relevant families of natural image transformations.

This talk presents an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. This model inherits the theoretically attractive properties of the Gaussian scale-space model over a spatial domain in terms of (i) invariance or covariance of receptive field responses under scaling transformations and affine transformations over the spatial domain combined with (ii) non-creation of new image structures from finer to coarser scales. When complemented by velocity adaptation, the receptive field responses can be made (iii) Galilean covariant or invariant to account for unknown or variable relative motions between objects in the world and the observer. Additionally, when expressed over a logarithmic distribution of the temporal scale levels, this model allows for (iv) scale invariance and self-similarity over the temporal domain while simultaneously expressed over a time-causal and time-recursive temporal domain, which is a theoretically new type of construction.

We propose this axiomatically derived theory as the natural extension of the Gaussian scale-space paradigm for local image operations from a spatial domain to a time-causal spatio-temporal domain, to be used as a general framework for expressing spatial and spatio-temporal image operators for a computer vision system. The theory leads to (v) predictions about spatial and spatio-temporal receptive fields with good qualitative similarity to biological receptive fields measured by cell recordings in the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex (V1). Specifically, this framework allows for (vi) computationally efficient real-time operations and leads to (vii) much better temporal dynamics (shorter temporal delays) compared to previously formulated time-causal temporal scale-space models.

Reference:

Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.

National subject category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-188028 (URN)
Conference
First European Machine Vision Forum, Heidelberg, Germany, September 8-9, 2016.
Project
Scale-space theory for invariant and covariant receptive fields
Research funder
Vetenskapsrådet (Swedish Research Council), 2014-4083; Stiftelsen Olle Engkvist Byggmästare
Note

QC 20160603

Available from: 2016-06-03. Created: 2016-06-03. Last updated: 2018-01-10. Bibliographically approved
Lindeberg, T. (2016). Time-causal and time-recursive spatio-temporal receptive fields. Journal of Mathematical Imaging and Vision, 55(1), 50-88
2016 (English). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 55, no. 1, pp. 50-88. Journal article (Refereed). Published
Abstract [en]

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. 

Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation in terms of recursive filters over time, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision. 

We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain a new family of temporal scale-space kernels with better temporal characteristics compared to a more traditional approach of using a uniform distribution of the intermediate temporal scale levels. Specifically, the new family of time-causal kernels has much faster temporal response properties (shorter temporal delays) compared to the kernels obtained from a uniform distribution. When increasing the number of temporal scale levels, the temporal scale-space kernels in the new family also converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales. Thereby, the new representation allows for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.

We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.
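
The cascade of first-order integrators described in this abstract can be sketched in discrete time as a chain of first-order recursive filters, each updated from only the current input and its own previous state. The sketch below is an illustration under stated assumptions rather than the paper's implementation: the parameterization tau_k = c**(2*(k - K)) * tau_max of the logarithmically distributed temporal variances, the distribution parameter `c` and all function names are chosen for this example. It relies on the fact that a normalized first-order recursive filter with time constant mu has the geometric impulse response (1/(1+mu)) * (mu/(1+mu))**n, with mean mu and variance mu**2 + mu, and that variances add under cascade coupling.

```python
import math


def log_scale_levels(tau_max, num_levels, c=2.0):
    """Temporal variances tau_k = c**(2*(k - K)) * tau_max for k = 1..K
    (assumed logarithmic parameterization)."""
    return [tau_max * c ** (2 * (k - num_levels))
            for k in range(1, num_levels + 1)]


def time_constants(taus):
    """Time constants mu_k whose variance mu_k**2 + mu_k equals the
    variance increment between consecutive temporal scale levels."""
    mus = []
    previous = 0.0
    for tau in taus:
        delta = tau - previous
        mus.append((math.sqrt(1.0 + 4.0 * delta) - 1.0) / 2.0)
        previous = tau
    return mus


def cascade_filter(signal, mus):
    """Feed `signal` sample by sample through recursive filters in cascade.

    Time-causal: each output depends only on present and past input.
    Time-recursive: only one state value per scale level is stored.
    Returns one output sequence per temporal scale level.
    """
    states = [0.0] * len(mus)
    outputs = [[] for _ in mus]
    for sample in signal:
        value = sample
        for k, mu in enumerate(mus):
            states[k] += (value - states[k]) / (1.0 + mu)
            value = states[k]
            outputs[k].append(value)
    return outputs


# Impulse response at the coarsest of four logarithmically spaced levels.
taus = log_scale_levels(16.0, 4, c=2.0)
mus = time_constants(taus)
impulse = [1.0] + [0.0] * 399
kernel = cascade_filter(impulse, mus)[-1]
```

Because the filter state is all that needs to be kept between frames, no buffer of past input is required, which is what makes this kind of construction suitable for real-time processing.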

Place, publisher, year, edition, pages
Springer Science+Business Media B.V., 2016
Keywords
Scale space, Receptive field, Scale, Spatial, Temporal, Spatio-temporal, Scale-normalized derivative, Scale invariance, Differential invariant, Natural image transformations, Feature detection, Computer vision, Computational modelling, Biological vision
National subject category
Computer Vision and Robotics (Autonomous Systems); Bioinformatics (Computational Biology); Mathematics
Identifiers
urn:nbn:se:kth:diva-175890 (URN); 10.1007/s10851-015-0613-9 (DOI); 000372282800004 (); 2-s2.0-84960806901 (Scopus ID)
Project
Scale-space theory for invariant and covariant visual receptive fields
Research funder
Vetenskapsrådet (Swedish Research Council), 2014-4083
Note

QC 20160201

Available from: 2015-10-26. Created: 2015-10-26. Last updated: 2018-01-10. Bibliographically approved
Lindeberg, T. (2016). Time-causal and time-recursive spatio-temporal receptive fields for computer vision and computational modelling of biological vision. Paper presented at the International Workshop on Geometry, PDE’s and Lie Groups in Image Analysis, Eindhoven, The Netherlands, August 24-26, 2016.
2016 (English). In: International Workshop on Geometry, PDE’s and Lie Groups in Image Analysis, Eindhoven, The Netherlands, August 24-26, 2016. Conference paper, Oral presentation with published abstract (Other academic)
Abstract [en]

When operating on time-dependent image information in real time, a fundamental constraint originates from the fact that image operations must be both time-causal and time-recursive.

In this talk, we will present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian filters over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. This receptive field family obeys scale-space axiomatics in the sense of non-enhancement of local extrema over the spatial domain and non-creation of new local extrema over time for any purely temporal signal and does in these respects guarantee theoretically well-founded treatment of spatio-temporal image structures at different spatial and temporal scales.

By a logarithmic distribution of the temporal scale levels in combination with the construction of a time-causal limit kernel based on an infinitely dense distribution of the temporal scale levels towards zero temporal scale, it will be shown that this family allows for temporal scale invariance although the temporal scale levels by the theory have to be discrete. Additionally, the family obeys basic invariance or covariance properties under other classes of natural image transformations including spatial scaling transformations, rotations/affine image deformations over the spatial domain, Galilean transformations of space time and local multiplicative intensity transformations. Thereby, this receptive field family allows for the formulation of multi-scale differential geometric image features with invariance or covariance properties under basic classes of natural image transformations over space-time.

It is shown how this spatio-temporal scale-space concept (i) allows for efficient computation of different types of spatio-temporal features for purposes in computer vision and (ii) leads to predictions about biological receptive fields with good qualitative similarities to the results of cell recordings in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.

References:

T. Lindeberg (2016) ”Time-causal and time-recursive spatio-temporal receptive fields”, Journal of Mathematical Imaging and Vision, 55(1): 50-88.

T. Lindeberg (2013) ”A computational theory of visual receptive fields”, Biological Cybernetics, 107(6): 589–635.

T. Lindeberg (2013) ”Invariance of visual operations at the level of receptive fields”, PLOS One, 8(7): e66990.

T. Lindeberg (2011) ”Generalized Gaussian scale-space axiomatics comprising linear scale space, affine scale space and spatio-temporal scale space”, Journal of Mathematical Imaging and Vision, 40(1): 36–81.

National subject category
Computer Vision and Robotics (Autonomous Systems); Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-188030 (URN)
Conference
International Workshop on Geometry, PDE’s and Lie Groups in Image Analysis, Eindhoven, The Netherlands, August 24-26, 2016.
Research funder
Vetenskapsrådet (Swedish Research Council), 2014-4083; Stiftelsen Olle Engkvist Byggmästare
Note

QC 20160603

Available from: 2016-06-03. Created: 2016-06-03. Last updated: 2018-01-10. Bibliographically approved
Lindeberg, T. & Friberg, A. (2015). Idealized computational models for auditory receptive fields. PLoS ONE, 10(3), Article ID e0119032.
2015 (English). In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 10, no. 3, article ID e0119032. Journal article (Refereed). Published
Abstract [en]

We present a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to (i) enable invariance of receptive field responses under natural sound transformations and (ii) ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales.

For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions.

When applied to the definition of a second layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or a cascade of time-causal first-order integrators over the temporal domain and a Gaussian filter over the logspectral domain. For each filter family, the spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions.

It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals.
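
For orientation, the standard textbook form of the Gammatone filter mentioned in this abstract has the impulse response t**(n-1) * exp(-2*pi*b*t) * cos(2*pi*f*t + phi) for t >= 0 and zero for t < 0. The sketch below evaluates that conventional form only; it is not the generalized Gammatone family derived in the paper, and the function and parameter names are assumptions made for this example. No amplitude normalization is applied.

```python
import math


def gammatone(t, center_freq_hz, bandwidth_hz, order=4, phase=0.0):
    """Unnormalized textbook Gammatone impulse response at time t (seconds).

    Time-causal: the response is zero for t < 0.
    """
    if t < 0.0:
        return 0.0
    envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth_hz * t)
    return envelope * math.cos(2.0 * math.pi * center_freq_hz * t + phase)


def envelope_peak_time(bandwidth_hz, order=4):
    """Time at which the envelope t**(n-1) * exp(-2*pi*b*t) is maximal,
    obtained by setting its derivative to zero."""
    return (order - 1) / (2.0 * math.pi * bandwidth_hz)


# Delay of the envelope peak for a 125 Hz bandwidth and order 4.
peak_time = envelope_peak_time(125.0, order=4)
```

The peak time (n-1)/(2*pi*b) makes the trade-off named in the abstract concrete: a narrower bandwidth b gives sharper spectral selectivity but a longer temporal delay.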

Place, publisher, year, edition, pages
Plos, 2015
Keywords
Automatic Speech Recognition, Cat Striate Cortex, Inferior Colliculus, Feature-Extraction, Scale Selection, Natural Sounds, Gabor Analysis, Visual-Cortex, Time-Domain, Filter
National subject category
Computer and Information Sciences
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-160565 (URN); 10.1371/journal.pone.0119032 (DOI); 000352134700031 (); 25822973 (PubMedID); 2-s2.0-84926628005 (Scopus ID)
Research funder
Vetenskapsrådet (Swedish Research Council), 2010-4766, 2012-4685, 2014-4083; EU, FP7, Seventh Framework Programme, FET-Open 618067
Note

QC 20150407

Available from: 2015-02-24. Created: 2015-02-24. Last updated: 2018-09-13. Bibliographically approved
Lindeberg, T. (2015). Image matching using generalized scale-space interest points. Paper presented at Special issue with selected papers from SSVM 2013: Scale-Space and Variational Methods in Computer Vision. Journal of Mathematical Imaging and Vision, 52(1), 3-36
2015 (English). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 52, no. 1, pp. 3-36. Journal article (Refereed). Published
Abstract [en]

The performance of matching and object recognition methods based on interest points depends on both the properties of the underlying interest points and the choice of associated image descriptors. This paper demonstrates advantages of using generalized scale-space interest point detectors in this context for selecting a sparse set of points for computing image descriptors for image-based matching.

For detecting interest points at any given scale, we make use of the Laplacian, the determinant of the Hessian and four new unsigned or signed Hessian feature strength measures, which are defined by generalizing the definitions of the Harris and Shi-and-Tomasi operators from the second moment matrix to the Hessian matrix. Then, feature selection over different scales is performed either by scale selection from local extrema over scale of scale-normalized derivatives or by linking features over scale into feature trajectories and computing a significance measure from an integrated measure of normalized feature strength over scale.

A theoretical analysis is presented of the robustness of the differential entities underlying these interest points under image deformations, in terms of invariance properties under affine image deformations or approximations thereof. Disregarding the effect of the rotationally symmetric scale-space smoothing operation, the determinant of the Hessian is a truly affine covariant differential entity and two of the new Hessian feature strength measures have a major contribution from the affine covariant determinant of the Hessian, implying that local extrema of these differential entities will be more robust under affine image deformations than local extrema of the Laplacian operator or the two other new Hessian feature strength measures.

It is shown how these generalized scale-space interest points allow for a higher ratio of correct matches and a lower ratio of false matches compared to previously known interest point detectors within the same class. The best results are obtained using interest points computed with scale linking. Among the differential entities, the new Hessian feature strength measures and the determinant of the Hessian lead to the best matching performance under perspective image transformations with significant foreshortening, better than the more commonly used Laplacian operator, its difference-of-Gaussians approximation or the Harris-Laplace operator.

We propose that these generalized scale-space interest points, when accompanied by associated local scale-invariant image descriptors, should allow for better performance of interest point based methods for image-based matching, object recognition and related visual tasks.
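
Scale selection from local extrema over scale of scale-normalized derivatives can be illustrated on an idealized Gaussian blob. The sketch below assumes the commonly used normalization with gamma = 1, that is t * (Lxx + Lyy) for the Laplacian and t**2 * (Lxx * Lyy - Lxy**2) for the determinant of the Hessian, with t the scale parameter in units of variance. It covers only these two classical operators, not the four new Hessian feature strength measures or the scale linking described in the abstract, and all names are chosen for this example. It uses the semi-group property of the Gaussian: a blob of variance t0 smoothed to scale t is a Gaussian of variance t0 + t.

```python
import math


def gaussian(x, y, variance):
    """Rotationally symmetric 2-D Gaussian with the given variance."""
    return (math.exp(-(x * x + y * y) / (2.0 * variance))
            / (2.0 * math.pi * variance))


def hessian(f, x, y, h=1e-3):
    """Second derivatives (fxx, fyy, fxy) by central finite differences."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / (h * h)
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)
    return fxx, fyy, fxy


def normalized_responses(t, blob_variance):
    """Scale-normalized Laplacian and determinant of the Hessian at the
    centre of a Gaussian blob, evaluated at scale t."""
    def smoothed(x, y):
        return gaussian(x, y, blob_variance + t)

    lxx, lyy, lxy = hessian(smoothed, 0.0, 0.0)
    laplacian = t * (lxx + lyy)
    det_hessian = t * t * (lxx * lyy - lxy * lxy)
    return laplacian, det_hessian


def select_scale(blob_variance, scales, operator=0):
    """Scale at which the magnitude of the chosen operator is largest
    (operator 0: Laplacian, operator 1: determinant of the Hessian)."""
    return max(scales,
               key=lambda t: abs(normalized_responses(t, blob_variance)[operator]))


scales = [0.5 * k for k in range(1, 41)]  # t = 0.5, 1.0, ..., 20.0
selected = select_scale(4.0, scales)
```

For this idealized blob both operators assume their extremum over scale at t = t0, so the selected scale reflects the size of the underlying image structure.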

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2015
Keywords
Feature detection, Interest point, Blob detection, Corner detection, Scale, Scale selection, Scale linking, Feature trajectory, Matching, Object recognition, Scale invariance, Affine invariance, Differential invariant, Image descriptor, Scale space, Computer vision
National subject category
Computer Vision and Robotics (Autonomous Systems); Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-153640 (URN); 10.1007/s10851-014-0541-0 (DOI); 000353205200002 (); 2-s2.0-84908123725 (Scopus ID)
Conference
Special issue with selected papers from SSVM 2013: Scale-Space and Variational Methods in Computer Vision
Research funder
Vetenskapsrådet (Swedish Research Council), 2010-4766; Kungliga Vetenskapsakademien (Royal Swedish Academy of Sciences); Knut och Alice Wallenbergs Stiftelse
Note

QC 20141218

Available from: 2014-10-06. Created: 2014-10-06. Last updated: 2018-01-11. Bibliographically approved
Lindeberg, T. & Friberg, A. (2015). Scale-space theory for auditory signals. In: J.-F. Aujol et al. (Ed.), Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings. Paper presented at SSVM 2015: Fifth International Conference on Scale Space and Variational Methods in Computer Vision, Lège Cap Ferret, France, 31 May - 4 June, 2015 (pp. 3-15). Springer, 9087
2015 (English). In: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings / [ed] J.-F. Aujol et al., Springer, 2015, Vol. 9087, pp. 3-15. Conference paper, Published paper (Refereed)
Abstract [en]

We show how the axiomatic structure of scale-space theory can be applied to the auditory domain and be used for deriving idealized models of auditory receptive fields via scale-space principles. For defining a time-frequency transformation of a purely temporal signal, it is shown that the scale-space framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal window functions. Applied to the definition of a second layer of receptive fields from the spectrogram, it is shown that the scale-space framework leads to two canonical families of spectro-temporal receptive fields, using a combination of Gaussian filters over the logspectral domain with either Gaussian filters or a cascade of first-order integrators over the temporal domain. These spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Such idealized models of auditory receptive fields respect auditory invariances, can be used for computing basic auditory features for audio processing and lead to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields in the inferior colliculus (ICC) and the primary auditory cortex (A1).

Place, publisher, year, edition, pages
Springer, 2015
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9087
Keywords
Computation theory, Computer vision, Degrees of freedom (mechanics), Economic and social effects, Frequency domain analysis, Gammatone filters, Inferior colliculus, Log-spectral domain, Logarithmic frequency, Scale-space theory, Spectral selectivity, Time frequency domain, Time-frequency transformation
National Category
Computer and Information Sciences
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-160481 (URN), 10.1007/978-3-319-18461-6_1 (DOI), 2-s2.0-84931078597 (Scopus ID), 978-3-319-18461-6 (ISBN)
Conference
SSVM 2015: Fifth International Conference on Scale Space and Variational Methods in Computer Vision, Lège Cap Ferret, France, 31 May - 4 June, 2015
Funder
Swedish Research Council (Vetenskapsrådet), 2010-4766, 2012-4685, 2014-4083; EU, FP7, Seventh Framework Programme, FET-Open 618067
Note

QC 20150407

Available from: 2015-02-20. Created: 2015-02-20. Last updated: 2018-09-13. Bibliographically approved
Lindeberg, T. (2015). Separable time-causal and time-recursive spatio-temporal receptive fields. In: J.-F. Aujol et al. (Ed.), Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings. Paper presented at SSVM 2015: Fifth International Conference on Scale Space and Variational Methods in Computer Vision, Lège Cap Ferret, France, 31 May - 4 June, 2015 (pp. 90-102). Springer
2015 (English). In: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings / [ed] J.-F. Aujol et al., Springer, 2015, pp. 90-102. Conference paper, Published paper (Refereed)
Abstract [en]

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators, or equivalently truncated exponential filters, coupled in cascade over the temporal domain. Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented for parameterizing the intermediate temporal scale levels, analysing the resulting temporal dynamics and transferring the theory to a discrete implementation in terms of recursive filters over time.
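The temporal part of this construction can be sketched as a cascade of first-order recursive filters. The sketch below is a minimal illustration, not the paper's implementation: it assumes each stage follows the update out[t] = out[t-1] + (in[t] - out[t-1]) / (1 + mu), a standard first-order recursive smoother whose impulse response is a geometric sequence with mean mu and variance mu² + mu. The function names and the example time constants are hypothetical, and the spatial Gaussian smoothing is omitted.

```python
def first_order_recursive(signal, mu):
    """One time-causal, time-recursive smoothing stage.

    Each output sample depends only on the current input sample and the
    previous output sample, so no buffer of past inputs is needed.
    """
    out = []
    prev = 0.0  # assumed zero initial state
    for x in signal:
        prev = prev + (x - prev) / (1.0 + mu)
        out.append(prev)
    return out

def cascade(signal, mus):
    """Apply the stages in sequence; every intermediate result is a
    temporal scale level in its own right."""
    for mu in mus:
        signal = first_order_recursive(signal, mu)
    return signal

def kernel_stats(mus, length=2000):
    """Impulse response of the cascade with its sum, mean and variance."""
    impulse = [1.0] + [0.0] * (length - 1)
    kernel = cascade(impulse, mus)
    total = sum(kernel)
    mean = sum(n * k for n, k in enumerate(kernel)) / total
    var = sum((n - mean) ** 2 * k for n, k in enumerate(kernel)) / total
    return kernel, total, mean, var

if __name__ == "__main__":
    # Hypothetical time constants, in units of the sampling interval.
    mus = [1.0, 2.0, 4.0]
    _, total, mean, var = kernel_stats(mus)
    print("sum:", total, "mean (delay):", mean, "variance (scale):", var)
```

Because means and variances add under convolution, the composed kernel has temporal delay equal to the sum of the mu values and temporal variance equal to the sum of mu² + mu over the stages, which gives one simple way to reason about how the intermediate scale levels are distributed.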

Place, publisher, year, edition, pages
Springer, 2015
Series
Lecture Notes in Computer Science ; 9087
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems); Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-160482 (URN), 10.1007/978-3-319-18461-6_8 (DOI), 978-3-319-18461-6 (ISBN)
Conference
SSVM 2015: Fifth International Conference on Scale Space and Variational Methods in Computer Vision, Lège Cap Ferret, France, 31 May - 4 June, 2015
Funder
Swedish Research Council (Vetenskapsrådet), 2010-4766, 2014-4083
Note

QC 20150511

Available from: 2015-02-20. Created: 2015-02-20. Last updated: 2018-01-11. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-9081-2170
