• 201.
KTH, Skolan för elektroteknik och datavetenskap (EECS), Robotik, perception och lärande, RPL.
Comparing Human-Robot Proxemics between Virtual Reality and the Real World. 2019. In: HRI '19: 2019 14th ACM/IEEE International Conference on Human-Robot Interaction, IEEE, 2019, pp. 431-439. Conference paper (Refereed)

Virtual Reality (VR) can greatly benefit Human-Robot Interaction (HRI) as a tool to effectively iterate across robot designs. However, possible system limitations of VR could influence the results such that they do not fully reflect real-life encounters with robots. In order to better deploy VR in HRI, we need to establish a basic understanding of what the differences are between HRI studies in the real world and in VR. This paper investigates the differences between real life and VR with a focus on proxemic preferences, in combination with exploring the effects of visual familiarity and spatial sound within the VR experience. Results suggested that people prefer closer interaction distances with a real, physical robot than with a virtual robot in VR. Additionally, the virtual robot was perceived as more discomforting than the real robot, which could explain the differences in proxemics. Overall, these results indicate that the perception of the robot has to be evaluated before the interaction can be studied. However, the results also suggested that VR settings with different visual familiarities are consistent with each other in how they affect HRI proxemics and virtual robot perceptions, indicating the freedom to study HRI in various scenarios in VR. The effect of spatial sound in VR drew a more complex picture and thus calls for more in-depth research to understand its influence on HRI in VR.

• 202.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Evaluation of the CNN Based Architectures on the Problem of Wide Baseline Stereo Matching. 2016. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis

Three-dimensional information is often used in robotics and 3D-mapping. There exist several ways to obtain a three-dimensional map. However, the time of flight used in laser scanners or the structured light utilized by Kinect-like sensors is sometimes not sufficient. In this thesis, we investigate two CNN based stereo matching methods for obtaining 3D-information from a grayscale pair of rectified images.

While the state-of-the-art stereo matching method utilizes a Siamese architecture, in this project a two-channel and a two-stream network are trained in an attempt to outperform the state-of-the-art. A set of experiments was performed to find optimal hyperparameters. The networks with the architectures mentioned above are trained by changing one parameter at a time. After training is completed, the networks are evaluated with two criteria: the error rate and the runtime.

Due to time limitations, we were not able to find optimal learning parameters. However, by using settings from [17] we trained a two-channel network that performed almost on the same level as the state-of-the-art. The error rate on the test data for our best architecture is 2.64% while the error rate for the state-of-the-art Siamese network is 2.62%. We were not able to achieve better performance than the state-of-the-art, but we believe that it is possible to reduce the error rate further. On the other hand, the state-of-the-art Siamese stereo matching network is more efficient and faster during the disparity estimation. Therefore, if time efficiency is prioritized, the Siamese based network should be considered.

• 203. Li, Y.
KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Optimeringslära och systemteori.
Autonomous control and target tracking algorithm design for a quadrotor. 2017. In: 2017 36th Chinese Control Conference (CCC), IEEE Computer Society, 2017, pp. 6749-6754, article id 8028422. Conference paper (Refereed)

In this paper, the task of Mission 7 of the International Aerial Robotics Competition (IARC) is investigated. The quadrotor is required to autonomously navigate in a GPS-denied environment and accomplish physical interaction with moving ground targets. In order to estimate the position of the quadrotor, a multi-sensor-compensation based method is designed and individual measurement errors are compensated. Also, an extended Kalman filter (EKF) is utilized for attitude estimation. To accomplish the target tracking task, a path planning algorithm is designed under the constraints of quadrotor dynamics, and consecutive waypoint setpoints are generated. Then lower-level cascaded PID controllers are adopted to track the commanded waypoint. Finally, simulation and experimental results show the effectiveness and feasibility of the proposed methods.
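The cascaded control structure mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's design: a one-dimensional point mass stands in for the quadrotor, an outer position loop produces a velocity setpoint, an inner velocity loop produces an acceleration command, and the class name `PID`, the function `track_waypoint`, and the gains are hypothetical. The demo uses proportional gains only, although the class also carries integral and derivative terms.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller (hypothetical helper, not from the paper)."""
    kp: float
    ki: float = 0.0
    kd: float = 0.0
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral and estimate the derivative by a backward difference.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track_waypoint(waypoint: float, steps: int = 2000, dt: float = 0.01) -> float:
    """Drive a 1-D point mass to `waypoint` with two cascaded loops."""
    position_loop = PID(kp=1.0)   # outer loop: position error -> velocity setpoint
    velocity_loop = PID(kp=4.0)   # inner loop: velocity error -> acceleration command
    x, v = 0.0, 0.0
    for _ in range(steps):
        v_ref = position_loop.step(waypoint - x, dt)
        accel = velocity_loop.step(v_ref - v, dt)
        v += accel * dt           # integrate acceleration
        x += v * dt               # integrate velocity
    return x


final_position = track_waypoint(5.0)
```

With these assumed gains the closed loop behaves like a critically damped second-order system, so the point mass settles on the waypoint without overshoot.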

• 204.
KTH, Skolan för elektroteknik och datavetenskap (EECS).
Deep learning navigation for UGVs on forests paths. 2018. Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis

Artificial intelligence and machine learning have seen great progress in recent years. In this work, we will look at the application of machine learning in visual navigational systems for unmanned vehicles in natural environments. Previous works have focused on navigational systems with deep convolutional neural networks (CNNs) for unmanned aerial vehicles (UAVs). In this work, we evaluate the robustness and applicability of these methods for unmanned ground vehicles (UGVs).

To evaluate the robustness and applicability of this machine learning approach for UGVs, two experiments were performed. In the first, data from Swiss trails and photos collected in Swedish forests were used to train deep CNNs. Several models were trained using data collected in different environments at different heights. By cross-evaluating the trained models on the other datasets, the impact of changing camera position and switching environment can be evaluated. In the second experiment, a navigational system using the trained CNN models was constructed. By evaluating the ability of the system to autonomously follow a forest path, an understanding of the applicability of these methods for UGVs in general can be obtained.

There were several results from the experiments. When comparing models trained on different datasets, we could see that the environment has an effect on the performance of the navigation, but even more so, the approach is sensitive to the camera position. Finally, an online test was done to evaluate the applicability of this approach as an end-to-end navigation system for UGVs. This experiment showed that these methods, on their own, are not a viable option for an end-to-end navigational system for UGVs in forest environments.

• 205.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Composed Complex-Cue Histograms: An Investigation of the Information Content in Receptive Field Based Image Descriptors for Object Recognition. 2012. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 116, no. 4, pp. 538-560. Article in journal (Refereed)

Recent work has shown that effective methods for recognizing objects and spatio-temporal events can be constructed based on histograms of receptive field like image operations.

This paper presents the results of an extensive study of the performance of different types of receptive field like image descriptors for histogram-based object recognition, based on different combinations of image cues in terms of Gaussian derivatives or differential invariants applied to either intensity information, colour-opponent channels or both. A rich set of composed complex-cue image descriptors is introduced and evaluated with respect to the problems of (i) recognizing previously seen object instances from previously unseen views, and (ii) classifying previously unseen objects into visual categories.

It is shown that there exist novel histogram descriptors with significantly better recognition performance compared to previously used histogram features within the same class. Specifically, the experiments show that it is possible to obtain more discriminative features by combining lower-dimensional scale-space features into composed complex-cue histograms. Furthermore, different types of image descriptors have different relative advantages with respect to the problems of object instance recognition vs. object category classification. These conclusions are obtained from extensive experimental evaluations on two mutually independent data sets.

For the task of recognizing specific object instances, combined histograms of spatial and spatio-chromatic derivatives are highly discriminative, and several image descriptors in terms of rotationally invariant (intensity and spatio-chromatic) differential invariants up to order two lead to very high recognition rates.

For the task of category classification, primary information is contained in both first- and second-order derivatives, where second-order partial derivatives constitute the most discriminative cue.

Dimensionality reduction by principal component analysis and variance normalization prior to training and recognition can in many cases lead to a significant increase in recognition or classification performance. Surprisingly high recognition rates can even be obtained with binary histograms that reveal the polarity of local scale-space features, and which can be expected to be particularly robust to illumination variations.

An overall conclusion from this study is that compared to previously used lower-dimensional histograms, the use of composed complex-cue histograms of higher dimensionality reveals the co-variation of multiple cues and enables much better recognition performance, both with regard to the problems of recognizing previously seen objects from novel views and for classifying previously unseen objects into visual categories.
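The preprocessing step named in the abstract, principal component analysis followed by variance normalization, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: the function name `pca_whiten`, the random stand-in "histograms", and the choice of eight components are all hypothetical.

```python
import numpy as np


def pca_whiten(features: np.ndarray, n_components: int):
    """Project features onto the leading principal components and
    rescale each component to unit variance."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the data along each retained direction.
    std = singular_values[:n_components] / np.sqrt(len(features) - 1)
    projected = centered @ components.T
    return projected / std, mean, components, std


# Hypothetical stand-in for a set of 200 histogram descriptors with 32 bins each.
rng = np.random.default_rng(0)
histograms = rng.random((200, 32)) * np.linspace(1.0, 10.0, 32)
reduced, mean, components, std = pca_whiten(histograms, n_components=8)
```

A previously unseen descriptor would be mapped with the stored `mean`, `components` and `std` before being passed to the classifier, so that training and recognition use the same normalization.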

• 206.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
A computational theory of visual receptive fields. 2013. In: Biological Cybernetics, ISSN 0340-1200, E-ISSN 1432-0770, Vol. 107, no. 6, pp. 589-635. Article in journal (Refereed)

A receptive field constitutes a region in the visual field where a visual cell or a visual operator responds to visual stimuli. This paper presents a theory for what types of receptive field profiles can be regarded as natural for an idealized vision system, given a set of structural requirements on the first stages of visual processing that reflect symmetry properties of the surrounding world.

These symmetry properties include (i) covariance properties under scale changes, affine image deformations, and Galilean transformations of space–time as occur for real-world image data as well as specific requirements of (ii) temporal causality implying that the future cannot be accessed and (iii) a time-recursive updating mechanism of a limited temporal buffer of the past as is necessary for a genuine real-time system. Fundamental structural requirements are also imposed to ensure (iv) mutual consistency and a proper handling of internal representations at different spatial and temporal scales.

It is shown how a set of families of idealized receptive field profiles can be derived by necessity regarding spatial, spatio-chromatic, and spatio-temporal receptive fields in terms of Gaussian kernels, Gaussian derivatives, or closely related operators. Such image filters have been successfully used as a basis for expressing a large number of visual operations in computer vision, regarding feature detection, feature classification, motion estimation, object recognition, spatio-temporal recognition, and shape estimation. Hence, the associated so-called scale-space theory constitutes a both theoretically well-founded and general framework for expressing visual operations.

There are very close similarities between receptive field profiles predicted from this scale-space theory and receptive field profiles found by cell recordings in biological vision. Among the family of receptive field profiles derived by necessity from the assumptions, idealized models with very good qualitative agreement are obtained for (i) spatial on-center/off-surround and off-center/on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) spatio-chromatic double-opponent neurons in V1, (iv) space–time separable spatio-temporal receptive fields in the LGN and V1, and (v) non-separable space–time tilted receptive fields in V1, all within the same unified theory. In addition, the paper presents a more general framework for relating and interpreting these receptive fields conceptually and possibly predicting new receptive field profiles as well as for pre-wiring covariance under scaling, affine, and Galilean transformations into the representations of visual stimuli.

This paper describes the basic structure of the necessity results concerning receptive field profiles regarding the mathematical foundation of the theory and outlines how the proposed theory could be used in further studies and modelling of biological vision. It is also shown how receptive field responses can be interpreted physically, as the superposition of relative variations of surface structure and illumination variations, given a logarithmic brightness scale, and how receptive field measurements will be invariant under multiplicative illumination variations and exposure control mechanisms.
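The invariance claim in the last sentence follows from the fact that a multiplicative illumination change becomes an additive constant on a logarithmic brightness scale, and any derivative-like operator removes constants. A minimal numerical sketch, assuming a first-order difference as a stand-in for a receptive field and a hypothetical one-dimensional brightness signal:

```python
import numpy as np


def log_derivative(signal: np.ndarray) -> np.ndarray:
    """First-order difference of the log-brightness signal."""
    return np.diff(np.log(signal))


# Hypothetical positive brightness values along one image row.
rng = np.random.default_rng(0)
brightness = rng.uniform(0.1, 1.0, size=100)

response = log_derivative(brightness)
# The same scene under 3.5 times stronger illumination (or longer exposure).
response_scaled = log_derivative(3.5 * brightness)
```

Since log(c f) = log c + log f, the constant log c cancels in the difference and the two responses agree up to floating-point rounding.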

• 207.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
A framework for invariant visual operations based on receptive field responses. 2013. In: SSVM 2013: Fourth International Conference on Scale Space and Variational Methods in Computer Vision, June 2-6, Schloss Seggau, Graz region, Austria: Invited keynote address / [ed] Arjan Kuijper, 2013. Conference paper (Other academic)

The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This talk presents a unified theory for achieving basic invariance properties of visual operations already at the level of receptive fields.

This generalized framework for invariant receptive field responses comprises:

• local scaling transformations caused by objects of different size and at different distances to the observer,
• locally linearized image deformations caused by variations in the viewing direction in relation to the object,
• locally linearized relative motions between the object and the observer and
• local multiplicative intensity transformations caused by illumination variations.

The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 can be seen as close to ideal to what is motivated by the idealized requirements.

By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.

The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

References:

• T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space". Journal of Mathematical Imaging and Vision, volume 40, number 1, pages 36-81, May 2011.
• T. Lindeberg (2013) "Invariance of visual operations at the level of receptive fields", PLoS ONE 8(7): e66990, doi:10.1371/journal.pone.0066990, preprint available from arXiv:1210.0754.
• T. Lindeberg (2013) "Generalized axiomatic scale-space theory", Advances in Imaging and Electron Physics, (P. Hawkes, ed.), Elsevier, volume 178, pages 1-96, Academic Press: Elsevier Inc., doi: 10.1016/B978-0-12-407701-0.00001-7
• 208.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
A scale selection principle for estimating image deformations. 1998. In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 16, pp. 961-977. Article in journal (Refereed)

A basic functionality of a vision system concerns the ability to compute deformation fields between different images of the same physical structure. This article advocates the need for incorporating explicit mechanisms for scale selection in this context, in algorithms for computing descriptors such as optic flow and for performing stereo matching. A basic reason why such a mechanism is essential is the fact that in a coarse-to-fine propagation of disparity or flow information, it is not necessarily the case that the most accurate estimates are obtained at the finest scales. The existence of interfering structures at fine scales may make it impossible to accurately match the image data at fine scales. The approach taken here is to select deformation estimates from the scales that minimize the (suitably normalized) uncertainty over scales. A specific implementation of this idea is presented for a region based differential flow estimation scheme. It is shown that the integrated scale selection and flow estimation algorithm has the qualitative properties of leading to the selection of coarser scales for larger size image structures and increasing noise level, whereas it leads to the selection of finer scales in the neighbourhood of flow field discontinuities.

• 209.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Automatic scale selection as a pre-processing stage for interpreting the visual world. 1999. In: Proc. Fundamental Structural Properties in Image and Pattern Analysis FSPIPA'99 (Budapest, Hungary), September 6-7, 1999, Österreichischen Computer Gesellschaft, 1999, Vol. 130, pp. 9-23. Conference paper (Refereed)

This paper reviews a systematic methodology for formulating mechanisms for automatic scale selection when performing feature detection in scale-space. An important property of the proposed approach is that the notion of scale is included already in the definition of image features.

• 210.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Corner detection. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Refereed)
• 211.
KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
Dense scale selection over space, time and space-time. 2018. In: SIAM Journal on Imaging Sciences, ISSN 1936-4954, E-ISSN 1936-4954, Vol. 11, no. 1, pp. 407-441. Article in journal (Refereed)

Scale selection methods based on local extrema over scale of scale-normalized derivatives have been primarily developed to be applied sparsely at image points where the magnitude of a scale normalized differential expression additionally assumes local extrema over the domain where the data are defined. This paper presents a methodology for performing dense scale selection, so that hypotheses about local characteristic scales in images, temporal signals, and video can be computed at every image point and every time moment. A critical problem when designing mechanisms for dense scale selection is that the scale at which scale-normalized differential entities assume local extrema over scale can be strongly dependent on the local order of the locally dominant differential structure. To address this problem, we propose a methodology where local extrema over scale are detected of a quasi quadrature measure involving scale-space derivatives up to order two and propose two independent mechanisms to reduce the phase dependency of the local scale estimates by (i) introducing a second layer of postsmoothing prior to the detection of local extrema over scale, and (ii) performing local phase compensation based on a model of the phase dependency of the local scale estimates depending on the relative strengths between first- and second-order differential structures. This general methodology is applied over three types of domains: (i) spatial images, (ii) temporal signals, and (iii) spatio-temporal video. 
Experiments demonstrate that the proposed methodology leads to intuitively reasonable results with local scale estimates that reflect variations in the characteristic scales of locally dominant structures over space and time.
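The sparse baseline that the abstract starts from, scale selection from local extrema over scale of scale-normalized derivatives, can be illustrated in closed form. The sketch below is not the dense quasi quadrature method of the paper; it assumes a two-dimensional unit-mass Gaussian blob of variance `t0` as test image and the scale-normalized Laplacian as the differential expression, and the function names are hypothetical.

```python
import numpy as np


def normalized_laplacian_at_blob_center(t, t0):
    """Scale-normalized Laplacian t * (L_xx + L_yy) at the center of a
    2-D unit-mass Gaussian blob of variance t0.

    Smoothing the blob with a Gaussian of variance t gives a Gaussian of
    variance t0 + t, whose Laplacian at the center is -1 / (pi * (t0 + t)**2).
    """
    return -t / (np.pi * (t0 + t) ** 2)


def select_scale(t0, scales):
    """Return the scale at which the magnitude of the normalized response peaks."""
    responses = np.abs(normalized_laplacian_at_blob_center(scales, t0))
    return scales[np.argmax(responses)]


scales = np.arange(1.0, 65.0)          # candidate scale levels t = 1, 2, ..., 64
selected = select_scale(16.0, scales)  # blob of variance 16
```

Differentiating t / (t0 + t)^2 with respect to t shows that the maximum lies at t = t0, so the selected scale reflects the size of the blob, which is the property that scale selection relies on.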

• 212.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention. 1993. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 11, no. 3, pp. 283-318. Article in journal (Refereed)

This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes. The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.

• 213.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Direct estimation of affine image deformations using visual front-end operations with automatic scale selection. 1995. In: Proc. 5th International Conference on Computer Vision: ICCV'95 (Boston, MA), IEEE Computer Society, 1995, pp. 134-141. Conference paper (Refereed)

This article deals with the problem of estimating deformations of brightness patterns using visual front-end operations. Estimating such deformations constitutes an important subtask in several computer vision problems relating to image correspondence and shape estimation. The following subjects are treated: The problem of decomposing affine flow fields into simpler components is analysed in detail. A canonical parametrization is presented based on singular value decomposition, which naturally separates the rotationally invariant components of the flow field from the rotationally variant ones. A novel mechanism is presented for automatic selection of scale levels when estimating local affine deformations. This mechanism is expressed within a multi-scale framework where disparity estimates are computed in a hierarchical coarse-to-fine manner and corrected using iterative techniques. Then, deformation estimates are selected from the scales that minimize a certain normalized residual over scales. Finally, the descriptors so obtained serve as initial data for computing refined estimates of the local deformations.
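The role of the singular value decomposition can be illustrated with a small sketch. It does not reproduce the paper's canonical parametrization; it only shows, for a hypothetical 2x2 affine matrix `A`, that the singular values are unchanged when the image coordinate system is rotated, while the orientation information ends up in the orthogonal factors.

```python
import numpy as np


def rotation(theta: float) -> np.ndarray:
    """2-D rotation matrix for the angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def decompose_affine(A: np.ndarray):
    """Factor A = U @ diag(singular_values) @ Vt."""
    U, singular_values, Vt = np.linalg.svd(A)
    return U, singular_values, Vt


# Hypothetical local affine deformation (not taken from the paper).
A = np.array([[1.2, 0.3],
              [-0.1, 0.8]])
U, s, Vt = decompose_affine(A)

# The same deformation expressed in a coordinate system rotated by 0.7 rad.
R = rotation(0.7)
_, s_rotated, _ = decompose_affine(R @ A @ R.T)
```

The two sets of singular values coincide, whereas `U` and `Vt` change with the rotation, which is one way to separate rotationally invariant from rotationally variant components.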

• 214.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
Discrete approximations of affine Gaussian receptive fields. 2017. Report (Other academic)

This paper presents a theory for discretizing the affine Gaussian scale-space concept so that scale-space properties hold also for the discrete implementation.

Two ways of discretizing spatial smoothing with affine Gaussian kernels are presented: (i) by solving a semi-discretized affine diffusion equation as derived by necessity from the requirement of a semi-group structure over a continuum of scale parameters as parameterized by a family of spatial covariance matrices and obeying non-creation of new structures from any finer to any coarser scale as formalized by the requirement of non-enhancement of local extrema and (ii) a set of parameterized 3x3-kernels as derived from an additional discretization of the above theory along the scale direction and with the parameters of the kernels having a direct interpretation in terms of the covariance matrix of the composed discrete smoothing operation.

We show how convolutions with the first family of kernels can be implemented in terms of a closed form expression for the Fourier transform and analyse how a remaining degree of freedom in the theory can be explored to ensure a positive discretization and optionally also achieve higher-order discrete approximation of the angular dependency of the shapes of the affine Gaussian kernels.

We also show how discrete directional derivative approximations can be efficiently implemented to approximate affine Gaussian derivatives as constituting a canonical model for receptive fields over a purely spatial image domain and with close relations to receptive fields in biological vision.

• 215.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
Discrete approximations of the affine Gaussian derivative model for visual receptive fields. 2017. Report (Other academic)

The affine Gaussian derivative model can in several respects be regarded as a canonical model for receptive fields over a spatial image domain: (i) it can be derived by necessity from scale-space axioms that reflect structural properties of the world, (ii) it constitutes an excellent model for the receptive fields of simple cells in the primary visual cortex and (iii) it is covariant under affine image deformations, which enables more accurate modelling of image measurements under the local image deformations caused by the perspective mapping, compared to the more commonly used Gaussian derivative model based on derivatives of the rotationally symmetric Gaussian kernel.

This paper presents a theory for discretizing the affine Gaussian scale-space concept underlying the affine Gaussian derivative model, so that scale-space properties hold also for the discrete implementation.

Two ways of discretizing spatial smoothing with affine Gaussian kernels are presented: (i) by solving a semi-discretized affine diffusion equation, which has been derived by necessity from the requirements of a semi-group structure over scale as parameterized by a family of spatial covariance matrices and obeying non-creation of new structures from any finer to any coarser scale in terms of non-enhancement of local extrema and (ii) approximating these semi-discrete affine receptive fields by parameterized families of 3x3-kernels as obtained from an additional discretization along the scale direction. The latter discrete approach can be optionally complemented by spatial subsampling at coarser scales, leading to the notion of affine hybrid pyramids.

For the first approach, we show how the solutions can be computed from a closed form expression for the Fourier transform, and analyse how a remaining degree of freedom in the theory can be explored to ensure a positive discretization and optionally achieve higher-order discrete approximation of the angular dependency of the discrete affine Gaussian receptive fields. For the second approach, we analyse how the step length in the scale direction can be determined, given the requirements of a positive discretization.

We also show how discrete directional derivative approximations can be efficiently implemented to approximate affine Gaussian derivatives. Using these theoretical results, we outline hybrid architectures for discrete approximations of affine covariant receptive field families, to be used as a first processing layer for affine covariant and affine invariant visual operations at higher levels.

• 216.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Discrete Derivative Approximations with Scale-Space Properties: A Basis for Low-Level Feature Extraction. 1993. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 3, no. 4, pp. 349-376. Journal article (Peer-reviewed)

This article shows how discrete derivative approximations can be defined so that scale-space properties hold exactly also in the discrete domain. Starting from a set of natural requirements on the first processing stages of a visual system, the visual front end, it gives an axiomatic derivation of how a multiscale representation of derivative approximations can be constructed from a discrete signal, so that it possesses an algebraic structure similar to that possessed by the derivatives of the traditional scale-space representation in the continuous domain. A family of kernels is derived that constitute discrete analogues to the continuous Gaussian derivatives. The representation has theoretical advantages over other discretizations of the scale-space theory in the sense that operators that commute before discretization commute after discretization. Some computational implications of this are that derivative approximations can be computed directly from smoothed data and that this will give exactly the same result as convolution with the corresponding derivative approximation kernel. Moreover, a number of normalization conditions are automatically satisfied. The proposed methodology leads to a scheme of computations of multiscale low-level feature extraction that is conceptually very simple and consists of four basic steps: (i) large support convolution smoothing, (ii) small support difference computations, (iii) point operations for computing differential geometric entities, and (iv) nearest-neighbour operations for feature detection. Applications demonstrate how the proposed scheme can be used for edge detection and junction detection based on derivatives up to order three.
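The commutation property described in this abstract can be illustrated with a small sketch. The binomial smoothing kernel and the test signal below are stand-ins chosen for illustration (the article derives its own kernel family); the point is only that applying a difference operator to smoothed data gives the same result as convolving once with the combined derivative approximation kernel.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def max_abs_diff(u, v):
    """Largest elementwise deviation between two equally long sequences."""
    return max(abs(x - y) for x, y in zip(u, v))


# Illustrative stand-ins (assumptions, not the kernels derived in the article).
smoothing = [0.25, 0.5, 0.25]          # binomial smoothing kernel
central_difference = [0.5, 0.0, -0.5]  # (f(x+1) - f(x-1)) / 2 as a convolution
signal = [0.0, 1.0, 4.0, 9.0, 7.0, 3.0, 1.0, 0.0]

# (a) small-support differences applied to smoothed data
difference_of_smoothed = convolve(convolve(signal, smoothing), central_difference)

# (b) one convolution with the combined derivative approximation kernel
derivative_kernel = convolve(smoothing, central_difference)
direct = convolve(signal, derivative_kernel)

# (c) smoothing applied to differenced data
smoothed_difference = convolve(convolve(signal, central_difference), smoothing)

print(max_abs_diff(difference_of_smoothed, direct))
print(max_abs_diff(difference_of_smoothed, smoothed_difference))
```

The combined kernel sums to zero, which is one example of a normalization condition that follows automatically from this construction.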

• 217.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Discrete Scale-Space Theory and the Scale-Space Primal Sketch. 1991. Doctoral thesis, monograph (Other academic)

This thesis, within the subfield of computer science known as computer vision, deals with the use of scale-space analysis in early low-level processing of visual information. The main contributions comprise the following five subjects:

• The formulation of a scale-space theory for discrete signals. Previously, the scale-space concept has been expressed for continuous signals only. We propose that the canonical way to construct a scale-space for discrete signals is by convolution with a kernel called the discrete analogue of the Gaussian kernel, or equivalently by solving a semi-discretized version of the diffusion equation. Both the one-dimensional and two-dimensional cases are covered. An extensive analysis of discrete smoothing kernels is carried out for one-dimensional signals and the discrete scale-space properties of the most common discretizations to the continuous theory are analysed.

• A representation, called the scale-space primal sketch, which gives a formal description of the hierarchical relations between structures at different levels of scale. It is aimed at making information in the scale-space representation explicit. We give a theory for its construction and an algorithm for computing it.

• A theory for extracting significant image structures and determining the scales of these structures from this representation in a solely bottom-up data-driven way.

• Examples demonstrating how such qualitative information extracted from the scale-space primal sketch can be used for guiding and simplifying other early visual processes. Applications are given to edge detection, histogram analysis and classification based on local features. Among other possible applications one can mention perceptual grouping, texture analysis, stereo matching, model matching and motion.

• A detailed theoretical analysis of the evolution properties of critical points and blobs in scale-space, comprising drift velocity estimates under scale-space smoothing, a classification of the possible types of generic events at bifurcation situations and estimates of how the number of local extrema in a signal can be expected to decrease as function of the scale parameter. For two-dimensional signals the generic bifurcation events are annihilations and creations of extremum-saddle point pairs. Interpreted in terms of blobs, these transitions correspond to annihilations, merges, splits and creations.

Experiments on different types of real imagery demonstrate that the proposed theory gives perceptually intuitive results.
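The first contribution listed above, the discrete analogue of the Gaussian kernel, can be sketched numerically. The closed form used here, T(n; t) = exp(-t) I_n(t) with I_n the modified Bessel functions of integer order, is the expression commonly given for this kernel in the one-dimensional case and is stated here as background rather than quoted from the abstract. The function names and the truncation radius are illustrative assumptions.

```python
import math


def bessel_i(n, t, terms=40):
    """Modified Bessel function I_n(t) of integer order, from its power series."""
    n = abs(n)
    return sum(
        (t / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


def discrete_gaussian(t, radius):
    """Samples T(n; t) = exp(-t) * I_n(t) for n = -radius, ..., radius."""
    return [math.exp(-t) * bessel_i(n, t) for n in range(-radius, radius + 1)]


def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def semigroup_error(s, t, radius):
    """Deviation between T(.; s) * T(.; t) and T(.; s + t) near the centre."""
    composed = convolve(discrete_gaussian(s, radius), discrete_gaussian(t, radius))
    direct = discrete_gaussian(s + t, radius)
    centre = 2 * radius  # index of n = 0 in the full convolution
    half = radius // 2   # compare only where truncation effects are negligible
    return max(
        abs(composed[centre + n] - direct[radius + n]) for n in range(-half, half + 1)
    )


kernel = discrete_gaussian(1.5, 12)
print(sum(kernel))                      # close to 1: the kernel is normalized
print(semigroup_error(0.5, 1.0, 12))    # close to 0: semi-group over scale
```

The semi-group check is the discrete counterpart of the cascade property of Gaussian smoothing: smoothing at scale s followed by scale t equals smoothing once at scale s + t.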

• 218.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Edge detection. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Peer-reviewed)

Edge detection

An early processing stage in image processing and computer vision, aimed at detecting and characterizing discontinuities in the image domain.

The importance of edge detection for early machine vision is usually motivated from the observation that under rather general assumptions about the image formation process, a discontinuity in image brightness can be assumed to correspond to a discontinuity in either depth, surface orientation, reflectance, or illumination. In this respect, edges in the image domain constitute a strong link to physical properties of the world. A representation of image information in terms of edges is also compact in the sense that the two-dimensional image pattern is represented by a set of one-dimensional curves. For these reasons, edges have been used as main features in a large number of computer vision algorithms.

A non-trivial aspect of edge-based analysis of image data, however, concerns what should be meant by a discontinuity in image brightness. Real-world image data are inherently discrete, and for a function defined on a discrete domain, there is no natural notion of "discontinuity", and there is no inherent way to judge what are the edges in a given discrete image.

An early approach to edge detection involved the convolution of the image $f$ by a Gaussian kernel $g$, followed by the detection of zero-crossings in the Laplacian response [a1] (cf. also Scale-space theory). However, such edge curves satisfying

$$\nabla^2 (g * f) = 0$$

give rise to false edges and have poor localization at curved edges.

A more refined approach is the notion of non-maximum suppression [a2], [a3], [a4], where edges are defined as points at which the gradient magnitude assumes a local maximum in the gradient direction. In differential-geometric terms, such edge points can be characterized as points at which [a5]:

i) the second-order directional derivative in the gradient direction is zero; and

ii) the third-order directional derivative in the gradient direction is negative.

In terms of partial derivatives, for a two-dimensional image $f(x, y)$ this edge definition can be written as

$$f_x^2 f_{xx} + 2 f_x f_y f_{xy} + f_y^2 f_{yy} = 0,$$

$$f_x^3 f_{xxx} + 3 f_x^2 f_y f_{xxy} + 3 f_x f_y^2 f_{xyy} + f_y^3 f_{yyy} < 0.$$

Again, the computation of discrete derivative approximations is preceded by smoothing the image $f$ with a Gaussian kernel, and the choice of different standard deviations of the Gaussian kernel gives rise to edges at different scales (see Scale-space theory or [a5]). While other choices of linear smoothing kernels have also been advocated, their shapes can often be well approximated by Gaussians [a3], [a6], [a7].
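The differential edge conditions i) and ii) can be checked pointwise once partial derivatives up to order three are available. The following sketch evaluates the second- and third-order directional derivatives in the gradient direction, each multiplied by a power of the gradient magnitude so that only partial derivatives appear. The tanh-shaped test edge and all function names are assumptions made for illustration; in practice the derivatives would come from discrete approximations applied to Gaussian-smoothed image data.

```python
import math


def edge_conditions(fx, fy, fxx, fxy, fyy, fxxx, fxxy, fxyy, fyyy):
    """Return (second, third): directional derivatives in the gradient
    direction of order two and three, scaled by |grad f|^2 and |grad f|^3."""
    second = fx * fx * fxx + 2.0 * fx * fy * fxy + fy * fy * fyy
    third = (fx ** 3 * fxxx + 3.0 * fx * fx * fy * fxxy
             + 3.0 * fx * fy * fy * fxyy + fy ** 3 * fyyy)
    return second, third


def is_edge_point(derivatives, tol=1e-9):
    """Condition i): second expression is zero; condition ii): third is negative."""
    second, third = edge_conditions(*derivatives)
    return abs(second) < tol and third < 0.0


def tanh_edge_derivatives(x, y, a, b):
    """Analytic partial derivatives of the model edge f(x, y) = tanh(a*x + b*y)."""
    t = math.tanh(a * x + b * y)
    d1 = 1.0 - t * t                          # first derivative of tanh
    d2 = -2.0 * t * d1                        # second derivative of tanh
    d3 = -2.0 * d1 * d1 + 4.0 * t * t * d1    # third derivative of tanh
    return (a * d1, b * d1,
            a * a * d2, a * b * d2, b * b * d2,
            a ** 3 * d3, a * a * b * d3, a * b * b * d3, b ** 3 * d3)


# The model edge runs along the line 0.6*x + 0.8*y = 0.
print(is_edge_point(tanh_edge_derivatives(0.0, 0.0, 0.6, 0.8)))   # on the edge
print(is_edge_point(tanh_edge_derivatives(0.5, 0.0, 0.6, 0.8)))   # off the edge
```

On sampled data the zero condition is usually replaced by a sign change between neighbouring pixels, since an exact zero is rarely hit on a discrete grid.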

Other approaches to edge detection involve the thresholding of edge strength measures, the computation of intensity derivatives from local least squares fitting, and functional minimization (see also [a8]).

A subject which has been given large attention during the 1990s is the replacement of the linear smoothing operation by a non-linear smoothing step, with the goal of avoiding smoothing across object boundaries [a9], [a10].

References:

[a1] D. Marr, E. Hildreth, "Theory of edge detection" Proc. R. Soc. London, 207 (1980) pp. 187–217

[a2] R.M. Haralick, "Digital step edges from zero-crossings of second directional derivatives" IEEE Trans. Pattern Anal. Machine Intell., 6 (1984)

[a3] J. Canny, "A computational approach to edge detection" IEEE Trans. Pattern Anal. Machine Intell., 8 : 6 (1986) pp. 679–698

[a4] A.F. Korn, "Toward a symbolic representation of intensity changes in images" IEEE Trans. Pattern Anal. Machine Intell., 10 : 5 (1988) pp. 610–625

[a5] T. Lindeberg, "Edge detection and ridge detection with automatic scale selection" Internat. J. Computer Vision, 30 : 2 (1998) pp. 117–154

[a6] V. Torre, T.A. Poggio, "On edge detection" IEEE Trans. Pattern Anal. Machine Intell., 8 : 2 (1986) pp. 147–163

[a7] R. Deriche, "Using Canny's criteria to derive a recursively implemented optimal edge detector" Internat. J. Computer Vision, 1 (1987) pp. 167–187

[a8] R. Jain, et al., "Machine vision", McGraw-Hill (1995)

[a9] P. Perona, J. Malik, "Scale-space and edge detection using anisotropic diffusion" IEEE Trans. Pattern Anal. Machine Intell., 12 : 7 (1990) pp. 629–639

[a10] "Geometry-driven diffusion in computer vision" B.M. ter Haar Romeny (ed.), Kluwer Acad. Publ. (1994)

• 219.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Edge detection and ridge detection with automatic scale selection. 1998. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 30, no. 2, pp. 117-154. Journal article (Peer-reviewed)

When computing descriptors of image data, the type of information that can be extracted may be strongly dependent on the scales at which the image operators are applied. This article presents a systematic methodology for addressing this problem. A mechanism is presented for automatic selection of scale levels when detecting one-dimensional image features, such as edges and ridges.

A concept of a scale-space edge is introduced, defined as a connected set of points in scale-space at which: (i) the gradient magnitude assumes a local maximum in the gradient direction, and (ii) a normalized measure of the strength of the edge response is locally maximal over scales. An important consequence of this definition is that it allows the scale levels to vary along the edge.

Two specific measures of edge strength are analysed in detail, the gradient magnitude and a differential expression derived from the third-order derivative in the gradient direction. For a certain way of normalizing these differential descriptors, by expressing them in terms of so-called gamma-normalized derivatives, an immediate consequence of this definition is that the edge detector will adapt its scale levels to the local image structure. Specifically, sharp edges will be detected at fine scales so as to reduce the shape distortions due to scale-space smoothing, whereas sufficiently coarse scales will be selected at diffuse edges, such that an edge model is a valid abstraction of the intensity profile across the edge.

Since the scale-space edge is defined from the intersection of two zero-crossing surfaces in scale-space, the edges will by definition form closed curves. This simplifies selection of salient edges, and a novel significance measure is proposed, by integrating the edge strength along the edge. Moreover, the scale information associated with each edge provides useful clues to the physical nature of the edge.

With just slight modifications, similar ideas can be used for formulating ridge detectors with automatic scale selection, having the characteristic property that the selected scales on a scale-space ridge instead reflect the width of the ridge.

It is shown how the methodology can be implemented in terms of straightforward visual front-end operations, and the validity of the approach is supported by theoretical analysis as well as experiments on real-world and synthetic data.
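The claim that sharp edges select fine scales and diffuse edges select coarse scales can be made concrete with a short worked calculation. This is a sketch under stated assumptions: a one-dimensional diffuse step edge with diffuseness $t_0$, the scale parameter $t$ measured as the variance of the Gaussian kernel, and the gamma-normalized gradient magnitude $t^{\gamma/2} L_x$ as the edge strength measure. The symbols are notation assumed here for illustration.

```latex
\begin{align}
  % Diffuse step edge: integral of a Gaussian with variance t_0
  f(x) &= \int_{-\infty}^{x} g(u;\,t_0)\,du,
      \qquad g(u;\,t) = \frac{1}{\sqrt{2\pi t}}\,e^{-u^{2}/(2t)}, \\
  % Gradient of the scale-space representation at the edge (semi-group property)
  L_x(0;\,t) &= g(0;\,t_0 + t) = \frac{1}{\sqrt{2\pi\,(t_0 + t)}}, \\
  % Gamma-normalized edge strength at the edge
  G_{\gamma}(t) &= t^{\gamma/2}\,L_x(0;\,t)
      = \frac{t^{\gamma/2}}{\sqrt{2\pi\,(t_0 + t)}}, \\
  % Maximum over scales
  \partial_t \log G_{\gamma}(t) &= \frac{\gamma}{2t} - \frac{1}{2\,(t_0 + t)} = 0
      \quad\Longrightarrow\quad t = \frac{\gamma}{1-\gamma}\,t_0 .
\end{align}
```

For $\gamma = 1/2$ the selected scale equals the diffuseness $t_0$ of the edge, so an ideal sharp edge ($t_0 \to 0$) selects the finest scales and a more diffuse edge selects a proportionally coarser scale. For $\gamma = 1$ the normalized response increases monotonically with scale and no maximum over scales exists for this edge model.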

• 220.
KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
Edge detection and ridge detection with automatic scale selection. 1996. In: Proc Computer Vision and Pattern Recognition (CVPR’96), 1996, pp. 465-470. Conference paper (Peer-reviewed)

When extracting features from image data, the type of information that can be extracted may be strongly dependent on the scales at which the feature detectors are applied. This article presents a systematic methodology for addressing this problem. A mechanism is presented for automatic selection of scale levels when detecting one-dimensional features, such as edges and ridges. A novel concept of a scale-space edge is introduced, defined as a connected set of points in scale-space at which: (i) the gradient magnitude assumes a local maximum in the gradient direction, and (ii) a normalized measure of the strength of the edge response is locally maximal over scales. An important property of this definition is that it allows the scale levels to vary along the edge. Two specific measures of edge strength are analysed in detail. It is shown that by expressing these in terms of γ-normalized derivatives, an immediate consequence of this definition is that fine scales are selected for sharp edges (so as to reduce the shape distortions due to scale-space smoothing), whereas coarse scales are selected for diffuse edges, such that an edge model constitutes a valid abstraction of the intensity profile across the edge. With slight modifications, this idea can be used for formulating a ridge detector with automatic scale selection, having the characteristic property that the selected scales on a scale-space ridge instead reflect the width of the ridge.

• 221.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Effective Scale: A Natural Unit for Measuring Scale-Space Lifetime. 1993. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, no. 10, pp. 1068-1074. Journal article (Peer-reviewed)

This article shows how a notion of effective scale can be introduced in a formal way. For continuous signals a scaling argument directly gives that a natural unit for measuring scale-space lifetime is in terms of the logarithm of the ordinary scale parameter. That approach is, however, not appropriate for discrete signals, since then an infinite lifetime would be assigned to structures existing in the original signal. Here we show how such an effective scale parameter can be defined as to give consistent results for both discrete and continuous signals. The treatment is based upon the assumption that the probability that a local extremum disappears during a short scale interval should not vary with scale. As a tool for the analysis we give estimates of how the density of local extrema can be expected to vary with scale in the scale-space representation of different random noise signals, both in the continuous and discrete cases.

• 222.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Feature detection with automatic scale selection. 1998. In: International Journal of Computer Vision, Vol. 30, no. 2, pp. 79-116. Journal article (Peer-reviewed)

The fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. Whereas scale-space representation provides a well-founded framework for dealing with this issue by representing image structures at different scales, traditional scale-space theory does not address the problem of how to select locally appropriate scales for further analysis.

This article proposes a systematic approach for dealing with this problem: a heuristic principle is presented stating that local extrema over scales of different combinations of gamma-normalized derivatives are likely candidates to correspond to interesting structures. Specifically, it is proposed that this idea can be used as a major mechanism in algorithms for automatic scale selection, which adapt the local scales of processing to the local image structure.

Support is given in terms of a general theoretical investigation of the behaviour of the scale selection method under rescalings of the input pattern and by experiments on real-world and synthetic data. Support is also given by a detailed analysis of how different types of feature detectors perform when integrated with a scale selection mechanism and then applied to characteristic model patterns. Specifically, it is described in detail how the proposed methodology applies to the problems of blob detection, junction detection, edge detection, ridge detection and local frequency estimation.
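The scale selection principle and its behaviour under rescalings can be sketched for the simplest blob model. The example assumes a two-dimensional Gaussian blob of variance t0, for which the Laplacian of its scale-space representation at the blob centre has the closed form -1 / (pi (t0 + t)^2), and uses the scale-normalized Laplacian t^gamma (L_xx + L_yy) with gamma = 1. Function names and the sampling of scale levels are illustrative assumptions.

```python
import math


def normalized_laplacian_at_centre(t, t0, gamma=1.0):
    """Scale-normalized Laplacian response at the centre of a 2-D Gaussian
    blob of variance t0, observed at scale t (variance of the smoothing kernel)."""
    s = t0 + t                            # variances add under Gaussian smoothing
    laplacian = -1.0 / (math.pi * s * s)  # L_xx + L_yy at the blob centre
    return (t ** gamma) * laplacian


def select_scale(t0, scales, gamma=1.0):
    """Scale level at which the magnitude of the normalized response is largest."""
    return max(scales, key=lambda t: abs(normalized_laplacian_at_centre(t, t0, gamma)))


# Geometrically sampled scale levels (an arbitrary choice for this sketch).
scales = [0.5 * 1.05 ** k for k in range(120)]

small_blob = select_scale(4.0, scales)
large_blob = select_scale(16.0, scales)   # same blob with its size doubled

print(small_blob, large_blob, large_blob / small_blob)
```

Doubling the size of the blob multiplies its variance by four, and the selected scale follows by approximately the same factor, up to the spacing of the sampled scale levels. This is the sense in which the selected scales adapt to the local image structure.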

• 223.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Generalized axiomatic scale-space theory. 2013. In: Advances in Imaging and Electron Physics, Vol 178 / [ed] P. Hawkes, Elsevier, 2013, Vol. 178, pp. 1-96. Chapter in book, part of anthology (Peer-reviewed)

A fundamental problem in vision is what types of image operations should be used at the first stages of visual processing. I discuss a principled approach to this problem by describing a generalized axiomatic scale-space theory that makes it possible to derive the notions of linear scale-space, affine Gaussian scale-space, and linear spatio-temporal scale-space using similar sets of assumptions (scale-space axioms).

Based on a requirement that new image structures should not be created with increasing scale formalized into a condition of non-enhancement of local extrema, a complete classification is given of the linear (Gaussian) scale-space concepts that satisfy these conditions on isotropic spatial, non-isotropic spatial, and spatio-temporal domains, which results in a general taxonomy of Gaussian scale-spaces for continuous image data. The resulting theory allows filter shapes to be tuned from specific context information and provides a theoretical foundation for the recently exploited mechanisms of affine shape adaptation and Galilean velocity adaptation, with highly useful applications in computer vision. It is also shown how time-causal and time-recursive spatio-temporal scale-space concepts can be derived from similar or closely related assumptions.

The receptive fields arising from the spatial, spatio-chromatic, and spatio-temporal derivatives resulting from these scale-space concepts can be used as a general basis for expressing image operations for a large class of computer vision or image analysis methods. The receptive field profiles generated by necessity from these theories also have close similarities to receptive fields measured by cell recordings in biological vision, specifically regarding space-time separable cells in the retina and the lateral geniculate nucleus (LGN), as well as both space-time separable and non-separable cells in the striate cortex (V1) of higher mammals.

• 224.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Generalized Gaussian Scale-Space Axiomatics Comprising Linear Scale-Space, Affine Scale-Space and Spatio-Temporal Scale-Space. 2011. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 40, no. 1, pp. 36-81. Journal article (Peer-reviewed)

This paper describes a generalized axiomatic scale-space theory that makes it possible to derive the notions of linear scale-space, affine Gaussian scale-space and linear spatio-temporal scale-space using a similar set of assumptions (scale-space axioms). The notion of non-enhancement of local extrema is generalized from previous application over discrete and rotationally symmetric kernels to continuous and more general non-isotropic kernels over both spatial and spatio-temporal image domains. It is shown how a complete classification can be given of the linear (Gaussian) scale-space concepts that satisfy these conditions on isotropic spatial, non-isotropic spatial and spatio-temporal domains, which results in a general taxonomy of Gaussian scale-spaces for continuous image data. The resulting theory allows filter shapes to be tuned from specific context information and provides a theoretical foundation for the recently exploited mechanisms of shape adaptation and velocity adaptation, with highly useful applications in computer vision. It is also shown how time-causal spatio-temporal scale-spaces can be derived from similar assumptions. The mathematical structure of these scale-spaces is analyzed in detail concerning transformation properties over space and time, the temporal cascade structure they satisfy over time as well as properties of the resulting multi-scale spatio-temporal derivative operators. It is also shown how temporal derivatives with respect to transformed time can be defined, leading to the formulation of a novel analogue of scale normalized derivatives for time-causal scale-spaces. The kernels generated from these two types of theories have interesting relations to biological vision. We show how filter kernels generated from the Gaussian spatio-temporal scale-space as well as the time-causal spatio-temporal scale-space relate to spatio-temporal receptive field profiles registered from mammalian vision. 
Specifically, we show that there are close analogies to space-time separable cells in the LGN as well as to both space-time separable and non-separable cells in the striate cortex. We also present a set of plausible models for complex cells using extended quasi-quadrature measures expressed in terms of scale normalized spatio-temporal derivatives. The theories presented as well as their relations to biological vision show that it is possible to describe a general set of Gaussian and/or time-causal scale-spaces using a unified framework, which generalizes and complements previously presented scale-space formulations in this area.

• 225.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Image matching using generalized scale-space interest points. 2015. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 52, no. 1, pp. 3-36. Journal article (Peer-reviewed)

The performance of matching and object recognition methods based on interest points depends on both the properties of the underlying interest points and the choice of associated image descriptors. This paper demonstrates advantages of using generalized scale-space interest point detectors in this context for selecting a sparse set of points for computing image descriptors for image-based matching.

For detecting interest points at any given scale, we make use of the Laplacian, the determinant of the Hessian and four new unsigned or signed Hessian feature strength measures, which are defined by generalizing the definitions of the Harris and Shi-and-Tomasi operators from the second moment matrix to the Hessian matrix. Then, feature selection over different scales is performed either by scale selection from local extrema over scale of scale-normalized derivatives or by linking features over scale into feature trajectories and computing a significance measure from an integrated measure of normalized feature strength over scale.

A theoretical analysis is presented of the robustness of the differential entities underlying these interest points under image deformations, in terms of invariance properties under affine image deformations or approximations thereof. Disregarding the effect of the rotationally symmetric scale-space smoothing operation, the determinant of the Hessian is a truly affine covariant differential entity and two of the new Hessian feature strength measures have a major contribution from the affine covariant determinant of the Hessian, implying that local extrema of these differential entities will be more robust under affine image deformations than local extrema of the Laplacian operator or the two other new Hessian feature strength measures.

It is shown how these generalized scale-space interest points allow for a higher ratio of correct matches and a lower ratio of false matches compared to previously known interest point detectors within the same class. The best results are obtained using interest points computed with scale linking. The new Hessian feature strength measures and the determinant of the Hessian are the differential entities that lead to the best matching performance under perspective image transformations with significant foreshortening, performing better than the more commonly used Laplacian operator, its difference-of-Gaussians approximation or the Harris-Laplace operator.

We propose that these generalized scale-space interest points, when accompanied by associated local scale-invariant image descriptors, should allow for better performance of interest point based methods for image-based matching, object recognition and related visual tasks.

• 226.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Image matching using generalized scale-space interest points. 2013. In: Scale Space and Variational Methods in Computer Vision: 4th International Conference, SSVM 2013, Schloss Seggau, Leibnitz, Austria, June 2-6, 2013, Proceedings / [ed] A. Kuijper et al, Springer Berlin/Heidelberg, 2013, Vol. 7893, pp. 355-367. Conference paper (Peer-reviewed)

The performance of matching and object recognition methods based on interest points depends on both the properties of the underlying interest points and the associated image descriptors. This paper demonstrates the advantages of using generalized scale-space interest point detectors when computing image descriptors for image-based matching. These generalized scale-space interest points are based on linking of image features over scale and scale selection by weighted averaging along feature trajectories over scale and allow for a higher ratio of correct matches and a lower ratio of false matches compared to previously known interest point detectors within the same class. Specifically, it is shown how a significant increase in matching performance can be obtained in relation to the underlying interest point detectors in the SIFT and the SURF operators. We propose that these generalized scale-space interest points when accompanied by associated scale-invariant image descriptors should allow for better performance of interest point based methods for image-based matching, object recognition and related vision tasks.

• 227.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Invariance of visual operations at the level of receptive fields. 2013. Conference paper (Peer-reviewed)

Receptive field profiles measured by cell recordings have shown that mammalian vision has developed receptive fields tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time [1, 2]. This article presents a theory by which families of idealized receptive field profiles can be derived mathematically from a small set of basic assumptions that correspond to structural properties of the environment [3, 4]. The article also presents a theory for how basic invariance properties to variations in scale, viewing direction and relative motion can be obtained from the output of such receptive fields, using complementary selection mechanisms that operate over the output of families of receptive fields tuned to different parameters [4]. Thereby, the theory shows how basic invariance properties of a visual system can be obtained already at the level of receptive fields, and we can explain the different shapes of receptive field profiles found in biological vision from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

Model.

The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. These transformations comprise (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer and (iv) local multiplicative intensity transformations caused by illumination variations. Let us assume that receptive fields should be constructed by linear operations that are shift-invariant over space and/or space-time, with an additional requirement that receptive fields must not create new image structures at coarser scales that do not correspond to simplifications of corresponding structures at finer scales.

Results.

Given the above structural conditions, we derive idealized families of spatial and spatio-temporal receptive fields that satisfy these structural requirements by necessity, based on Gaussian kernels, Gaussian derivatives or closely related operators [3, 4].  We show that there are very close similarities between the receptive fields predicted from this theory and receptive fields found by cell recordings in biological vision, including (i) spatial on-center-off-surround and off-center-on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) space-time separable spatio-temporal receptive fields in the LGN and V1 and (iv) non-separable space-time tilted receptive fields in V1 [3, 4]. Indeed, from kernels predicted by this theory it is possible to generate receptive fields similar to all the basic types of monocular receptive fields reported by DeAngelis et al [2] in their survey of classical receptive fields.

By complementing such receptive field measurements with selection mechanisms over the parameters in the receptive field families, we show how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations [4]. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields. In this way, the presented theory supports invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.

References.

1. Hubel DH, Wiesel TN: Brain and Visual Perception, Oxford University Press, 2005.

2. DeAngelis GC, Anzai A: A modern view of the classical receptive field: Linear and non-linear spatio-temporal processing by V1 neurons. The Visual Neurosciences, MIT Press, vol 1, 705-719, 2004.

3. Lindeberg T: Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space. J Math Imaging Vis, 2011, 40(1):36-81.

4. Lindeberg T: Invariance of visual operations at the level of receptive fields. PLOS One, in press, doi:10.1371/journal.pone.0066990, preprint available from arXiv:1210.0754.

• 228.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Invariance of visual operations at the level of receptive fields. 2013. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 8, no. 7, pp. e66990-1-e66990-33. Journal article (Peer-reviewed)

The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to ideal to what is motivated by the idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination. 
The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

• 229.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Junction detection with automatic selection of detection scales and localization scales. 1994. In: Proc. 1st International Conference on Image Processing: ICIP'94 (Austin, Texas), 1994, pp. I:924-928. Conference paper (Refereed)

The subject of scale selection is essential to many aspects of multi-scale and multi-resolution processing of image data. This article shows how a general heuristic principle for scale selection can be applied to the problem of detecting and localizing junctions. In a first uncommitted processing step initial hypotheses about interesting scale levels (and regions of interest) are generated from scales where normalized differential invariants assume maxima over scales (and space). Then, based on this scale (and region) information, a more refined processing stage is invoked tuned to the task at hand. The resulting method is the first junction detector with automatic scale selection.

Whereas this article deals with the specific problem of junction detection, the underlying ideas apply also to other types of differential feature detectors, such as blob detectors, edge detectors, and ridge detectors.

• 230.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Linear spatio-temporal scale-space. 2001. Report (Other academic)

This article shows how a linear scale-space formulation previously expressed for spatial domains extends to spatio-temporal data. Starting from the main assumptions that: (i) the scale-space should be generated by convolution with a semi-group of filter kernels and that (ii) local extrema must not be enhanced when the scale parameter increases, a complete taxonomy is given of the linear scale-space concepts that satisfy these conditions on spatial, temporal and spatio-temporal domains, including the cases with continuous as well as discrete data.

Key aspects captured by this theory include that: (i) time-causal scale-space kernels must not extend into the future, and (ii) filter shapes can be tuned from specific context information, permitting mechanisms such as local shifting, shape adaptation and velocity adaptation, all expressed in terms of local diffusion operations.

Receptive field profiles generated by the proposed theory show high qualitative similarities to receptive field profiles recorded from biological vision.

• 231.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Linear spatio-temporal scale-space. 1997. In: Scale-Space Theory in Computer Vision: Proceedings of First International Conference, Scale-Space'97 Utrecht, The Netherlands, July 2–4, 1997, Springer Berlin/Heidelberg, 1997, Vol. 1252, pp. 113-127. Conference paper (Refereed)

This article shows how a linear scale-space formulation previously expressed for spatial domains extends to spatio-temporal data. Starting from the main assumptions that: (i) the scale-space should be generated by convolution with a semi-group of filter kernels and that (ii) local extrema must not be enhanced when the scale parameter increases, a complete taxonomy is given of the linear scale-space concepts that satisfy these conditions on spatial, temporal and spatio-temporal domains, including the cases with continuous as well as discrete data.

Key aspects captured by this theory include that: (i) time-causal scale-space kernels must not extend into the future, and (ii) filter shapes can be tuned from specific context information, permitting mechanisms such as local shifting, shape adaptation and velocity adaptation, all expressed in terms of local diffusion operations.

Receptive field profiles generated by the proposed theory show high qualitative similarities to receptive field profiles recorded from biological vision.

• 232.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
Normative theory of visual receptive fields. 2017. Report (Other academic)

This article gives an overview of a normative computational theory of visual receptive fields, by which idealized shapes of early spatial, spatio-chromatic and spatio-temporal receptive fields can be derived in an axiomatic way based on structural properties of the environment in combination with assumptions about the internal structure of a vision system to guarantee consistent handling of image representations over multiple spatial and temporal scales. Interestingly, this theory leads to predictions about visual receptive field shapes with qualitatively very good similarity to biological receptive fields measured in the retina, the LGN and the primary visual cortex (V1) of mammals.

• 233.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
On Automatic Selection of Temporal Scales in Time-Causal Scale-Space. 1997. In: Proceedings of the Algebraic Frames for the Perception-Action Cycle: AFPAC'97 (Kiel, Germany), 1997, Vol. 1315, pp. 94-113. Conference paper (Refereed)

This paper outlines a general framework for automatic selection in multi-scale representations of temporal and spatio-temporal data. A general principle for automatic scale selection based on local maxima of normalized differential entities is adapted to the temporal domain, and it is shown how the notion of normalized derivatives can be defined for three main types of (continuous and discrete) temporal scale-space representations. Closed-form analysis is carried out for basic model patterns, and shows how the suggested theory applies to motion detection and motion estimation.

• 234.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
On scale selection for differential operators. 1993. In: Proc. 8th Scandinavian Conference on Image Analysis (Tromsø, Norway), May 1993: SCIA'93, 1993, pp. 857-866. Conference paper (Refereed)

Although traditional scale-space theory provides a well-founded framework for dealing with image structures at different scales, it does not directly address the problem of how to select appropriate scales for further analysis.

This paper describes a systematic approach for dealing with this problem: a heuristic principle stating that local extrema over scales of different combinations of normalized scale invariant derivatives are likely candidates to correspond to interesting structures. Support is given by theoretical considerations and experiments on real and synthetic data.

The resulting methodology lends itself naturally to two-stage algorithms; feature detection at coarse scales followed by feature localization at finer scales. Experiments on blob detection, junction detection and edge detection demonstrate that the proposed method gives intuitively reasonable results.

• 235.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
On the axiomatic foundations of linear scale-space: Combining semi-group structure with causality vs. scale invariance. 1996. In: Gaussian Scale-Space Theory: Proceedings of PhD School on Scale-Space (Copenhagen, Denmark) May 1996 / [ed] J. Sporring, M. Nielsen, L. Florack and P. Johansen, Kluwer Academic Publishers, 1996. Chapter in book, part of anthology (Refereed)

The notion of multi-scale representation is essential to many aspects of early visual processing. This article deals with the axiomatic formulation of the special type of multi-scale representation known as scale-space representation. Specifically, this work is concerned with the problem of how different choices of basic assumptions (scale-space axioms) restrict the class of permissible smoothing operations.

A scale-space formulation previously expressed for discrete signals is adapted to the continuous domain. The basic assumptions are that the scale-space family should be generated by convolution with a one-parameter family of rotationally symmetric smoothing kernels that satisfy a semi-group structure and obey a causality condition expressed as a non-enhancement requirement of local extrema. Under these assumptions, it is shown that the smoothing kernel is uniquely determined to be a Gaussian.

Relations between this scale-space formulation and recent formulations based on scale invariance are explained in detail. Connections are also pointed out to approaches based on non-uniform smoothing.
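
As a brief illustration of the semi-group structure mentioned above, the following derivation checks that one-dimensional Gaussian kernels do satisfy it. This is only the easy direction (Gaussians form a semi-group), not the uniqueness result of the paper, and it assumes the common parameterization in which the scale parameter t is the variance of the kernel.

```latex
% Assumed parameterization: t is the variance of the one-dimensional kernel.
g(x; t) = \frac{1}{\sqrt{2\pi t}}\, e^{-x^2/(2t)}, \qquad
\hat{g}(\omega; t) = \int_{-\infty}^{\infty} g(x; t)\, e^{-i\omega x}\, dx = e^{-\omega^2 t/2}.

% Convolution becomes multiplication in the Fourier domain:
\hat{g}(\omega; s)\, \hat{g}(\omega; t) = e^{-\omega^2 (s+t)/2} = \hat{g}(\omega; s+t)
\quad\Longrightarrow\quad
g(\cdot; s) * g(\cdot; t) = g(\cdot; s+t).
```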

• 236.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
On the behaviour in scale-space of local extrema and blobs. 1991. In: Theory and Applications of Image Analysis: Selected Papers from the 7th Scandinavian Conference on Image Analysis (Aalborg, Denmark, 1991) / [ed] P. Johansen and S. Olsen, World Scientific, 1991, pp. 38-47. Chapter in book, part of anthology (Refereed)

We apply elementary techniques from real analysis and singularity theory to derive analytical results for the behaviour in scale-space of critical points and related entities. The main results of the treatment comprise:

• a description of the general nature of trajectories of critical points in scale-space.
• an estimation of the drift velocity of critical points and edges.
• an analysis of the qualitative behaviour of critical points in bifurcation situations.
• a classification of what types of blob bifurcations are possible.
• 237.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
On the Construction of a Scale-Space for Discrete Images. 1988. Report (Other academic)

In this paper we address the formulation of a scale-space theory for discrete images. We denote a one-dimensional kernel a scale-space kernel if it reduces the number of local extrema and discuss which discrete kernels are possible scale-space kernels. Unimodality and positivity properties are shown to hold for such kernels as well as their Fourier transforms. An explicit expression characterizing all discrete scale-space kernels is given.

We propose that there is only one reasonable way to define a scale-space family of images L(x; t) for a one-dimensional discrete signal f(x), namely by convolution with the family of discrete kernels T(n; t) = e^{-t} I_n(t), where I_n is the modified Bessel function of order n.

With this representation, comprising a continuous scale parameter, we are no longer restricted to specific predetermined levels of scale. Further, T(n; t) appears naturally in the solution of a discretized version of the heat equation, both in one and two dimensions.

The family T(n; t) (t >= 0) is the only one-parameter family of discrete symmetric shift-invariant kernels satisfying both necessary scale-space requirements and the semigroup property T(n; s) * T(n; t) = T(n; s+t). Similar arguments applied in the continuous case uniquely lead to the family of Gaussian kernels.

The commonly adopted technique with a sampled Gaussian produces undesirable effects. It is shown that scale-space violations might occur in the family of functions generated by convolution with the sampled Gaussian kernel. The result exemplifies that properties derived in the continuous case might be violated after discretization.

The numerical implementation is discussed and an algorithm generating the filter coefficients is supplied.
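
The kernel family described in this abstract can be explored numerically. The sketch below is not the algorithm supplied in the report. It is a minimal illustration that evaluates T(n; t) = e^{-t} I_n(t) from the power series of the modified Bessel function and checks unit normalization and the semigroup property on truncated kernels. The helper names (`bessel_i`, `discrete_gaussian`, `convolve`), the truncation radius and the number of series terms are assumptions chosen for this example.

```python
import math


def bessel_i(n: int, t: float, terms: int = 60) -> float:
    """Modified Bessel function I_n(t) of integer order, from its power series."""
    n = abs(n)  # I_{-n}(t) = I_n(t) for integer n
    term = (t / 2.0) ** n / math.factorial(n)  # k = 0 term of the series
    total = term
    for k in range(1, terms):
        term *= (t / 2.0) ** 2 / (k * (k + n))
        total += term
    return total


def discrete_gaussian(t: float, radius: int) -> list[float]:
    """Samples T(n; t) = exp(-t) * I_n(t) for n = -radius..radius."""
    return [math.exp(-t) * bessel_i(n, t) for n in range(-radius, radius + 1)]


def convolve(a: list[float], b: list[float]) -> list[float]:
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


radius = 20
s, t = 0.7, 1.3
ks, kt = discrete_gaussian(s, radius), discrete_gaussian(t, radius)
kst = discrete_gaussian(s + t, 2 * radius)

# The truncated kernel sums to one up to rounding and truncation error.
print(abs(sum(kt) - 1.0) < 1e-9)
# Semigroup property: T(.; s) * T(.; t) agrees with T(.; s + t).
print(max(abs(x - y) for x, y in zip(convolve(ks, kt), kst)) < 1e-9)
```

The truncation radius only needs to be large enough that the discarded tails are negligible at the chosen scales. For larger t a wider window would be required.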

• 238.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Principles for Automatic Scale Selection. 1999. In: Handbook on Computer Vision and Applications: volume II / [ed] Bernd Jähne, Academic Press, 1999, pp. 239-274. Chapter in book, part of anthology (Other academic)

An inherent property of objects in the world is that they only exist as meaningful entities over certain ranges of scale. If one aims at describing the structure of unknown real-world signals, then a multi-scale representation of data is of crucial importance. Whereas conventional scale-space theory provides a well-founded framework for dealing with image structures at different scales, this theory does not directly address the problem of how to select appropriate scales for further analysis. This article outlines a systematic methodology of how mechanisms for automatic scale selection can be formulated in the problem domains of feature detection and image matching (flow estimation), respectively.

For feature detectors expressed in terms of Gaussian derivatives, hypotheses about interesting scale levels can be generated from scales at which normalized measures of feature strength assume local maxima with respect to scale. It is shown how the notion of $\gamma$-normalized derivatives arises by necessity given the requirement that the scale selection mechanism should commute with rescalings of the image pattern. Specifically, it is worked out in detail how feature detection algorithms with automatic scale selection can be formulated for the problems of edge detection, blob detection, junction detection, ridge detection and frequency estimation. A general property of this scheme is that the selected scale levels reflect the size of the image structures.

When estimating image deformations, such as in image matching and optic flow computations, scale levels with associated deformation estimates can be selected from the scales at which normalized measures of uncertainty assume local minima with respect to scales. It is shown how an integrated scale selection and flow estimation algorithm has the qualitative properties of leading to the selection of coarser scales for larger size image structures and increasing noise level, whereas it leads to the selection of finer scales in the neighbourhood of flow field discontinuities.
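
The statement that selected scale levels reflect the size of the image structures can be illustrated with a closed-form toy case. The sketch below assumes a two-dimensional Gaussian blob of variance t0 as the image pattern and uses the scale-normalized Laplacian with gamma = 1 as the feature strength measure; both are assumptions made for this example. Smoothing the blob with a Gaussian of variance t gives a Gaussian of variance t0 + t, so the normalized response at the blob centre has magnitude t / (pi (t0 + t)^2), which is maximal at t = t0. The function names and the sampling of the scale axis are illustrative.

```python
import math


def normalized_laplacian_at_centre(t: float, t0: float) -> float:
    """Magnitude of t * Laplacian(L)(0; t) for a 2-D Gaussian blob of variance t0.

    Smoothing the blob g(x; t0) with a Gaussian kernel of variance t gives
    g(x; t0 + t), whose Laplacian at the origin is -1 / (pi * (t0 + t)^2).
    """
    return t / (math.pi * (t0 + t) ** 2)


def select_scale(t0: float, scales: list[float]) -> float:
    """Returns the candidate scale at which the normalized response is largest."""
    return max(scales, key=lambda t: normalized_laplacian_at_centre(t, t0))


# Logarithmically sampled candidate scales from 0.1 to roughly 1000.
scales = [0.1 * 1.05 ** i for i in range(190)]
for t0 in (1.0, 4.0, 16.0):
    # The selected scale should lie close to the blob variance t0.
    print(t0, select_scale(t0, scales))
```

Because the candidate scales are sampled with a ratio of 1.05, the selected value can differ from t0 by a few percent. A finer sampling or interpolation over scale would tighten the estimate.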

• 239.
KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
Provably scale-covariant continuous hierarchical networks based on scale-normalized differential expressions coupled in cascade. 2019. Report (Other academic)

This article presents a theory for constructing hierarchical networks in such a way that the networks are guaranteed to be provably scale covariant. We first present a general sufficiency argument for obtaining scale covariance, which holds for a wide class of networks defined from linear and non-linear differential expressions expressed in terms of scale-normalized scale-space derivatives. Then, we present a more detailed development of one example of such a network constructed from a combination of mathematically derived models of receptive fields and biologically inspired computations. Based on a functional model of complex cells in terms of an oriented quasi quadrature combination of first- and second-order directional Gaussian derivatives, we couple such primitive computations in cascade over combinatorial expansions over image orientations. Scale-space properties of the computational primitives are analysed and we give explicit proofs of how the resulting representation allows for scale and rotation covariance. A prototype application to texture analysis is developed and it is demonstrated that a simplified mean-reduced representation of the resulting QuasiQuadNet leads to promising experimental results on three texture datasets.

• 240.
KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
Provably scale-covariant networks from oriented quasi quadrature measures in cascade. 2019. In: Scale Space and Variational Methods in Computer Vision / [ed] M. Burger, J. Lellmann and J. Modersitzki, Springer Berlin/Heidelberg, 2019, Vol. 11603, pp. 328-340. Conference paper (Refereed)

This article presents a continuous model for hierarchical networks based on a combination of mathematically derived models of receptive fields and biologically inspired computations.

Based on a functional model of complex cells in terms of an oriented quasi quadrature combination of first- and second-order directional Gaussian derivatives, we couple such primitive computations in cascade over combinatorial expansions over image orientations. Scale-space properties of the computational primitives are analysed and it is shown that the resulting representation allows for provable scale and rotation covariance.

A prototype application to texture analysis is developed and it is demonstrated that a simplified mean-reduced representation of the resulting QuasiQuadNet leads to promising experimental results on three texture datasets.

• 241.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale invariant feature transform. 2012. Other (Refereed)

Scale Invariant Feature Transform (SIFT) is an image descriptor for image-based matching developed by David Lowe (1999, 2004). This descriptor as well as related image descriptors are used for a large number of purposes in computer vision related to point matching between different views of a 3-D scene and view-based object recognition. The SIFT descriptor is invariant to translations, rotations and scaling transformations in the image domain and robust to moderate perspective transformations and illumination variations. Experimentally, the SIFT descriptor has been proven to be very useful in practice for robust image matching and object recognition under real-world conditions.

In its original formulation, the SIFT descriptor comprised a method for detecting interest points from a grey-level image at which statistics of local gradient directions of image intensities were accumulated to give a summarizing description of the local image structures in a local neighbourhood around each interest point, with the intention that this descriptor should be used for matching corresponding interest points between different images. Later, the SIFT descriptor has also been applied at dense grids (dense SIFT) which have been shown to lead to better performance for tasks such as object categorization and texture classification. The SIFT descriptor has also been extended from grey-level to colour images and from 2-D spatial images to 2+1-D spatio-temporal video.

• 242.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale selection. 2014. In: Computer Vision: A Reference Guide / [ed] Katsushi Ikeuchi, Springer US, 2014, pp. 701-713. Chapter in book, part of anthology (Refereed)

The notion of scale selection refers to methods for estimating characteristic scales in image data and for automatically determining locally appropriate scales in a scale-space representation, so as to adapt subsequent processing to the local image structure and compute scale invariant image features and image descriptors.

An essential aspect of the approach is that it allows for a bottom-up determination of inherent scales of features and objects without first recognizing them or delimiting, alternatively segmenting, them from their surroundings.

Scale selection methods have also been developed from other viewpoints of performing noise suppression and exploring top-down information.

• 243.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale Selection Properties of Generalized Scale-Space Interest Point Detectors. 2013. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 46, no. 2, pp. 177-210. Article in journal (Refereed)

Scale-invariant interest points have found several highly successful applications in computer vision, in particular for image-based matching and recognition. This paper presents a theoretical analysis of the scale selection properties of a generalized framework for detecting interest points from scale-space features presented in Lindeberg (Int. J. Comput. Vis. 2010, under revision) and comprising: an enriched set of differential interest operators at a fixed scale including the Laplacian operator, the determinant of the Hessian, the new Hessian feature strength measures I and II and the rescaled level curve curvature operator, as well as an enriched set of scale selection mechanisms including scale selection based on local extrema over scale, complementary post-smoothing after the computation of non-linear differential invariants and scale selection based on weighted averaging of scale values along feature trajectories over scale. A theoretical analysis of the sensitivity to affine image deformations is presented, and it is shown that the scale estimates obtained from the determinant of the Hessian operator are affine covariant for an anisotropic Gaussian blob model. Among the other purely second-order operators, the Hessian feature strength measure I has the lowest sensitivity to non-uniform scaling transformations, followed by the Laplacian operator and the Hessian feature strength measure II. The predictions from this theoretical analysis agree with experimental results of the repeatability properties of the different interest point detectors under affine and perspective transformations of real image data. A number of less complete results are derived for the level curve curvature operator.

• 244.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-Space. 2009. In: Wiley Encyclopedia of Computer Science and Engineering / [ed] Benjamin Wah, Hoboken, New Jersey: John Wiley & Sons, 2009, pp. 2495-2504. Chapter in book, part of anthology (Refereed)

Scale-space theory is a framework for multiscale image representation, which has been developed by the computer vision community with complementary motivations from physics and biological vision. The idea is to handle the multiscale nature of real-world objects, which implies that objects may be perceived in different ways depending on the scale of observation. If one aims to develop automatic algorithms for interpreting images of unknown scenes, no way exists to know a priori what scales are relevant. Hence, the only reasonable approach is to consider representations at all scales simultaneously. From axiomatic derivations it has been shown that given the requirement that coarse-scale representations should correspond to true simplifications of fine-scale structures, convolution with Gaussian kernels and Gaussian derivatives is singled out as a canonical class of image operators for the earliest stages of visual processing. These image operators can be used as a basis to solve a large variety of visual tasks, including feature detection, feature classification, stereo matching, motion descriptors, shape cues, and image-based recognition. By complementing scale-space representation with a module for automatic scale selection based on the maximization of normalized derivatives over scales, early visual modules can be made scale invariant. In this way, visual modules can adapt automatically to the unknown scale variations that may occur because of objects and substructures of varying physical size as well as objects with varying distances to the camera. An interesting similarity to biological vision is that the scale-space operators closely resemble receptive field profiles registered in neurophysiological studies of the mammalian retina and visual cortex.

• 245.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-Space Behaviour and Invariance Properties of Differential Singularities. 1992. In: Shape in Picture: Mathematical Description of Shape in Grey-Level Images: Proc. of Workshop in Driebergen, Netherlands, Sep. 7–11, 1992, Springer, 1992, pp. 591-600. Conference paper (Refereed)

This article describes how a certain way of expressing low-level feature detectors, in terms of singularities of differential expressions defined at multiple scales in scale-space, simplifies the analysis of the effect of smoothing. It is shown how such features can be related across scales, and generally valid expressions for drift velocities are derived with examples concerning edges, junctions, Laplacean zero-crossings, and blobs. A number of invariance properties are pointed out, and a particular representation defined from such singularities, the scale-space primal sketch, is treated in more detail.

• 246.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-space for discrete images. 1989. In: Scandinavian Conference on Image Analysis: SCIA'89 (Oulu, Finland), 1989, pp. 1098-1107. Conference paper (Refereed)

This article addresses the formulation of a scale-space theory for one-dimensional discrete images. Two main subjects are treated:

1. Which linear transformations remove structure in the sense that the number of local extrema (or zero-crossings) in the output image does not exceed the number of local extrema (or zero-crossings) in the original image?
2. How should one create a multi-resolution family of representations with the property that an image at a coarser level of scale never contains more structure than an image at a finer level of scale?

We propose that there is only one reasonable way to define a scale-space for discrete images comprising a continuous scale parameter, namely by (discrete) convolution with the family of kernels T(n; t) = e^{-t} I_n(t), where I_n are the modified Bessel functions of integer order. Similar arguments applied in the continuous case uniquely lead to the Gaussian kernel.

Some obvious discretizations of the continuous scale-space theory are discussed in view of the results presented. An important result is that scale-space violations might occur in the family of representations generated by discrete convolution with the sampled Gaussian kernel.

• 247.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-space for discrete signals. 1990. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 12, no. 3, pp. 234-254. Article in journal (Refereed)

This article addresses the formulation of a scale-space theory for discrete signals. In one dimension it is possible to characterize the smoothing transformations completely and an exhaustive treatment is given, answering the following two main questions:

• Which linear transformations remove structure in the sense that the number of local extrema (or zero-crossings) in the output signal does not exceed the number of local extrema (or zero-crossings) in the original signal?
• How should one create a multi-resolution family of representations with the property that a signal at a coarser level of scale never contains more structure than a signal at a finer level of scale?

It is proposed that there is only one reasonable way to define a scale-space for 1D discrete signals comprising a continuous scale parameter, namely by (discrete) convolution with the family of kernels T(n; t) = e^{-t} I_n(t), where I_n are the modified Bessel functions of integer order. Similar arguments applied in the continuous case uniquely lead to the Gaussian kernel.

Some obvious discretizations of the continuous scale-space theory are discussed in view of the results presented. It is shown that the kernel T(n; t) arises naturally in the solution of a discretized version of the diffusion equation. The commonly adopted technique with a sampled Gaussian can lead to undesirable effects since scale-space violations might occur in the corresponding representation. The result exemplifies the fact that properties derived in the continuous case might be violated after discretization.

A two-dimensional theory, showing how the scale-space should be constructed for images, is given based on the requirement that local extrema must not be enhanced, when the scale parameter is increased continuously. In the separable case the resulting scale-space representation can be calculated by separated convolution with the kernel T(n; t).

The presented discrete theory has computational advantages compared to a scale-space implementation based on the sampled Gaussian, for instance concerning the Laplacian of the Gaussian. The main reason is that the discrete nature of the implementation has been taken into account already in the theoretical formulation of the scale-space representation.

• 248.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-Space for N-dimensional discrete signals. 1992. In: Shape in Picture: Mathematical Description of Shape in Grey-Level Images: Proc. of Workshop in Driebergen, Netherlands, Sep. 7–11, 1992, Springer, 1992, pp. 571-590. Conference paper (Refereed)

This article shows how a (linear) scale-space representation can be defined for discrete signals of arbitrary dimension. The treatment is based upon the assumptions that (i) the scale-space representation should be defined by convolving the original signal with a one-parameter family of symmetric smoothing kernels possessing a semi-group property, and (ii) local extrema must not be enhanced when the scale parameter is increased continuously.

It is shown that given these requirements the scale-space representation must satisfy the differential equation $\partial_t L = A L$ for some linear and shift invariant operator A satisfying locality, positivity, zero sum, and symmetry conditions. Examples in one, two, and three dimensions illustrate that this corresponds to natural semi-discretizations of the continuous (second-order) diffusion equation using different discrete approximations of the Laplacean operator. In a special case the multi-dimensional representation is given by convolution with the one-dimensional discrete analogue of the Gaussian kernel along each dimension.
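
A one-dimensional instance of such a semi-discretized diffusion equation can be checked numerically. The sketch below assumes that A is half the three-point second-difference operator, with coefficients (1/2, -1, 1/2), which is one possible choice satisfying the locality, zero sum and symmetry conditions. It integrates the equation from a unit impulse with explicit Euler steps and compares the result with the kernel e^{-t} I_n(t). The function names, window radius and step size are assumptions made for this example.

```python
import math


def bessel_i(n: int, t: float, terms: int = 60) -> float:
    """Modified Bessel function I_n(t) of integer order, from its power series."""
    n = abs(n)  # I_{-n}(t) = I_n(t) for integer n
    term = (t / 2.0) ** n / math.factorial(n)  # k = 0 term of the series
    total = term
    for k in range(1, terms):
        term *= (t / 2.0) ** 2 / (k * (k + n))
        total += term
    return total


def diffuse(signal: list[float], t_end: float, dt: float = 1e-3) -> list[float]:
    """Explicit Euler steps of dL/dt = 0.5 * (L[n-1] - 2*L[n] + L[n+1])."""
    values = list(signal)
    for _ in range(round(t_end / dt)):
        padded = [0.0] + values + [0.0]  # samples outside the window count as zero
        values = [
            v + dt * 0.5 * (padded[i] - 2.0 * v + padded[i + 2])
            for i, v in enumerate(values)
        ]
    return values


radius, t_end = 30, 2.0
impulse = [1.0 if n == 0 else 0.0 for n in range(-radius, radius + 1)]
numeric = diffuse(impulse, t_end)
closed_form = [math.exp(-t_end) * bessel_i(n, t_end) for n in range(-radius, radius + 1)]

# Largest deviation between the integrated equation and exp(-t) * I_n(t).
max_error = max(abs(a - b) for a, b in zip(numeric, closed_form))
print(max_error < 1e-3)
```

Explicit Euler is used only for brevity. Its error shrinks linearly with the step size, so a smaller `dt` or a higher-order integrator would bring the two results closer.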

• 249.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-space theory. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Refereed)

Scale-space theory

A theory of multi-scale representation of sensory data developed by the image processing and computer vision communities. The purpose is to represent signals at multiple scales in such a way that fine scale structures are successively suppressed, and a scale parameter $t$ is associated with each level in the multi-scale representation.

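For a given signal $f \colon \mathbb{R}^N \to \mathbb{R}$, a linear scale-space representation is a family of derived signals $L \colon \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}$, defined by $L(\cdot; 0) = f$ and

$$L(\cdot; t) = h(\cdot; t) * f$$

for some family $h \colon \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}$ of convolution kernels [a1], [a2] (cf. also Integral equation of convolution type). An essential requirement on the scale-space family $L$ is that the representation at a coarse scale constitutes a simplification of the representations at finer scales. Several different ways of formalizing this requirement about non-creation of new structures with increasing scales show that the Gaussian kernel

$$g(x; t) = \frac{1}{(2 \pi t)^{N/2}} \exp\left( -\frac{x_1^2 + \dots + x_N^2}{2t} \right)$$

constitutes a canonical choice for generating a scale-space representation [a3], [a4], [a5], [a6]. Equivalently, the scale-space family satisfies the diffusion equation

$$\partial_t L = \frac{1}{2} \nabla^2 L.$$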
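
The link between Gaussian smoothing and the diffusion equation can be checked numerically. The sketch below assumes the one-dimensional case and the common parameterization in which the scale parameter t is the variance of the Gaussian kernel, so that the equation reads d/dt g = (1/2) d^2/dx^2 g. It estimates both sides with central differences. The function names, sample points and step size are illustrative choices, not part of the encyclopaedia entry.

```python
import math


def gaussian(x: float, t: float) -> float:
    """1-D Gaussian kernel with variance t (the scale parameter)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def diffusion_residual(x: float, t: float, h: float = 1e-3) -> float:
    """Central-difference estimate of d/dt g - 0.5 * d^2/dx^2 g at (x, t)."""
    dg_dt = (gaussian(x, t + h) - gaussian(x, t - h)) / (2.0 * h)
    d2g_dx2 = (gaussian(x + h, t) - 2.0 * gaussian(x, t) + gaussian(x - h, t)) / (h * h)
    return dg_dt - 0.5 * d2g_dx2


# Evaluate the residual at a few positions and scales.
residuals = [
    abs(diffusion_residual(x, t))
    for x in (-2.0, -0.5, 0.0, 1.0, 3.0)
    for t in (0.5, 1.0, 4.0)
]
# The residual should vanish up to finite-difference error.
print(max(residuals) < 1e-4)
```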

The motivation for generating a scale-space representation of a given data set originates from the basic fact that real-world objects are composed of different structures at different scales and may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a machine vision system analyzing an unknown scene, there is no way to know what scales are appropriate for describing the data. Thus, the only reasonable approach is to consider descriptions at all scales simultaneously [a1], [a2].

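From the scale-space representation, at any level of scale one can define scale-space derivatives by

$$L_{x^\alpha}(\cdot; t) = \partial_{x^\alpha} L(\cdot; t) = (\partial_{x^\alpha} g(\cdot; t)) * f,$$

where $\alpha = (\alpha_1, \dots, \alpha_N)$ and $\partial_{x^\alpha} = \partial_{x_1^{\alpha_1}} \cdots \partial_{x_N^{\alpha_N}}$ constitute multi-index notation for the derivative operator $\partial_{x^\alpha}$, and $g$ denotes the Gaussian kernel. Such Gaussian derivative operators provide a compact way to characterize the local image structure around a certain image point at any scale. Specifically, the output from scale-space derivatives can be combined into multi-scale differential invariants, to serve as feature detectors (see Edge detection and Corner detection for two examples).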

More generally, a scale-space representation with its Gaussian derivative operators can serve as a basis for expressing a large number of early visual operations, including feature detection, stereo matching, computation of motion descriptors and the computation of cues to surface shape [a3], [a4]. Neuro-physiological studies have shown that there are receptive field profiles in the mammalian retina and visual cortex, which can be well modeled by the scale-space framework [a7].

Pyramid representation [a8] is a predecessor to scale-space representation, constructed by simultaneously smoothing and subsampling a given signal. In this way, computationally highly efficient algorithms can be obtained. A problem noted with pyramid representations, however, is that it is usually algorithmically hard to relate structures at different scales, due to the discrete nature of the scale levels. In a scale-space representation, the existence of a continuous scale parameter makes it conceptually much easier to express this deep structure [a2]. For features defined as zero-crossings of differential invariants, the implicit function theorem (cf. Implicit function) directly defines trajectories across scales, and at those scales where a bifurcation occurs, the local behaviour can be modeled by singularity theory [a3], [a5].
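The smoothing-and-subsampling construction of a pyramid can be sketched in one dimension as below. The 5-tap binomial kernel, the clamping at the borders, and the names `reduce_level` and `build_pyramid` are assumptions chosen for illustration; the source does not specify a particular filter.

```python
def reduce_level(signal):
    """Smooth with the binomial kernel [1, 4, 6, 4, 1] / 16, then keep every second sample."""
    kernel = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k in range(-2, 3):
            j = min(max(i + k, 0), n - 1)  # clamp indices at the borders (assumption)
            acc += kernel[k + 2] * signal[j]
        smoothed.append(acc)
    return smoothed[::2]

def build_pyramid(signal, levels):
    """Return [level 0, level 1, ...], where level 0 is the input signal."""
    pyramid = [list(signal)]
    for _ in range(levels):
        if len(pyramid[-1]) < 2:
            break  # nothing left to subsample
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

if __name__ == "__main__":
    levels = build_pyramid([float(i % 7) for i in range(64)], 3)
    print([len(level) for level in levels])
```

Each level halves the number of samples, which is where the efficiency comes from. It also means the scale levels are discrete and sampled on different grids, which is what makes it harder to relate structures across levels.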

Extensions of linear scale-space theory concern the formulation of non-linear scale-space concepts more committed to specific purposes [a9]. There are strong relations between scale-space theory and wavelet theory (cf. also Wavelet analysis), although these two notions of multi-scale representation have been developed from slightly different premises.

References

[a1] A.P. Witkin, "Scale-space filtering", Proc. 8th Internat. Joint Conf. Art. Intell., Karlsruhe, West Germany, Aug. 1983 (1983) pp. 1019–1022

[a2] J.J. Koenderink, "The structure of images", Biological Cybernetics, 50 (1984) pp. 363–370

[a3] T. Lindeberg, "Scale-space theory in computer vision", Kluwer Acad. Publ. (1994)

[a4] L.M.J. Florack, "Image structure", Kluwer Acad. Publ. (1997)

[a5] J. Sporring, et al., "Gaussian scale-space theory", Kluwer Acad. Publ. (1997)

[a6] B.M. ter Haar Romeny, et al., "Proc. First Internat. Conf. scale-space", Lecture Notes in Computer Science, 1252, Springer (1997)

[a7] R.A. Young, "The Gaussian derivative model for spatial vision: Retinal mechanisms", Spatial Vision, 2 (1987) pp. 273–293

[a8] P.J. Burt, E.H. Adelson, "The Laplacian Pyramid as a Compact Image Code", IEEE Trans. Commun., 31 : 4 (1983) pp. 532–540

[a9] B.M. ter Haar Romeny (ed.), "Geometry-driven diffusion in computer vision", Kluwer Acad. Publ. (1994)

• 250.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Scale-Space Theory: A Basic Tool for Analysing Structures at Different Scales. 1994. In: Journal of Applied Statistics, ISSN 0266-4763, E-ISSN 1360-0532, Vol. 21, pp. 225-270. Journal article (Peer-reviewed)

An inherent property of objects in the world is that they only exist as meaningful entities over certain ranges of scale. If one aims at describing the structure of unknown real-world signals, then a multi-scale representation of data is of crucial importance.

This article gives a tutorial review of a special type of multi-scale representation, linear scale-space representation, which has been developed by the computer vision community in order to handle image structures at different scales in a consistent manner. The basic idea is to embed the original signal into a one-parameter family of gradually smoothed signals, in which the fine scale details are successively suppressed.

Under rather general conditions on the type of computations that are to be performed at the first stages of visual processing, in what can be termed the visual front end, it can be shown that the Gaussian kernel and its derivatives are singled out as the only possible smoothing kernels. The conditions that specify the Gaussian kernel are, basically, linearity and shift-invariance combined with different ways of formalizing the notion that structures at coarse scales should correspond to simplifications of corresponding structures at fine scales; they should not be accidental phenomena created by the smoothing method. Notably, several different ways of choosing scale-space axioms give rise to the same conclusion.

The output from the scale-space representation can be used for a variety of early visual tasks; operations like feature detection, feature classification and shape computation can be expressed directly in terms of (possibly non-linear) combinations of Gaussian derivatives at multiple scales. In this sense, the scale-space representation can serve as a basis for early vision.

During the last few decades a number of other approaches to multi-scale representations have been developed, which are more or less related to scale-space theory, notably the theories of pyramids, wavelets and multi-grid methods. Despite their qualitative differences, the increasing popularity of each of these approaches indicates that the crucial notion of scale is increasingly appreciated by the computer vision community and by researchers in other related fields.

An interesting similarity with biological vision is that the scale-space operators closely resemble receptive field profiles registered in neurophysiological studies of the mammalian retina and visual cortex.
