251 - 300 of 476
• 251.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-Space. 2009. In: Wiley Encyclopedia of Computer Science and Engineering / [ed] Benjamin Wah, Hoboken, New Jersey: John Wiley & Sons, 2009, pp. 2495-2504. Chapter in book, part of anthology (Refereed)

Scale-space theory is a framework for multiscale image representation, which has been developed by the computer vision community with complementary motivations from physics and biological vision. The idea is to handle the multiscale nature of real-world objects, which implies that objects may be perceived in different ways depending on the scale of observation. If one aims to develop automatic algorithms for interpreting images of unknown scenes, there is no way to know a priori what scales are relevant. Hence, the only reasonable approach is to consider representations at all scales simultaneously. From axiomatic derivations it has been shown that, given the requirement that coarse-scale representations should correspond to true simplifications of fine-scale structures, convolution with Gaussian kernels and Gaussian derivatives is singled out as a canonical class of image operators for the earliest stages of visual processing. These image operators can be used as a basis for solving a large variety of visual tasks, including feature detection, feature classification, stereo matching, motion descriptors, shape cues, and image-based recognition. By complementing scale-space representation with a module for automatic scale selection based on the maximization of normalized derivatives over scales, early visual modules can be made scale invariant. In this way, visual modules can adapt automatically to the unknown scale variations that may occur because of objects and substructures of varying physical size as well as objects with varying distances to the camera. An interesting similarity to biological vision is that the scale-space operators closely resemble receptive field profiles registered in neurophysiological studies of the mammalian retina and visual cortex.
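The scale-selection mechanism summarized in this abstract admits a small closed-form illustration (a sketch of ours, not code from the publication; all function names are hypothetical): for a 1-D Gaussian blob f(x) = g(x; t0), the scale-normalized second-derivative response t |L_xx(0; t)| at the blob centre can be written down analytically, and maximizing it over a grid of scales selects a scale proportional to the blob's own variance (t = 2 t0 for normalization power gamma = 1).

```python
import math

def normalized_response(t, t0):
    """Scale-normalized second-derivative magnitude t * |L_xx(0; t)| at the
    centre of a 1-D Gaussian blob of variance t0.  Smoothing a Gaussian of
    variance t0 with one of variance t yields variance s = t0 + t, and
    g''(0; s) = -(2*pi*s)**(-1/2) / s."""
    s = t0 + t
    return t / (math.sqrt(2.0 * math.pi * s) * s)

def select_scale(t0, scales):
    """Pick the scale maximizing the normalized response (scale selection)."""
    return max(scales, key=lambda t: normalized_response(t, t0))

t0 = 4.0
scales = [0.1 * k for k in range(1, 400)]
t_hat = select_scale(t0, scales)  # the maximum of t*(t0+t)**(-3/2) lies at t = 2*t0
```

Differentiating t (t0 + t)^{-3/2} and setting the derivative to zero gives t = 2 t0, so the selected scale tracks the blob size; this proportionality is what makes the scale-selection-based visual modules described above scale invariant.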

• 252.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-Space Behaviour and Invariance Properties of Differential Singularities. 1992. In: Shape in Picture: Mathematical Description of Shape in Grey-Level Images: Proc. of Workshop in Driebergen, Netherlands, Sep. 7--11, 1992, Springer, 1992, pp. 591-600. Conference paper (Refereed)

This article describes how a certain way of expressing low-level feature detectors, in terms of singularities of differential expressions defined at multiple scales in scale-space, simplifies the analysis of the effect of smoothing. It is shown how such features can be related across scales, and generally valid expressions for drift velocities are derived with examples concerning edges, junctions, Laplacean zero-crossings, and blobs. A number of invariance properties are pointed out, and a particular representation defined from such singularities, the scale-space primal sketch, is treated in more detail.

• 253.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-space for discrete images. 1989. In: Scandinavian Conference on Image Analysis: SCIA'89 (Oulu, Finland), 1989, pp. 1098-1107. Conference paper (Refereed)

This article addresses the formulation of a scale-space theory for one-dimensional discrete images. Two main subjects are treated:

1. Which linear transformations remove structure in the sense that the number of local extrema (or zero-crossings) in the output image does not exceed the number of local extrema (or zero-crossings) in the original image?
2. How should one create a multi-resolution family of representations with the property that an image at a coarser level of scale never contains more structure than an image at a finer level of scale?

We propose that there is only one reasonable way to define a scale-space for discrete images comprising a continuous scale parameter, namely by (discrete) convolution with the family of kernels T(n; t) = e^{-t} I_n(t), where I_n are the modified Bessel functions of integer order. Similar arguments applied in the continuous case uniquely lead to the Gaussian kernel.

Some obvious discretizations of the continuous scale-space theory are discussed in view of the results presented. An important result is that scale-space violations might occur in the family of representations generated by discrete convolution with the sampled Gaussian kernel.
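The kernel in this entry is easy to evaluate numerically; the following is a minimal sketch of ours (not code from the paper, all names hypothetical). T(n; t) = e^{-t} I_n(t) is computed from the power series of the modified Bessel functions, and discrete convolution with it should never increase the number of local extrema, in line with the scale-space property the article establishes.

```python
import math

def bessel_i(n, t, terms=60):
    """Modified Bessel function of integer order via its power series:
    I_n(t) = sum_{m>=0} (t/2)**(2m+n) / (m! * (m+n)!)."""
    return sum((t / 2.0) ** (2 * m + n) / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))

def discrete_gaussian(t, radius):
    """Discrete analogue of the Gaussian: T(n; t) = exp(-t) * I_n(t),
    truncated to |n| <= radius (the untruncated kernel sums exactly to 1)."""
    return [math.exp(-t) * bessel_i(abs(n), t) for n in range(-radius, radius + 1)]

def smooth(signal, kernel):
    """Discrete convolution, treating values outside the signal as zero."""
    r = len(kernel) // 2
    return [sum(w * signal[i + k - r]
                for k, w in enumerate(kernel)
                if 0 <= i + k - r < len(signal))
            for i in range(len(signal))]

def local_extrema(x):
    """Count strict interior local extrema."""
    return sum(1 for i in range(1, len(x) - 1)
               if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0)
```

Smoothing a signal with two superimposed sine components with this kernel leaves far fewer local extrema than the input had, illustrating the non-creation-of-structure property (up to boundary truncation effects, which the theory over an unbounded domain does not cover).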

• 254.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-space for discrete signals. 1990. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 12, no. 3, pp. 234-254. Article in journal (Refereed)

This article addresses the formulation of a scale-space theory for discrete signals. In one dimension it is possible to characterize the smoothing transformations completely and an exhaustive treatment is given, answering the following two main questions:

• Which linear transformations remove structure in the sense that the number of local extrema (or zero-crossings) in the output signal does not exceed the number of local extrema (or zero-crossings) in the original signal?
• How should one create a multi-resolution family of representations with the property that a signal at a coarser level of scale never contains more structure than a signal at a finer level of scale?

It is proposed that there is only one reasonable way to define a scale-space for 1D discrete signals comprising a continuous scale parameter, namely by (discrete) convolution with the family of kernels T(n; t) = e^{-t} I_n(t), where I_n are the modified Bessel functions of integer order. Similar arguments applied in the continuous case uniquely lead to the Gaussian kernel.

Some obvious discretizations of the continuous scale-space theory are discussed in view of the results presented. It is shown that the kernel T(n; t) arises naturally in the solution of a discretized version of the diffusion equation. The commonly adopted technique of using a sampled Gaussian can lead to undesirable effects, since scale-space violations might occur in the corresponding representation. The result exemplifies the fact that properties derived in the continuous case might be violated after discretization.

A two-dimensional theory, showing how the scale-space should be constructed for images, is given based on the requirement that local extrema must not be enhanced when the scale parameter is increased continuously. In the separable case the resulting scale-space representation can be calculated by separated convolution with the kernel T(n; t).

The presented discrete theory has computational advantages compared to a scale-space implementation based on the sampled Gaussian, for instance concerning the Laplacian of the Gaussian. The main reason is that the discrete nature of the implementation has been taken into account already in the theoretical formulation of the scale-space representation.

• 255.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-Space for N-dimensional discrete signals. 1992. In: Shape in Picture: Mathematical Description of Shape in Grey-Level Images: Proc. of Workshop in Driebergen, Netherlands, Sep. 7--11, 1992, Springer, 1992, pp. 571-590. Conference paper (Refereed)

This article shows how a (linear) scale-space representation can be defined for discrete signals of arbitrary dimension. The treatment is based upon the assumptions that (i) the scale-space representation should be defined by convolving the original signal with a one-parameter family of symmetric smoothing kernels possessing a semi-group property, and (ii) local extrema must not be enhanced when the scale parameter is increased continuously.

It is shown that given these requirements the scale-space representation must satisfy the differential equation \partial_t L = A L for some linear and shift invariant operator A satisfying locality, positivity, zero sum, and symmetry conditions. Examples in one, two, and three dimensions illustrate that this corresponds to natural semi-discretizations of the continuous (second-order) diffusion equation using different discrete approximations of the Laplacean operator. In a special case the multi-dimensional representation is given by convolution with the one-dimensional discrete analogue of the Gaussian kernel along each dimension.

• 256.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-space theory. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Refereed)

A theory of multi-scale representation of sensory data developed by the image processing and computer vision communities. The purpose is to represent signals at multiple scales in such a way that fine scale structures are successively suppressed, and a scale parameter t is associated with each level in the multi-scale representation.

For a given signal f : R^N -> R, a linear scale-space representation is a family of derived signals L : R^N x R_+ -> R, defined by L(·; 0) = f and

L(·; t) = g(·; t) * f

for some family g : R^N x R_+ -> R of convolution kernels [a1], [a2] (cf. also Integral equation of convolution type). An essential requirement on the scale-space family L is that the representation at a coarse scale constitutes a simplification of the representations at finer scales. Several different ways of formalizing this requirement about non-creation of new structures with increasing scales show that the Gaussian kernel

g(x; t) = (2 \pi t)^{-N/2} e^{-(x_1^2 + ... + x_N^2)/(2t)}

constitutes a canonical choice for generating a scale-space representation [a3], [a4], [a5], [a6]. Equivalently, the scale-space family satisfies the diffusion equation

\partial_t L = (1/2) \nabla^2 L.

The motivation for generating a scale-space representation of a given data set originates from the basic fact that real-world objects are composed of different structures at different scales and may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a machine vision system analyzing an unknown scene, there is no way to know what scales are appropriate for describing the data. Thus, the only reasonable approach is to consider descriptions at all scales simultaneously [a1], [a2].

From the scale-space representation, at any level of scale one can define scale-space derivatives by

L_{x^α}(·; t) = \partial_{x^α} L(·; t),

where α = (α_1, ..., α_N) and \partial_{x^α} = \partial_{x_1^{α_1}} ... \partial_{x_N^{α_N}} constitute multi-index notation for the derivative operator. Such Gaussian derivative operators provide a compact way to characterize the local image structure around a certain image point at any scale. Specifically, the output from scale-space derivatives can be combined into multi-scale differential invariants, to serve as feature detectors (see Edge detection and Corner detection for two examples).

More generally, a scale-space representation with its Gaussian derivative operators can serve as a basis for expressing a large number of early visual operations, including feature detection, stereo matching, computation of motion descriptors and the computation of cues to surface shape [a3], [a4]. Neuro-physiological studies have shown that there are receptive field profiles in the mammalian retina and visual cortex, which can be well modeled by the scale-space framework [a7].
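As a concrete, hedged illustration of such Gaussian derivative operators (a sketch of ours, not from the encyclopedia entry; the sampled-Gaussian discretization is used purely for simplicity, and all names are hypothetical): sampling the first-order derivative kernel g_x(x; t) = -(x/t) g(x; t) and applying it to a step signal makes the magnitude of the response L_x peak at the edge position, which is the essence of the edge detection mentioned above.

```python
import math

def gaussian_deriv_kernel(t, radius):
    """Sampled first-order Gaussian derivative: g_x(x; t) = -(x/t) * g(x; t),
    with g(x; t) = (2*pi*t)**(-1/2) * exp(-x**2 / (2*t))."""
    g = lambda x: math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
    return [-(x / t) * g(x) for x in range(-radius, radius + 1)]

def correlate(signal, kernel):
    """Sliding inner product, replicating border samples."""
    r, n = len(kernel) // 2, len(signal)
    return [sum(w * signal[min(max(i + k - r, 0), n - 1)]
                for k, w in enumerate(kernel))
            for i in range(n)]

# First-order scale-space derivative of a step edge located at index 32:
step = [0.0] * 32 + [1.0] * 32
Lx = correlate(step, gaussian_deriv_kernel(4.0, 12))
edge = max(range(len(Lx)), key=lambda i: abs(Lx[i]))
```

Locating the edge as the spatial extremum of |L_x| (and, with normalized derivatives, additionally over scale) is one instance of expressing a feature detector directly in terms of scale-space derivatives.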

Pyramid representation [a8] is a predecessor to scale-space representation, constructed by simultaneously smoothing and subsampling a given signal. In this way, computationally highly efficient algorithms can be obtained. A problem noted with pyramid representations, however, is that it is usually algorithmically hard to relate structures at different scales, due to the discrete nature of the scale levels. In a scale-space representation, the existence of a continuous scale parameter makes it conceptually much easier to express this deep structure [a2]. For features defined as zero-crossings of differential invariants, the implicit function theorem (cf. Implicit function) directly defines trajectories across scales, and at those scales where a bifurcation occurs, the local behaviour can be modeled by singularity theory [a3], [a5].

Extensions of linear scale-space theory concern the formulation of non-linear scale-space concepts more committed to specific purposes [a9]. There are strong relations between scale-space theory and wavelet theory (cf. also Wavelet analysis), although these two notions of multi-scale representation have been developed from slightly different premises.

References

[a1] A.P. Witkin, "Scale-space filtering" , Proc. 8th Internat. Joint Conf. Art. Intell. Karlsruhe, West Germany Aug. 1983 (1983) pp. 1019–1022

[a2] J.J. Koenderink, "The structure of images" Biological Cybernetics , 50 (1984) pp. 363–370

[a3] T. Lindeberg, "Scale-space theory in computer vision" , Kluwer Acad. Publ. (1994)

[a4] L.M.J. Florack, "Image structure" , Kluwer Acad. Publ. (1997)

[a5] J. Sporring, et al., "Gaussian scale-space theory" , Kluwer Acad. Publ. (1997)

[a6] B.M. ter Haar Romeny, et al., "Proc. First Internat. Conf. Scale-Space" , Lecture Notes Computer Science , 1252 , Springer (1997)

[a7] R.A. Young, "The Gaussian derivative model for spatial vision: Retinal mechanisms" Spatial Vision , 2 (1987) pp. 273–293

[a8] P.J. Burt, E.H. Adelson, "The Laplacian Pyramid as a Compact Image Code" IEEE Trans. Commun. , 31 : 4 (1983) pp. 532–540

[a9] "Geometry-driven diffusion in computer vision" B.M. ter Haar Romeny (ed.) , Kluwer Acad. Publ. (1994)

• 257.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-Space Theory: A Basic Tool for Analysing Structures at Different Scales. 1994. In: Journal of Applied Statistics, ISSN 0266-4763, E-ISSN 1360-0532, Vol. 21, pp. 225-270. Article in journal (Refereed)

An inherent property of objects in the world is that they only exist as meaningful entities over certain ranges of scale. If one aims at describing the structure of unknown real-world signals, then a multi-scale representation of data is of crucial importance.

This article gives a tutorial review of a special type of multi-scale representation, linear scale-space representation, which has been developed by the computer vision community in order to handle image structures at different scales in a consistent manner. The basic idea is to embed the original signal into a one-parameter family of gradually smoothed signals, in which the fine scale details are successively suppressed.

Under rather general conditions on the type of computations that are to be performed at the first stages of visual processing, in what can be termed the visual front end, it can be shown that the Gaussian kernel and its derivatives are singled out as the only possible smoothing kernels. The conditions that specify the Gaussian kernel are, basically, linearity and shift-invariance combined with different ways of formalizing the notion that structures at coarse scales should correspond to simplifications of corresponding structures at fine scales --- they should not be accidental phenomena created by the smoothing method. Notably, several different ways of choosing scale-space axioms give rise to the same conclusion.

The output from the scale-space representation can be used for a variety of early visual tasks; operations like feature detection, feature classification and shape computation can be expressed directly in terms of (possibly non-linear) combinations of Gaussian derivatives at multiple scales. In this sense, the scale-space representation can serve as a basis for early vision.

During the last few decades a number of other approaches to multi-scale representations have been developed, which are more or less related to scale-space theory, notably the theories of pyramids, wavelets and multi-grid methods. Despite their qualitative differences, the increasing popularity of each of these approaches indicates that the crucial notion of scale is increasingly appreciated by the computer vision community and by researchers in other related fields.

An interesting similarity with biological vision is that the scale-space operators closely resemble receptive field profiles registered in neurophysiological studies of the mammalian retina and visual cortex.

• 258.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-space theory: A framework for handling image structures at multiple scales. 1996. In: Proc. CERN School of Computing, Egmond aan Zee, The Netherlands, 8–21 September 1996, Vol. 96, 8, pp. 27-38. Conference paper (Refereed)

This article gives a tutorial overview of essential components of scale-space theory --- a framework for multi-scale signal representation, which has been developed by the computer vision community to analyse and interpret real-world images by automatic methods.

• 259.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Scale-Space Theory in Computer Vision. 1993. Book (Other academic)

A basic problem when deriving information from measured data, such as images, originates from the fact that objects in the world, and hence image structures, exist as meaningful entities only over certain ranges of scale. "Scale-Space Theory in Computer Vision" describes a formal theory for representing the notion of scale in image data, and shows how this theory applies to essential problems in computer vision such as computation of image features and cues to surface shape. The subjects range from the mathematical foundation to practical computational techniques. The power of the methodology is illustrated by a rich set of examples.

This book is the first monograph on scale-space theory. It is intended as an introduction, reference, and inspiration for researchers, students, and system designers in computer vision as well as related fields such as image processing, photogrammetry, medical image analysis, and signal processing in general.

The presentation starts with a philosophical discussion about computer vision in general. The aim is to put the scope of the book into its wider context, and to emphasize why the notion of scale is crucial when dealing with measured signals, such as image data. An overview of different approaches to multi-scale representation is presented, and a number of special properties of scale-space are pointed out.

Then, it is shown how a mathematical theory can be formulated for describing image structures at different scales. By starting from a set of axioms imposed on the first stages of processing, it is possible to derive a set of canonical operators, which turn out to be derivatives of Gaussian kernels at different scales.

The problem of applying this theory computationally is extensively treated. A scale-space theory is formulated for discrete signals, and it is demonstrated how this representation can be used as a basis for expressing a large number of visual operations. Examples are smoothed derivatives in general, as well as different types of detectors for image features, such as edges, blobs, and junctions. In fact, the resulting scheme for feature detection induced by the presented theory is very simple, both conceptually and in terms of practical implementations.

Typically, an object contains structures at many different scales, but locally it is not unusual that some of these "stand out" and seem to be more significant than others. A problem that we give special attention to concerns how to find such locally stable scales, or rather how to generate hypotheses about interesting structures for further processing. It is shown how the scale-space theory, based on a representation called the scale-space primal sketch, allows us to extract regions of interest from an image without prior information about what the image can be expected to contain. Such regions, combined with knowledge about the scales at which they occur constitute qualitative information, which can be used for guiding and simplifying other low-level processes.

Experiments on different types of real and synthetic images demonstrate how the suggested approach can be used for different visual tasks, such as image segmentation, edge detection, junction detection, and focus-of-attention. This work is complemented by a mathematical treatment showing how the behaviour of different types of image structures in scale-space can be analysed theoretically.

It is also demonstrated how the suggested scale-space framework can be used for computing direct cues to three-dimensional surface structure, using in principle only the same types of visual front-end operations that underlie the computation of image features.

Although the treatment is concerned with the analysis of visual data, the general notion of scale-space representation is of much wider generality and arises in several contexts where measured data are to be analyzed and interpreted automatically.

• 260.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
Separable time-causal and time-recursive spatio-temporal receptive fields. 2015. In: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings / [ed] J.-F. Aujol et al., Springer, 2015, pp. 90-102. Conference paper (Refereed)

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or, equivalently, truncated exponential filters coupled in cascade over the temporal domain. Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about parameterizing the intermediate temporal scale levels, analysing the resulting temporal dynamics and transferring the theory to a discrete implementation in terms of recursive filters over time.
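The cascade of first-order integrators described in this abstract can be sketched in a few lines (our illustration, with hypothetical names, not code from the paper): each stage is the time-recursive update L[k] = L[k-1] + (1/(1 + mu)) * (f[k] - L[k-1]) with time constant mu, and composing stages with different time constants yields successive temporal scale levels while only ever storing the previous output sample.

```python
def first_order_integrator(signal, mu):
    """Time-recursive first-order integrator (truncated exponential filter)
    with time constant mu: only the previous output value is stored."""
    out, state = [], 0.0
    for f in signal:
        state += (f - state) / (1.0 + mu)
        out.append(state)
    return out

def temporal_scale_space(signal, mus):
    """Cascade the integrators, one per time constant; the composed impulse
    response is the resulting time-causal temporal smoothing kernel."""
    for mu in mus:
        signal = first_order_integrator(signal, mu)
    return signal

impulse = [1.0] + [0.0] * 199
response = temporal_scale_space(impulse, [1.0, 2.0, 4.0])  # one coarse temporal level
```

The composed impulse response is non-negative, has unit mass, and peaks at a strictly positive delay, reflecting that a time-causal filter can only respond after the stimulus has occurred.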

• 261.
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
Spatio-temporal scale selection in video data. 2017. In: Scale Space and Variational Methods in Computer Vision, Springer-Verlag Tokyo Inc., 2017, Vol. 10302, pp. 3-15. Conference paper (Refereed)

We present a theory and a method for simultaneous detection of local spatial and temporal scales in video data. The underlying idea is that if we process video data by spatio-temporal receptive fields at multiple spatial and temporal scales, we would like to generate hypotheses about the spatial extent and the temporal duration of the underlying spatio-temporal image structures that gave rise to the feature responses.

For two types of spatio-temporal scale-space representations, (i) a non-causal Gaussian spatio-temporal scale space for offline analysis of pre-recorded video sequences and (ii) a time-causal and time-recursive spatio-temporal scale space for online analysis of real-time video streams, we express sufficient conditions for spatio-temporal feature detectors in terms of spatio-temporal receptive fields to deliver scale covariant and scale invariant feature responses.

A theoretical analysis is given of the scale selection properties of six types of spatio-temporal interest point detectors, showing that five of them allow for provable scale covariance and scale invariance. Then, we describe a time-causal and time-recursive algorithm for detecting sparse spatio-temporal interest points from video streams and show that it leads to intuitively reasonable results.

• 262.
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
Spatio-temporal scale selection in video data. 2018. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 60, no. 4, pp. 525-562. Article in journal (Refereed)

This work presents a theory and methodology for simultaneous detection of local spatial and temporal scales in video data. The underlying idea is that if we process video data by spatio-temporal receptive fields at multiple spatial and temporal scales, we would like to generate hypotheses about the spatial extent and the temporal duration of the underlying spatio-temporal image structures that gave rise to the feature responses.

For two types of spatio-temporal scale-space representations, (i) a non-causal Gaussian spatio-temporal scale space for offline analysis of pre-recorded video sequences and (ii) a time-causal and time-recursive spatio-temporal scale space for online analysis of real-time video streams, we express sufficient conditions for spatio-temporal feature detectors in terms of spatio-temporal receptive fields to deliver scale covariant and scale invariant feature responses.

We present an in-depth theoretical analysis of the scale selection properties of eight types of spatio-temporal interest point detectors in terms of either: (i)-(ii) the spatial Laplacian applied to the first- and second-order temporal derivatives, (iii)-(iv) the determinant of the spatial Hessian applied to the first- and second-order temporal derivatives, (v) the determinant of the spatio-temporal Hessian matrix, (vi) the spatio-temporal Laplacian and (vii)-(viii) the first- and second-order temporal derivatives of the determinant of the spatial Hessian matrix. It is shown that seven of these spatio-temporal feature detectors allow for provable scale covariance and scale invariance. Then, we describe a time-causal and time-recursive algorithm for detecting sparse spatio-temporal interest points from video streams and show that it leads to intuitively reasonable results.

An experimental quantification of the accuracy of the spatio-temporal scale estimates and the amount of temporal delay obtained with these spatio-temporal interest point detectors is given, showing that: (i) the spatial and temporal scale selection properties predicted by the continuous theory are well preserved in the discrete implementation and (ii) the spatial Laplacian or the determinant of the spatial Hessian applied to the first- and second-order temporal derivatives lead to much shorter temporal delays in a time-causal implementation compared to the determinant of the spatio-temporal Hessian or the first- and second-order temporal derivatives of the determinant of the spatial Hessian matrix.

• 263.
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
Temporal scale selection in time-causal scale space. 2017. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 58, no. 1, pp. 57-101. Article in journal (Refereed)

When designing and developing scale selection mechanisms for generating hypotheses about characteristic scales in signals, it is essential that the selected scale levels reflect the extent of the underlying structures in the signal.

This paper presents a theory and in-depth theoretical analysis about the scale selection properties of methods for automatically selecting local temporal scales in time-dependent signals based on local extrema over temporal scales of scale-normalized temporal derivative responses. Specifically, this paper develops a novel theoretical framework for performing such temporal scale selection over a time-causal and time-recursive temporal domain as is necessary when processing continuous video or audio streams in real time or when modelling biological perception.

For a recently developed time-causal and time-recursive scale-space concept defined by convolution with a scale-invariant limit kernel, we show that it is possible to transfer a large number of the desirable scale selection properties that hold for the Gaussian scale-space concept over a non-causal temporal domain to this temporal scale-space concept over a truly time-causal domain. Specifically, we show that for this temporal scale-space concept, it is possible to achieve true temporal scale invariance although the temporal scale levels have to be discrete, which is a novel theoretical construction.

The analysis starts from a detailed comparison of different temporal scale-space concepts and their relative advantages and disadvantages, leading the focus to a class of recently extended time-causal and time-recursive temporal scale-space concepts based on first-order integrators or equivalently truncated exponential kernels coupled in cascade. Specifically, by the discrete nature of the temporal scale levels in this class of time-causal scale-space concepts, we study two special cases of distributing the intermediate temporal scale levels, by using either a uniform distribution in terms of the variance of the composed temporal scale-space kernel or a logarithmic distribution.

In the case of a uniform distribution of the temporal scale levels, we show that scale selection based on local extrema of scale-normalized derivatives over temporal scales makes it possible to estimate the temporal duration of sparse local features defined in terms of temporal extrema of first- or second-order temporal derivative responses. For dense features modelled as a sine wave, the lack of temporal scale invariance does, however, constitute a major limitation for handling dense temporal structures of different temporal duration in a uniform manner.

In the case of a logarithmic distribution of the temporal scale levels, specifically taken to the limit of a time-causal limit kernel with an infinitely dense distribution of the temporal scale levels towards zero temporal scale, we show that it is possible to achieve true temporal scale invariance to handle dense features modelled as a sine wave in a uniform manner over different temporal durations of the temporal structures, as well as to achieve more general temporal scale invariance for any signal under any temporal scaling transformation with a temporal scaling factor that is an integer power of the distribution parameter of the time-causal limit kernel.

It is shown how these temporal scale selection properties developed for a pure temporal domain carry over to feature detectors defined over time-causal spatio-temporal and spectro-temporal domains.

• 264.
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST).
Time-causal and time-recursive receptive fields for invariance and covariance under natural image transformations. 2016. Conference paper (Other academic)

Due to the huge variability of image information under natural image transformations, the receptive field responses of the local image operations that serve as input to higher level visual processes will in general be strongly dependent on the geometric and illumination conditions in the image formation process. To obtain robustness of a vision system, it is natural to require the receptive field families underlying the image operators to be either invariant or covariant under the relevant families of natural image transformations.

This talk presents an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by combining Gaussian receptive fields over the spatial domain with first-order integrators, or equivalently truncated exponential filters, coupled in cascade over the temporal domain. This model inherits the theoretically attractive properties of the Gaussian scale-space model over a spatial domain in terms of (i) invariance or covariance of receptive field responses under scaling transformations and affine transformations over the spatial domain, combined with (ii) non-creation of new image structures from finer to coarser scales. When complemented by velocity adaptation, the receptive field responses can be made (iii) Galilean covariant or invariant, to account for unknown or variable relative motions between objects in the world and the observer. Additionally, when expressed over a logarithmic distribution of the temporal scale levels, this model allows for (iv) scale invariance and self-similarity over the temporal domain while simultaneously being expressed over a time-causal and time-recursive temporal domain, which is a theoretically new type of construction.

We propose this axiomatically derived theory as the natural extension of the Gaussian scale-space paradigm for local image operations from a spatial domain to a time-causal spatio-temporal domain, to be used as a general framework for expressing spatial and spatio-temporal image operators for a computer vision system. The theory leads to (v) predictions about spatial and spatio-temporal receptive fields with good qualitative similarity to biological receptive fields measured by cell recordings in the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex (V1). Specifically, this framework allows for (vi) computationally efficient real-time operations and leads to (vii) much better temporal dynamics (shorter temporal delays) compared to previously formulated time-causal temporal scale-space models.
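The temporal part of the model described above, first-order integrators (truncated exponential filters) coupled in cascade, can be sketched as a time-recursive update that only stores the previous output of each stage. This is a minimal illustration of the stated model; the function names are illustrative:

```python
import numpy as np

def first_order_integrator(signal, mu):
    # Truncated exponential filter run time-recursively:
    # out[t] = out[t-1] + (in[t] - out[t-1]) / (1 + mu)
    out = np.empty(len(signal))
    prev = 0.0
    for i, x in enumerate(signal):
        prev += (x - prev) / (1.0 + mu)
        out[i] = prev
    return out

def temporal_scale_space(signal, mus):
    # Cascade of first-order integrators; each stage only needs its previous
    # output, so the representation is fully time-causal and time-recursive
    levels = [np.asarray(signal, dtype=float)]
    for mu in mus:
        levels.append(first_order_integrator(levels[-1], mu))
    return levels
```

Each stage has unit DC gain and temporal mean mu, so the cascade smooths and delays the signal without ever looking into the future.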

Reference:

Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.

• 265.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Time-causal and time-recursive spatio-temporal receptive fields (2016). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 55, no. 1, pp. 50-88. Journal article (Peer reviewed)

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain.

Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation in terms of recursive filters over time, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.

We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain a new family of temporal scale-space kernels with better temporal characteristics compared to a more traditional approach of using a uniform distribution of the intermediate temporal scale levels. Specifically, the new family of time-causal kernels has much faster temporal response properties (shorter temporal delays) compared to the kernels obtained from a uniform distribution. When increasing the number of temporal scale levels, the temporal scale-space kernels in the new family also converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales. Thereby, the new representation allows for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.

We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.
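A minimal sketch of the logarithmic distribution of temporal scale levels discussed above, together with the per-stage time constants obtained by matching variance increments; this uses the variance relation mu*(1+mu) for a discrete first-order recursive filter, and the distribution parameter c and all names are illustrative assumptions:

```python
import math

def logarithmic_scale_levels(tau_max, K, c=2.0):
    # tau_k = c^(2(k-K)) * tau_max for k = 1..K (logarithmically spaced levels)
    return [c**(2 * (k - K)) * tau_max for k in range(1, K + 1)]

def mu_from_variance_increment(dtau):
    # Time constant mu of a first-order integrator contributing temporal
    # variance mu * (1 + mu) = dtau (positive root of the quadratic)
    return (-1.0 + math.sqrt(1.0 + 4.0 * dtau)) / 2.0

def cascade_time_constants(tau_max, K, c=2.0):
    # Per-stage time constants so that the cascade reaches each tau_k in turn
    taus = logarithmic_scale_levels(tau_max, K, c)
    increments = [taus[0]] + [b - a for a, b in zip(taus, taus[1:])]
    return [mu_from_variance_increment(d) for d in increments]
```

With c = 2, consecutive scale levels differ by a constant factor c^2 = 4, which is the self-similar spacing that makes the discretized representation approach scale invariance.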

• 266.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Time-causal and time-recursive spatio-temporal receptive fields (2015). Report (Other academic)

We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain.

Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation in terms of recursive filters over time, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.

We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain a new family of temporal scale-space kernels with better temporal characteristics compared to a more traditional approach of using a uniform distribution of the intermediate temporal scale levels. Specifically, the new family of time-causal kernels has much faster temporal response properties (shorter temporal delays) compared to the kernels obtained from a uniform distribution. When increasing the number of temporal scale levels, the temporal scale-space kernels in the new family also converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales. Thereby, the new representation allows for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.

We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.

• 267.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
Time-causal and time-recursive spatio-temporal receptive fields for computer vision and computational modelling of biological vision (2016). In: International Workshop on Geometry, PDE's and Lie Groups in Image Analysis, Eindhoven, The Netherlands, August 24-26, 2016. Conference paper (Other academic)

When operating on time-dependent image information in real time, a fundamental constraint originates from the fact that image operations must be both time-causal and time-recursive.

In this talk, we will present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian filters over the spatial domain and first-order integrators, or equivalently truncated exponential filters, coupled in cascade over the temporal domain. This receptive field family obeys scale-space axiomatics in the sense of non-enhancement of local extrema over the spatial domain and non-creation of new local extrema over time for any purely temporal signal, and does in these respects guarantee a theoretically well-founded treatment of spatio-temporal image structures at different spatial and temporal scales.

By a logarithmic distribution of the temporal scale levels in combination with the construction of a time-causal limit kernel based on an infinitely dense distribution of the temporal scale levels towards zero temporal scale, it will be shown that this family allows for temporal scale invariance, although the theory requires the temporal scale levels to be discrete. Additionally, the family obeys basic invariance or covariance properties under other classes of natural image transformations, including spatial scaling transformations, rotations/affine image deformations over the spatial domain, Galilean transformations of space-time and local multiplicative intensity transformations. Thereby, this receptive field family allows for the formulation of multi-scale differential geometric image features with invariance or covariance properties under basic classes of natural image transformations over space-time.

It is shown how this spatio-temporal scale-space concept (i) allows for efficient computation of different types of spatio-temporal features for purposes in computer vision and (ii) leads to predictions about biological receptive fields with good qualitative similarities to the results of cell recordings in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.

References:

T. Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.

T. Lindeberg (2013) "A computational theory of visual receptive fields", Biological Cybernetics, 107(6): 589-635.

T. Lindeberg (2013) "Invariance of visual operations at the level of receptive fields", PLOS One, 8(7): e66990.

T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale space, affine scale space and spatio-temporal scale space", Journal of Mathematical Imaging and Vision, 40(1): 36-81.

• 268.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Time-recursive velocity-adapted spatio-temporal scale-space filters (2001). Report (Other academic)

This paper presents a framework for constructing and computing velocity-adapted scale-space filters for spatio-temporal image data. Starting from basic criteria in terms of time-causality, time-recursivity, locality and adaptivity with respect to motion estimates, a family of spatio-temporal recursive filters is proposed and analysed. An important property of the proposed family of smoothing kernels is that the spatio-temporal covariance matrices of the discrete kernels obey similar transformation properties under Galilean transformations as for continuous smoothing kernels on continuous domains. Moreover, the proposed framework provides an efficient way to compute and generate non-separable scale-space representations without the need for explicit external warping mechanisms or for keeping extended temporal buffers of the past. The approach can thus be seen as a natural extension of recursive scale-space filters from pure temporal data to spatio-temporal domains.

Receptive field profiles generated by the proposed theory show high qualitative similarities to receptive field profiles recorded from biological vision.

• 269.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Time-recursive velocity-adapted spatio-temporal scale-space filters (2002). In: Proc. ECCV'02, Springer Lecture Notes in Computer Science, Vol. 2350, 2002, pp. 52-67. Conference paper (Peer reviewed)

This paper presents a theory for constructing and computing velocity-adapted scale-space filters for spatio-temporal image data. Starting from basic criteria in terms of time-causality, time-recursivity, locality and adaptivity with respect to motion estimates, a family of spatio-temporal recursive filters is proposed and analysed. An important property of the proposed family of smoothing kernels is that the spatio-temporal covariance matrices of the discrete kernels obey similar transformation properties under Galilean transformations as for continuous smoothing kernels on continuous domains. Moreover, the proposed theory provides an efficient way to compute and generate non-separable scale-space representations without the need for explicit external warping mechanisms or for keeping extended temporal buffers of the past. The approach can thus be seen as a natural extension of recursive scale-space filters from pure temporal data to spatio-temporal domains.

• 270.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Galilean-corrected spatio-temporal interest operators (2004). Report (Other academic)

This paper presents a set of image operators for detecting regions in space-time where interesting events occur. To define such regions of interest, we compute a spatio-temporal second-moment matrix from a spatio-temporal scale-space representation, and diagonalize this matrix locally, using a local Galilean transformation in space-time, optionally combined with a spatial rotation, so as to make the Galilean invariant degrees of freedom explicit. From the Galilean-diagonalized descriptor so obtained, we then formulate different types of space-time interest operators, and illustrate their properties on different types of image sequences.

• 271.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA. KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Galilean-diagonalized spatio-temporal interest operators (2004). In: Proc. 17th International Conference on Pattern Recognition (ICPR), 2004, pp. 57-62. Conference paper (Peer reviewed)

This paper presents a set of image operators for detecting regions in space-time where interesting events occur. To define such regions of interest, we compute a spatio-temporal second-moment matrix from a spatio-temporal scale-space representation, and diagonalize this matrix locally, using a local Galilean transformation in space-time, optionally combined with a spatial rotation, so as to make the Galilean invariant degrees of freedom explicit. From the Galilean-diagonalized descriptor so obtained, we then formulate different types of space-time interest operators, and illustrate their properties on different types of image sequences.
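The core computation described in this abstract, a spatio-temporal second-moment matrix whose mixed component can be cancelled by a local Galilean transformation, can be sketched for a 1+1-dimensional (x, t) patch as follows. The uniform window and the function names are simplifying assumptions; the paper computes the descriptor from a spatio-temporal scale-space representation:

```python
import numpy as np

def second_moment_matrix(L):
    # Second-moment matrix over an (x, t) patch:
    # mu = mean of [Lx^2, Lx*Lt; Lx*Lt, Lt^2]
    Lt, Lx = np.gradient(L)          # axis 0 is time, axis 1 is space
    mxx = (Lx * Lx).mean()
    mxt = (Lx * Lt).mean()
    mtt = (Lt * Lt).mean()
    return np.array([[mxx, mxt], [mxt, mtt]])

def galilean_velocity(mu):
    # Velocity of the local Galilean transform that cancels the mixed
    # component: for L(x, t) = f(x - v t) we have Lt = -v Lx, so v = -mu_xt / mu_xx
    return -mu[0, 1] / mu[0, 0]
```

For a purely translating pattern the recovered velocity is the pattern velocity, which is exactly the Galilean degree of freedom that the diagonalization makes explicit.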

• 272.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Förfarande och anordning för överföring av information genom rörelsedetektering, samt användning av anordningen [Method and arrangement for controlling means for three-dimensional transfer of information by motion detection] (1998). Patent (Other (popular science, debate, etc.))

The invention concerns a method and an arrangement for controlling means (24, 26), themselves controlled by processors, for three-dimensional transfer of information by motion detection using an image capturing device (20). Features of an object (10) are detected and transferred to line and point correspondences, which are used for controlling means (22, 26) to perform rotational and translational motion.

• 273.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Real-time scale selection in hybrid multi-scale representations (2003). In: Proc. Scale-Space'03, Springer Lecture Notes in Computer Science, Vol. 2695, Springer Berlin/Heidelberg, 2003, pp. 148-163. Conference paper (Peer reviewed)

Local scale information extracted from visual data in a bottom-up manner constitutes an important cue for a large number of visual tasks. This article presents a framework for how the computation of such scale descriptors can be performed in real time on a standard computer.

The proposed scale selection framework is expressed within a novel type of multi-scale representation, referred to as a hybrid multi-scale representation, which aims at integrating and providing variable trade-offs between the relative advantages of pyramids and scale-space representation, in terms of computational efficiency and computational accuracy. Starting from binomial scale-space kernels of different widths, we describe a family of pyramid representations, in which the regular pyramid concept and the regular scale-space representation constitute limiting cases. In particular, the steepness of the pyramid as well as the sampling density in the scale direction can be varied.

It is shown how the definition of gamma-normalized derivative operators underlying the automatic scale selection mechanism can be transferred from a regular scale-space to a hybrid pyramid, and two alternative definitions are studied in detail, referred to as variance normalization and Lp-normalization. The computational accuracy of these two schemes is evaluated, and it is shown how the choice of sub-sampling rate provides a trade-off between the computational efficiency and the accuracy of the scale descriptors. Experimental evaluations are presented for both synthetic and real data. In a simplified form, this scale selection mechanism has been running for two years in a real-time computer vision system.
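The hybrid construction described above, binomial smoothing kernels combined with a variable sub-sampling rate, can be sketched as follows. The kernel width, number of passes and names are illustrative assumptions, not the paper's parameterization; with subsample=1 the construction approaches a sampled scale-space, and larger sub-sampling rates give a steeper pyramid:

```python
import numpy as np

def binomial_smooth(signal, n=2):
    # n passes of the [1, 2, 1]/4 binomial kernel (variance 1/2 per pass)
    out = np.asarray(signal, dtype=float)
    for _ in range(n):
        out = np.convolve(out, [0.25, 0.5, 0.25], mode='same')
    return out

def hybrid_pyramid(signal, n_levels, smooth_passes=2, subsample=2):
    # subsample controls the trade-off: 1 keeps full resolution (scale-space-like),
    # larger values sub-sample each level (pyramid-like)
    levels = [np.asarray(signal, dtype=float)]
    for _ in range(n_levels):
        s = binomial_smooth(levels[-1], smooth_passes)
        levels.append(s[::subsample] if subsample > 1 else s)
    return levels
```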

• 274.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Analysis of aerosol images using the scale-space primal sketch (1991). In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 4, no. 3, pp. 135-144. Journal article (Peer reviewed)

We outline a method to analyze aerosol images using the scale-space representation. The pictures, which are photographs of an aerosol generated by a fuel injector, contain phenomena that a human observer perceives as periodic or oscillatory structures. The presence of these structures is not immediately apparent, since the periodicity manifests itself at a coarse level of scale, while the dominating objects in the images are small dark blobs, that is, fine-scale objects. Experimentally, we illustrate that scale-space theory provides an objective method to bring out these events. However, in this form the method still relies on a subjective observer to extract and verify the existence of the periodic phenomena. We then extend the analysis with a recently developed image analysis concept called the scale-space primal sketch. With this tool, we are able to extract significant structures from a grey-level image automatically, without any strong a priori assumptions about either the shape or the scale (size) of the primitives. Experiments demonstrate that the periodic drop clusters we perceived in the image are detected by the algorithm as significant image structures. These results provide objective evidence verifying the existence of the oscillatory phenomena.

• 275.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Construction of a Scale-Space Primal Sketch (1990). In: Proceedings of the British Machine Vision Conference 1990: BMVC'90 (Oxford, England), The British Machine Vision Association and Society for Pattern Recognition, 1990, pp. 97-102. Conference paper (Peer reviewed)

We present a multi-scale representation of grey-level shape, called scale-space primal sketch, that makes explicit features in scale-space as well as the relations between features at different levels of scale. The representation gives a qualitative description of the image structure that allows for extraction of significant image structure --- stable scales and regions of interest --- in a solely bottom-up data-driven manner. Hence, it can be seen as preceding further processing, which can then be properly tuned. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

• 276.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
On the Computation of a Scale-Space Primal Sketch (1991). In: Journal of Visual Communication and Image Representation, ISSN 1047-3203, E-ISSN 1095-9076, Vol. 2, no. 1, pp. 55-78. Journal article (Peer reviewed)

Scale-space theory provides a well-founded framework for dealing with image structures that naturally occur at different scales. According to this theory one can from a given signal derive a family of signals by successively removing features when moving from fine to coarse scale. In contrast to other multiscale representations, scale-space is based on a precise mathematical definition of causality, and the behavior of structure as scale changes can be analytically described. However, the information in the scale-space embedding is only implicit. There is no explicit representation of features or the relations between features at different levels of scale. In this paper we present a theory for constructing such an explicit representation on the basis of formal scale-space theory. We treat gray-level images, but the approach is valid for any bounded function, and can therefore be used to derive properties of, e.g., spatial derivatives. Hence it is useful for studying representations based on intensity discontinuities as well. The representation is obtained in a completely data-driven manner, without relying on any specific parameters. It gives a description of the image structure that is rather coarse. However, since significant scales and regions are actually determined from the data, our approach can be seen as preceding further processing, which can then be properly tuned. An important problem in deriving the representation concerns measuring structure in such a way that the significance over scale can be established. This problem and the problem of proper parameterization of scale are given a careful analysis. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

• 277.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Scale detection and region extraction from a scale-space primal sketch (1990). In: Proceedings of the Third International Conference on Computer Vision, IEEE Computer Society, 1990, pp. 416-426. Conference paper (Peer reviewed)

We present: (1) a multi-scale representation of gray-level shape, called a scale-space primal sketch, which makes explicit both features in scale-space and the relations between features at different levels of scale; (2) a theory for extraction of significant image structure from this representation; and (3) applications to edge detection, histogram analysis and junction classification, demonstrating how the proposed method can be used for guiding later-stage processing. The representation gives a qualitative description of the image structure that allows for detection of stable scales and regions of interest in a solely bottom-up, data-driven way. In other words, it generates coarse segmentation cues and can hence be seen as preceding further processing, which can then be properly tuned. We argue that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

• 278.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
The Scale-Space Primal Sketch: Construction and Experiments (1992). In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 10, no. 1, pp. 3-18. Journal article (Peer reviewed)

We present a multi-scale representation of grey-level shape, called the scale-space primal sketch, that makes explicit features in scale-space as well as the relations between features at different levels of scale. The representation gives a qualitative description of the image structure that allows for extraction of significant image structure (stable scales and regions of interest) in a solely bottom-up, data-driven manner. Hence, it can be seen as preceding further processing, which can then be properly tuned. Experiments on real imagery demonstrate that the proposed theory gives intuitively reasonable results.

• 279.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA. KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA. Dept. of Neuroscience, Karolinska Institute.
Automatic matching of brain images and brain atlases using multi-scale fusion algorithms (1997). In: / [ed] L. Friberg, A. Gjedde, S. Holm, N.A. Lassen, and M. Novak, 1997, pp. 419-. Conference paper (Peer reviewed)

This paper presents a method for automatic matching of brain images using automatic scale selection.

• 280.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Scale-space with causal time direction (1996). In: Proc. ECCV'96 (Cambridge, U.K.), Springer Lecture Notes in Computer Science, Vol. 1064, Berlin/Heidelberg: Springer, 1996, pp. 229-240. Conference paper (Peer reviewed)

This article presents a theory for multi-scale representation of temporal data. Assuming that a real-time vision system should represent the incoming data at different time scales, an additional causality constraint arises compared to traditional scale-space theory: we can only use what has occurred in the past for computing representations at coarser time scales. Based on a previously developed scale-space theory in terms of non-creation of local maxima with increasing scale, a complete classification is given of the scale-space kernels that satisfy this property of non-creation of structure and respect the time direction as causal. It is shown that the cases of continuous and discrete time are inherently different.

• 281.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Utrecht University.
Foveal scale-space and the linear increase of receptive field size as a function of eccentricity (1994). Report (Other academic)

This paper addresses the formulation of a foveal scale-space and its relation to the scaling property of receptive field sizes with eccentricity. It is shown how the notion of a fovea can be incorporated into conventional scale-space theory leading to a foveal log-polar scale-space. Natural assumptions about uniform treatment of structures over scales and finite processing capacity imply a linear increase of minimum receptive field size as a function of eccentricity. These assumptions are similar to the ones used for deriving linear scale-space theory and the Gaussian receptive field model for an idealized visual front-end.
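The linear increase of minimum receptive field size with eccentricity stated above can be written compactly as follows; the constants r_0 (foveal minimum receptive field size) and e_0 (crossover eccentricity) are illustrative symbols, not notation taken from the report:

```latex
% Minimal illustrative form of the linear eccentricity scaling:
% r_0 is the minimum receptive field size in the fovea and e_0 a
% crossover eccentricity (both symbols are illustrative assumptions).
r_{\min}(e) = r_0 \left( 1 + \frac{e}{e_0} \right), \qquad e \geq 0 .
```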

• 282.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Utrecht University.
On the decrease of resolution as a function of eccentricity for a foveal vision system (1992). Report (Other academic)
• 283.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Shape from Texture from a Multi-Scale Perspective (1993). In: Proceedings of the Fourth International Conference on Computer Vision: ICCV'93 / [ed] H.-H. Nagel, IEEE conference proceedings, 1993, pp. 683-691. Conference paper (Peer reviewed)

The problem of scale in shape from texture is addressed. The need for (at least) two scale parameters is emphasized: a local scale, describing the amount of smoothing used for suppressing noise and irrelevant details when computing primitive texture descriptors from image data, and an integration scale, describing the size of the region in space over which the statistics of the local descriptors are accumulated.

A novel mechanism for automatic scale selection is used, based on normalized derivatives. It is used for adaptive determination of the two scale parameters in a multi-scale texture descriptor, the windowed second-moment matrix, which is defined in terms of Gaussian smoothing, first-order derivatives, and non-linear pointwise combinations of these. The same scale-selection method can be used for multi-scale blob detection without any tuning parameters or thresholding.

The resulting texture description can be combined with various assumptions about surface texture in order to estimate local surface orientation. Two specific assumptions, "weak isotropy" and "constant area", are explored in more detail. Experiments on real and synthetic reference data with known geometry demonstrate the viability of the approach.
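The two scale parameters described above can be made concrete in a sketch of the windowed second-moment matrix: derivatives are computed at the local scale and their products are averaged at the integration scale. The truncated Gaussian, the fixed scale values and the function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def gaussian_blur_2d(img, t):
    # Separable Gaussian smoothing with variance t (crude truncated kernel)
    s = max(np.sqrt(t), 1e-6)
    r = max(int(3 * s), 1)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2.0 * t)); g /= g.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, img)
    return np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, out)

def windowed_second_moment(img, t_local, t_int):
    # Derivatives at local scale t_local, averaged at integration scale t_int:
    # mu = g(t_int) * [Lx^2, Lx*Ly; Lx*Ly, Ly^2]
    L = gaussian_blur_2d(img, t_local)
    Ly, Lx = np.gradient(L)
    mxx = gaussian_blur_2d(Lx * Lx, t_int)
    mxy = gaussian_blur_2d(Lx * Ly, t_int)
    myy = gaussian_blur_2d(Ly * Ly, t_int)
    return mxx, mxy, myy
```

For a texture of vertical stripes, the mass of the descriptor concentrates in the mu_xx component, reflecting the dominant texture orientation.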

• 284.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
Shape-adapted smoothing in estimation of 3-D depth cues from affine distortions of local 2-D brightness structure (1997). In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 15, no. 6, pp. 415-434. Journal article (Peer reviewed)

This article describes a method for reducing the shape distortions due to scale-space smoothing that arise in the computation of 3-D shape cues using operators (derivatives) defined from scale-space representation. More precisely, we are concerned with a general class of methods for deriving 3-D shape cues from 2-D image data based on the estimation of locally linearized deformations of brightness patterns. This class constitutes a common framework for describing several problems in computer vision (such as shape-from-texture, shape-from-disparity-gradients, and motion estimation) and for expressing different algorithms in terms of similar types of visual front-end operations. It is explained how surface orientation estimates will be biased due to the use of rotationally symmetric smoothing in the image domain. These effects can be reduced by extending the linear scale-space concept into an affine Gaussian scale-space representation and by performing affine shape adaptation of the smoothing kernels. This improves the accuracy of the surface orientation estimates, since the image descriptors, on which the methods are based, will be relative invariant under affine transformations, and the error is thus confined to the higher-order terms in the locally linearized perspective transformation. A straightforward algorithm is presented for performing shape adaptation in practice. Experiments on real and synthetic images with known orientation demonstrate that, in the presence of moderately high noise levels, the accuracy is improved by typically one order of magnitude.

• 285.
KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Shape-Adapted Smoothing in Estimation of 3-D Depth Cues from Affine Distortions of Local 2-D Brightness Structure (1994). In: Computer Vision - ECCV '94: Third European Conference on Computer Vision, Stockholm, Sweden, May 2-6, 1994, Proceedings, Volume I, 1994, pp. 389-400. Conference paper (Peer reviewed)

Rotationally symmetric operations in the image domain may give rise to shape distortions. This article describes a way of reducing this effect for a general class of methods for deriving 3-D shape cues from 2-D image data, which are based on the estimation of locally linearized distortion of brightness patterns. By extending the linear scale-space concept into an affine scale-space representation and performing affine shape adaptation of the smoothing kernels, the accuracy of surface orientation estimates derived from texture and disparity cues can be improved by typically one order of magnitude. The reason for this is that the image descriptors, on which the methods are based, will be relative invariant under affine transformations, and the error will thus be confined to the higher-order terms in the locally linearized perspective mapping.

• 286.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Automatic generation of break points for MDL based curve classification1995Ingår i: Scandinavian Conference on Image Analysis: SCIA'95 / [ed] G. Borgefors, 1995, s. 767-776Konferensbidrag (Refereegranskat)

This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multi-scale pre-processing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either "straight" or "curved". Experiments on real world image data demonstrate the viability of the approach.

• 287.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Segmentation and classification of edges using minimum description length approximation and complementary junction cues1997Ingår i: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 67, nr 1, s. 88-98Artikel i tidskrift (Refereegranskat)

This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multiscale preprocessing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either “straight” or “curved.” Experiments on real world image data demonstrate the viability of the approach.
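The straight-versus-curved decision can be sketched as a two-part code-length comparison: a few parameter bits for the model against residual bits for the data given the model. The coding constants and the quadratic curve model below are illustrative assumptions, not the paper's exact MDL approximation:

```python
import numpy as np

def description_length(rss, n, n_params, resolution=1e-3):
    # two-part code: bits for the model parameters + bits for the residuals
    param_bits = 0.5 * n_params * np.log2(n)
    data_bits = 0.5 * n * np.log2(max(rss / n, resolution ** 2))
    return param_bits + data_bits

def classify_segment(x, y):
    """Label an edge segment "straight" or "curved" by comparing the MDL
    cost of a line fit against that of a quadratic fit. The model with the
    shorter total description wins; extra parameters must pay for
    themselves through a better fit."""
    costs = {}
    for label, deg in (("straight", 1), ("curved", 2)):
        coeffs = np.polyfit(x, y, deg)
        rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
        costs[label] = description_length(rss, len(x), deg + 1)
    return min(costs, key=costs.get)
```

A near-collinear segment is labeled "straight" because the quadratic's extra parameter buys no residual reduction; a genuinely bent segment pays that parameter cost gladly.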

• 288.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Segmentation and classification of edges using minimum description length approximation and complementary junction cues1995Ingår i: Theory and Applications of Image Analysis II: Selected Papers from the 9th Scandinavian Conference on Image Analysis, Uppsala, Sweden, 1995 / [ed] Gunilla Borgefors, World Scientific, 1995Kapitel i bok, del av antologi (Refereegranskat)

This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multi-scale pre-processing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either "straight" or "curved". Experiments on real world image data demonstrate the viability of the approach.

• 289.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Karolinska Institutet.
Analysis of Brain Activation Patterns Using A 3-D Scale-Space Primal Sketch1997Ingår i: HBM'97, published in Neuroimage, volume 5, number 4, 1997, s. 393-393Konferensbidrag (Refereegranskat)

This paper presents a method for automatically determining the spatial extent and the significance of rCBF changes.

• 290.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Analysis of brain activation patterns using a 3-D scale-space primal sketch1999Ingår i: Human Brain Mapping, ISSN 1065-9471, E-ISSN 1097-0193, Vol. 7, nr 3, s. 166-94Artikel i tidskrift (Refereegranskat)

A fundamental problem in brain imaging concerns how to define functional areas consisting of neurons that are activated together as populations. We propose that this issue can be ideally addressed by a computer vision tool referred to as the scale-space primal sketch. This concept has the attractive properties that it allows for automatic and simultaneous extraction of the spatial extent and the significance of regions with locally high activity. In addition, a hierarchical nested tree structure of activated regions and subregions is obtained. The subject of this article is to show how the scale-space primal sketch can be used for automatic determination of the spatial extent and the significance of rCBF changes. Experiments show the result of applying this approach to functional PET data, including a preliminary comparison with two more traditional clustering techniques. Compared to previous approaches, the method overcomes the limitations of performing the analysis at a single scale or assuming specific models of the data.
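The core mechanism, extracting both the position and the spatial extent of a blob by maximizing a scale-normalized response over scales, can be sketched in one dimension (a toy stand-in for the 3-D primal sketch; the significance measure here is simply the normalized response strength):

```python
import numpy as np

def gaussian_smooth(signal, t):
    """Discrete Gaussian smoothing with variance t, truncated at 4 std."""
    r = max(1, int(4 * np.sqrt(t)))
    xs = np.arange(-r, r + 1)
    g = np.exp(-xs ** 2 / (2.0 * t))
    g /= g.sum()
    return np.convolve(signal, g, mode="same")

def strongest_blob(signal, scales):
    """Return (scale, position, response) of the strongest blob over all
    scales, measured by the scale-normalized second derivative t * |L_xx|.
    The selected scale reflects the blob's spatial extent and the
    normalized response its significance."""
    best = (None, None, -np.inf)
    for t in scales:
        L = gaussian_smooth(signal, t)
        Lxx = np.gradient(np.gradient(L))
        resp = t * np.abs(Lxx)
        i = int(np.argmax(resp))
        if resp[i] > best[2]:
            best = (t, i, float(resp[i]))
    return best
```

For a Gaussian-shaped activation bump, the maximum over scales lands at a scale proportional to the bump's own extent, which is what lets the method avoid committing to a single analysis scale.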

• 291.
KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
Utrecht University.
Linear Scale-Space I: Basic Theory1994Ingår i: Geometry-Driven Diffusion in Computer Vision, Kluwer Academic Publishers, 1994, s. 1-41Kapitel i bok, del av antologi (Övrigt vetenskapligt)

Vision deals with the problem of deriving information about the world from the light reflected from it. Although the active and task-oriented nature of vision is only implicit in this formulation, this view captures several of the essential aspects of vision. As Marr (1982) phrased it in his book Vision, vision is an information processing task, in which an internal representation of information is of utmost importance. Only by representation can information be captured and made available to decision processes. The purpose of a representation is to make certain aspects of the information content explicit, that is, immediately accessible without any need for additional processing.

This introductory chapter deals with a fundamental aspect of early image representation---the notion of scale. As Koenderink (1984) emphasizes, the problem of scale must be faced in any imaging situation. An inherent property of objects in the world and details in images is that they only exist as meaningful entities over certain ranges of scale. A simple example of this is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or the kilometer level. At those scales it is more relevant to talk about the molecules that form the leaves of the tree, or the forest in which the tree grows. Consequently, a multi-scale representation is of crucial importance if one aims at describing the structure of the world, or more specifically the structure of projections of the three-dimensional world onto two-dimensional images.

The need for multi-scale representation is well understood, for example, in cartography; maps are produced at different degrees of abstraction. A map of the world contains the largest countries and islands, and possibly, some of the major cities, whereas towns and smaller islands appear at first in a map of a country. In a city guide, the level of abstraction is changed considerably to include streets and buildings etc. In other words, maps constitute symbolic multi-scale representations of the world around us, although constructed manually and with very specific purposes in mind.

To compute any type of representation from image data, it is necessary to extract information, and hence interact with the data using certain operators. Some of the most fundamental problems in low-level vision and image analysis concern: what operators to use, where to apply them, and how large they should be. If these problems are not appropriately addressed, the task of interpreting the output results can be very hard. Ultimately, the task of extracting information from real image data is severely influenced by the inherent measurement problem that real-world structures, in contrast to certain ideal mathematical entities, such as "points" or "lines", appear in different ways depending upon the scale of observation.

Phrasing the problem in this way shows the intimate relation to physics. Any physical observation by necessity has to be done through some finite aperture, and the result will, in general, depend on the aperture of observation. This holds for any device that registers physical entities from the real world, including a vision system based on brightness data. Whereas constant size aperture functions may be sufficient in many (controlled) physical applications, e.g., fixed measurement devices, and also the aperture functions of the basic sensors in a camera (or retina) may have to be determined a priori because of practical design constraints, it is far from clear that registering data at a fixed level of resolution is sufficient. A vision system for handling objects of different sizes and at different distances needs a way to control the scale(s) at which the world is observed.

The goal of this chapter is to review some fundamental results concerning a framework known as scale-space that has been developed by the computer vision community for controlling the scale of observation and representing the multi-scale nature of image data. Starting from a set of basic constraints (axioms) on the first stages of visual processing, it will be shown that under reasonable conditions it is possible to substantially restrict the class of possible operations and to derive a (unique) set of weighting profiles for the aperture functions. In fact, the operators that are obtained bear qualitative similarities to receptive fields at the very earliest stages of (human) visual processing (Koenderink 1992). We shall mainly be concerned with the operations that are performed directly on raw image data by the processing modules that are collectively termed the visual front-end. The purpose of this processing is to register the information on the retina, and to make important aspects of it explicit that are to be used in later-stage processes. If the operations are to be local, they have to preserve the topology at the retina; for this reason the processing can be termed retinotopic processing.

Early visual operations

An obvious problem concerns what information should be extracted and what computations should be performed at these levels. Is any type of operation feasible? An axiomatic approach that has been adopted in order to restrict the space of possibilities is to assume that the very first stages of visual processing should be able to function without any direct knowledge about what can be expected to be in the scene. As a consequence, the first stages of visual processing should be as uncommitted and make as few irreversible decisions or choices as possible.

The Euclidean nature of the world around us and the perspective mapping onto images impose natural constraints on a visual system. Objects move rigidly, the illumination varies, the size of objects at the retina changes with the depth from the eye, view directions may change etc. Hence, it is natural to require early visual operations to be unaffected by certain primitive transformations (e.g. translations, rotations, and grey-scale transformations). In other words, the visual system should extract properties that are invariant with respect to these transformations.

As we shall see below, these constraints lead to operations that correspond to spatio-temporal derivatives, which are then used for computing (differential) geometric descriptions of the incoming data flow. Based on the output of these operations, in turn, a large number of feature detectors can be expressed, as well as modules for computing surface shape.
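The derivative operations referred to above combine smoothing and differentiation into a single linear convolution with Gaussian-derivative kernels. A minimal sketch (sampled, truncated kernels; derivative orders up to 2 per axis, an illustrative restriction):

```python
import numpy as np

def _sep_conv(img, kx, ky):
    # separable 2-D convolution: rows with kx, then columns with ky
    out = np.apply_along_axis(lambda r: np.convolve(r, kx, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, ky, mode="same"), 0, out)

def gaussian_derivative(img, t, dx, dy):
    """Gaussian derivative operator at scale t (variance): smoothing and
    spatial differentiation in one linear operation, i.e. the canonical
    front-end operators discussed in the text."""
    r = max(1, int(4 * np.sqrt(t)))
    xs = np.arange(-r, r + 1)
    g = np.exp(-xs ** 2 / (2.0 * t))
    g /= g.sum()
    kernels = {
        0: g,                                     # pure smoothing
        1: (-xs / t) * g,                         # d/dx of the Gaussian
        2: ((xs ** 2 / t ** 2) - 1.0 / t) * g,    # d^2/dx^2 of the Gaussian
    }
    return _sep_conv(img, kernels[dx], kernels[dy])
```

From these responses, feature detectors follow directly; for instance, the gradient magnitude sqrt(Lx^2 + Ly^2) gives an edge-strength measure at scale t.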

The subject of this chapter is to present a tutorial overview of the historical and current insights of linear scale-space theories as a paradigm for describing the structure of scalar images and as a basis for early vision. For other introductory texts on scale-space, see the monographs by Lindeberg (1991, 1994) and Florack (1993) as well as the overview articles by ter Haar Romeny and Florack (1993) and Lindeberg (1994).

• 292.
KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
Utrecht University,.
Linear Scale-Space II: Early visual operations1994Ingår i: Geometry-Driven Diffusion in Vision, Kluwer Academic Publishers, 1994, s. 43-77Kapitel i bok, del av antologi (Övrigt vetenskapligt)

Vision deals with the problem of deriving information about the world from the light reflected from it. Although the active and task-oriented nature of vision is only implicit in this formulation, this view captures several of the essential aspects of vision. As Marr (1982) phrased it in his book Vision, vision is an information processing task, in which an internal representation of information is of utmost importance. Only by representation can information be captured and made available to decision processes. The purpose of a representation is to make certain aspects of the information content explicit, that is, immediately accessible without any need for additional processing.

This introductory chapter deals with a fundamental aspect of early image representation---the notion of scale. As Koenderink (1984) emphasizes, the problem of scale must be faced in any imaging situation. An inherent property of objects in the world and details in images is that they only exist as meaningful entities over certain ranges of scale. A simple example of this is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or the kilometer level. At those scales it is more relevant to talk about the molecules that form the leaves of the tree, or the forest in which the tree grows. Consequently, a multi-scale representation is of crucial importance if one aims at describing the structure of the world, or more specifically the structure of projections of the three-dimensional world onto two-dimensional images.

The need for multi-scale representation is well understood, for example, in cartography; maps are produced at different degrees of abstraction. A map of the world contains the largest countries and islands, and possibly, some of the major cities, whereas towns and smaller islands appear at first in a map of a country. In a city guide, the level of abstraction is changed considerably to include streets and buildings etc. In other words, maps constitute symbolic multi-scale representations of the world around us, although constructed manually and with very specific purposes in mind.

To compute any type of representation from image data, it is necessary to extract information, and hence interact with the data using certain operators. Some of the most fundamental problems in low-level vision and image analysis concern: what operators to use, where to apply them, and how large they should be. If these problems are not appropriately addressed, the task of interpreting the output results can be very hard. Ultimately, the task of extracting information from real image data is severely influenced by the inherent measurement problem that real-world structures, in contrast to certain ideal mathematical entities, such as "points" or "lines", appear in different ways depending upon the scale of observation.

Phrasing the problem in this way shows the intimate relation to physics. Any physical observation by necessity has to be done through some finite aperture, and the result will, in general, depend on the aperture of observation. This holds for any device that registers physical entities from the real world, including a vision system based on brightness data. Whereas constant size aperture functions may be sufficient in many (controlled) physical applications, e.g., fixed measurement devices, and also the aperture functions of the basic sensors in a camera (or retina) may have to be determined a priori because of practical design constraints, it is far from clear that registering data at a fixed level of resolution is sufficient. A vision system for handling objects of different sizes and at different distances needs a way to control the scale(s) at which the world is observed.

The goal of this chapter is to review some fundamental results concerning a framework known as scale-space that has been developed by the computer vision community for controlling the scale of observation and representing the multi-scale nature of image data. Starting from a set of basic constraints (axioms) on the first stages of visual processing, it will be shown that under reasonable conditions it is possible to substantially restrict the class of possible operations and to derive a (unique) set of weighting profiles for the aperture functions. In fact, the operators that are obtained bear qualitative similarities to receptive fields at the very earliest stages of (human) visual processing (Koenderink 1992). We shall mainly be concerned with the operations that are performed directly on raw image data by the processing modules that are collectively termed the visual front-end. The purpose of this processing is to register the information on the retina, and to make important aspects of it explicit that are to be used in later-stage processes. If the operations are to be local, they have to preserve the topology at the retina; for this reason the processing can be termed retinotopic processing.

Early visual operations

An obvious problem concerns what information should be extracted and what computations should be performed at these levels. Is any type of operation feasible? An axiomatic approach that has been adopted in order to restrict the space of possibilities is to assume that the very first stages of visual processing should be able to function without any direct knowledge about what can be expected to be in the scene. As a consequence, the first stages of visual processing should be as uncommitted and make as few irreversible decisions or choices as possible.

The Euclidean nature of the world around us and the perspective mapping onto images impose natural constraints on a visual system. Objects move rigidly, the illumination varies, the size of objects at the retina changes with the depth from the eye, view directions may change etc. Hence, it is natural to require early visual operations to be unaffected by certain primitive transformations (e.g. translations, rotations, and grey-scale transformations). In other words, the visual system should extract properties that are invariant with respect to these transformations.

As we shall see below, these constraints lead to operations that correspond to spatio-temporal derivatives, which are then used for computing (differential) geometric descriptions of the incoming data flow. Based on the output of these operations, in turn, a large number of feature detectors can be expressed, as well as modules for computing surface shape.

The subject of this chapter is to present a tutorial overview of the historical and current insights of linear scale-space theories as a paradigm for describing the structure of scalar images and as a basis for early vision. For other introductory texts on scale-space, see the monographs by Lindeberg (1991, 1994) and Florack (1993) as well as the overview articles by ter Haar Romeny and Florack (1993) and Lindeberg (1994).

• 293.
KTH, Skolan för elektro- och systemteknik (EES), Teknisk informationsvetenskap.
KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling (Stängd 130101). KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsteori.
Video coding using multi-reference motion-adaptive transforms based on graphs2016Ingår i: 2016 IEEE 12th Image, Video, and Multidimensional Signal Processing Workshop, IVMSP 2016, IEEE, 2016Konferensbidrag (Refereegranskat)

The purpose of this work is to produce jointly coded frames for efficient video coding. We use motion-adaptive transforms in the temporal domain to generate the temporal subbands. The motion information is used to form graphs for transform construction. In our previous work, the motion-adaptive transform allows only one reference pixel to be the lowband coefficient. In this paper, we extend the motion-adaptive transform such that it permits multiple references and produces multiple lowband coefficients, which can be used in the case of bidirectional or multihypothesis motion estimation. The multi-reference motion-adaptive transform (MRMAT) is always orthonormal; thus, the energy is preserved by the transform. We compare MRMAT and the motion-compensated orthogonal transform (MCOT) [1], while HEVC intra coding is used to encode the temporal subbands. The experimental results show that MRMAT outperforms MCOT by about 0.6 dB.
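The orthonormality property can be illustrated with a toy construction in the spirit of MRMAT (not the paper's actual transform): combine several motion-aligned reference pixels and the current pixel through a Householder reflection whose first output is a weighted lowband coefficient, so energy is preserved exactly:

```python
import numpy as np

def multi_reference_transform(refs, cur, weights):
    """Orthonormal transform of k motion-aligned reference pixels plus one
    current pixel. The first output coefficient is a lowband given by the
    normalized weights (e.g. bidirectional prediction weights); the rest
    are highband details. Built from a Householder reflection, so the
    matrix is exactly orthonormal and hence energy preserving."""
    x = np.append(np.asarray(refs, dtype=float), float(cur))
    w = np.append(np.asarray(weights, dtype=float), 1.0)
    w /= np.linalg.norm(w)
    e0 = np.zeros_like(w)
    e0[0] = 1.0
    v = w - e0                       # Householder vector mapping w -> e0
    H = np.eye(len(w)) - 2.0 * np.outer(v, v) / (v @ v)
    return H @ x                     # coefficients[0] == w . x (lowband)
```

Because H is orthonormal, the sum of squared transform coefficients equals the sum of squared input pixel values, mirroring the energy-preservation claim in the abstract.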

• 294.
KTH, Skolan för industriell teknik och management (ITM), Industriell ekonomi och organisation (Inst.), Industriell Management.
KTH, Skolan för informations- och kommunikationsteknik (ICT), Elektroniksystem.
Design of evaluation platform of machine vision for portable wireless terminals2011Konferensbidrag (Refereegranskat)

An evaluation platform for machine vision algorithms is designed in this paper. The platform is constructed with a DM6437 DSP processor and image input-output circuit modules. An image processing algorithm used for machine vision can be executed on the platform. With a DFG model of the algorithm, the algorithm architecture can be built conveniently for programming and analysis. As an example, an image segmentation algorithm has been modeled and executed on the platform. The results show that the platform is useful for algorithm analysis and can serve as a design reference for comparison with other implementation systems.

• 295. Loianno, G.
KTH, Skolan för elektro- och systemteknik (EES), Reglerteknik. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
Visual and inertial multi-rate data fusion for motion estimation via Pareto-optimization2013Ingår i: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE , 2013, s. 3993-3999Konferensbidrag (Refereegranskat)

Motion estimation is an open research field in control and robotic applications. Sensor fusion algorithms are generally used to achieve an accurate estimate of the vehicle motion by combining heterogeneous sensor measurements with different statistical characteristics. In this paper, a new method that combines measurements provided by an inertial sensor and a vision system is presented. Compared to classical model-based techniques, the method relies on a Pareto optimization that trades off the statistical properties of the measurements. The proposed technique is evaluated in simulation in terms of computational requirements and estimation accuracy with respect to a classical Kalman filter approach. It is shown that the proposed method gives improved estimation accuracy at the cost of slightly increased computational complexity.
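The trade-off between sensors with different statistical properties can be illustrated with the classical variance-weighted fusion rule, one point on a trade-off curve between the two measurement variances (a simplified stand-in for the paper's Pareto formulation; names and setup are illustrative):

```python
def fuse_measurements(z_vis, var_vis, z_imu, var_imu):
    """Fuse a vision-based and an inertial measurement of the same motion
    quantity with the weight w that minimizes the fused variance
    w^2 * var_vis + (1 - w)^2 * var_imu. The less noisy sensor receives
    the larger weight, and the fused variance is below both inputs."""
    w = var_imu / (var_vis + var_imu)
    fused = w * z_vis + (1.0 - w) * z_imu
    fused_var = var_vis * var_imu / (var_vis + var_imu)
    return fused, fused_var
```

Sweeping the weight instead of fixing it at this minimum traces the full trade-off curve between trusting the vision system and trusting the inertial sensor.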

• 296. Lu, G.
KTH, Skolan för datavetenskap och kommunikation (CSC), Medieteknik och interaktionsdesign, MID. Nanjing University of Posts and Telecommunications.
Convolutional neural network for facial expression recognition2016Ingår i: Journal of Nanjing University of Posts and Telecommunications, ISSN 1673-5439, Vol. 36, nr 1, s. 16-22Artikel i tidskrift (Refereegranskat)

To avoid the complex explicit feature extraction process in traditional expression recognition, a convolutional neural network (CNN) for facial expression recognition is proposed. Firstly, the facial expression image is normalized and the implicit features are extracted by using the trainable convolution kernel. Then, maximum pooling is used to reduce the dimensions of the extracted implicit features. Finally, the Softmax classifier is used to classify the facial expressions of the test samples. The experiment is carried out on the CK+ facial expression database using the graphics processing unit (GPU). Experimental results show the performance and the generalization ability of the CNN for facial expression recognition.
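The pipeline described (trainable convolution kernel, max pooling, softmax classifier) can be sketched as a single forward pass. This is a one-kernel, one-layer illustration with placeholder weights and a ReLU nonlinearity added as a common choice; training is omitted:

```python
import numpy as np

def conv2d_valid(img, kernel):
    # direct 'valid' 2-D correlation with a single kernel
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2(x):
    # 2x2 max pooling, dropping any odd border row/column
    H2, W2 = x.shape[0] // 2, x.shape[1] // 2
    return x[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cnn_forward(img, kernel, W_cls):
    """Convolution -> pooling -> softmax classification of one image.
    kernel and W_cls are hypothetical placeholders for learned weights."""
    feat = np.maximum(conv2d_valid(img, kernel), 0.0)   # ReLU feature map
    return softmax(W_cls @ maxpool2(feat).ravel())
```

The output is a probability distribution over the expression classes; the predicted expression is its argmax.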

• 297.
KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS. KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
The use of robots in harsh and unstructured field applications2005Ingår i: 2005 IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN), NEW YORK, NY: IEEE , 2005, s. 143-150Konferensbidrag (Refereegranskat)

Robots have the potential to be a significant aid in high-risk, unstructured, and stressful situations such as those experienced by police, fire brigades, rescue workers, and the military. In this project we have explored the abilities of today's robot technology in the mentioned fields. This was done by studying the user, identifying scenarios where a robot could be used, and implementing a robot system for these cases. We have concluded that highly portable field robots are emerging as an available technology, but that human-robot interaction is currently a major limiting factor of today's systems. Further, we have found that operational protocols, stating how to use the robots, have to be designed in order to make robots an effective tool in harsh and unstructured field environments.

• 298. Lundberg, I.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Intrinsic camera and hand-eye calibration for a robot vision system using a point marker2015Ingår i: IEEE-RAS International Conference on Humanoid Robots, IEEE Computer Society, 2015, s. 59-66Konferensbidrag (Refereegranskat)

Accurate robot camera calibration is a requirement for vision-guided robots to perform precision assembly tasks. In this paper, we address the problem of performing intrinsic camera and hand-eye calibration on a robot vision system using a single point marker. This removes the need for bulky special-purpose calibration objects, and also facilitates on-line accuracy checking and re-calibration when needed, without altering the robot's production environment. The proposed solution provides a calibration routine that produces high-quality results on par with the robot accuracy and completes a calibration in 3 minutes without the need for manual intervention. We also present a method for automatic testing of camera calibration accuracy. Results from experimental verification on the dual-arm concept robot FRIDA are presented.

• 299.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Representing actions with Kernels2011Ingår i: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011, s. 2028-2035Konferensbidrag (Refereegranskat)

A long-standing research goal is to create robots capable of interacting with humans in dynamic environments. To realise this, a robot needs to understand and interpret the underlying meaning and intentions of a human action through a model of its sensory data. The visual domain provides a rich description of the environment, and data is readily available in most systems through inexpensive cameras. However, such data is very high-dimensional and extremely redundant, making modeling challenging. Recently there has been significant interest in semantic modeling from visual stimuli. Even though results are encouraging, available methods are unable to perform robustly in real-world scenarios. In this work we present a system for action modeling from visual data by proposing a new and principled interpretation for representing semantic information. The representation is integrated with real-time segmentation. The method is robust and flexible, making it applicable for modeling in a realistic interaction scenario, which demands handling noisy observations and requires real-time performance. We provide extensive evaluation and show significant improvements compared to the state-of-the-art.

• 300.
KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
Animal Recognition Using Joint Visual Vocabulary2009Självständigt arbete på avancerad nivå (magisterexamen), 20 poäng / 30 hpStudentuppsats (Examensarbete)

This thesis presents a series of experiments on recognizing animals in complex scenes. Unlike the objects usually used for recognition tasks (cars, airplanes, ...), animals appear in a variety of poses and shapes in outdoor images. To perform this task, a dataset of outdoor images is needed. Among the available datasets there are some animal classes, but as discussed in this thesis these datasets do not capture the variations needed for realistic analysis. To overcome this problem, a new extensive dataset, KTH-animals, containing realistic images of animals in complex natural environments, is introduced. Methods designed on the other datasets do not perform well on the animals dataset due to its larger variations. One method that showed promising results on one of these datasets was applied to KTH-animals, and it failed to encode the large variations in this dataset.

To familiarize the reader with the concepts of computer vision and the mathematical background, a chapter of this thesis is dedicated to this matter. This section presents a brief review of texture descriptors and several classification methods, together with the mathematical and statistical algorithms they require.

To analyze the images of the dataset, two different methodologies are introduced in this thesis. In the first methodology, fuzzy classifiers, we analyze the images solely based on the skin texture of the animals. To do so, an accurate manual segmentation of the images is provided. Here the skin texture is judged using many different features, and the results are combined with each other using fuzzy classifiers. Since the assumption of neglecting the background information is unrealistic, the joint visual vocabularies are introduced.

Joint visual vocabularies is a method for visual object categorization based on encoding the joint textural information in objects and the surrounding background, requiring no segmentation during recognition. The framework can be used together with various learning techniques and model representations. Here we use this framework with simple probabilistic models and with more complex representations obtained using Support Vector Machines. We show that our approach provides good recognition performance for complex problems on which some existing methods have difficulties.

The achievements of this thesis are: a challenging database for animal recognition; a review of previous work and the related mathematical background; texture feature evaluation on the KTH-animals dataset; introduction of a method for object recognition based on joint statistics over the image; and the application of model representations of different complexity within the same classification framework, from simple probabilistic models to more complex ones based on Support Vector Machines.
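The joint-vocabulary idea, encoding object and background statistics together without segmenting at recognition time, can be sketched as a bag-of-visual-words histogram over two concatenated codebooks. The function and codebooks below are illustrative assumptions; in practice the codebooks would be learned by clustering training patches:

```python
import numpy as np

def joint_bow_histogram(descriptors, object_codebook, background_codebook):
    """Quantize an image's local texture descriptors against the
    concatenation of an object (animal) vocabulary and a background
    vocabulary: each descriptor votes for its nearest codeword, with no
    segmentation needed, and the normalized histogram is the image
    representation fed to a classifier."""
    codebook = np.vstack([object_codebook, background_codebook])
    # squared Euclidean distance from every descriptor to every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    hist = np.bincount(d2.argmin(axis=1), minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The resulting histogram can be used directly with a simple probabilistic model or as the input feature vector for a Support Vector Machine.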
