Results 251 - 300 of 467
  • 251.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Scale-space theory: A framework for handling image structures at multiple scales (1996). In: Proc. CERN School of Computing, Egmond aan Zee, The Netherlands, 8–21 September 1996, Vol. 96, No. 8, pp. 27-38. Conference paper (Refereed)
    Abstract [en]

    This article gives a tutorial overview of essential components of scale-space theory --- a framework for multi-scale signal representation, which has been developed by the computer vision community to analyse and interpret real-world images by automatic methods.
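
    The multi-scale representation outlined above is straightforward to realize in code. A minimal sketch, assuming NumPy and SciPy are available (the test image and the choice of scale levels are placeholders):

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def gaussian_scale_space(image, sigmas):
            """Stack of smoothed copies of `image`, one layer per scale sigma."""
            image = image.astype(np.float64)
            return np.stack([gaussian_filter(image, sigma) for sigma in sigmas])

        # Example: five scale levels with geometrically spaced sigmas.
        img = np.random.default_rng(0).random((128, 128))
        stack = gaussian_scale_space(img, [1.0, 2.0, 4.0, 8.0, 16.0])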

  • 252.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Scale-Space Theory in Computer Vision (1993). Book (Other academic)
    Abstract [en]

    A basic problem when deriving information from measured data, such as images, originates from the fact that objects in the world, and hence image structures, exist as meaningful entities only over certain ranges of scale. "Scale-Space Theory in Computer Vision" describes a formal theory for representing the notion of scale in image data, and shows how this theory applies to essential problems in computer vision such as computation of image features and cues to surface shape. The subjects range from the mathematical foundation to practical computational techniques. The power of the methodology is illustrated by a rich set of examples.

    This book is the first monograph on scale-space theory. It is intended as an introduction, reference, and inspiration for researchers, students, and system designers in computer vision as well as related fields such as image processing, photogrammetry, medical image analysis, and signal processing in general.

    The presentation starts with a philosophical discussion about computer vision in general. The aim is to put the scope of the book into its wider context, and to emphasize why the notion of scale is crucial when dealing with measured signals, such as image data. An overview of different approaches to multi-scale representation is presented, and a number of special properties of scale-space are pointed out.

    Then, it is shown how a mathematical theory can be formulated for describing image structures at different scales. By starting from a set of axioms imposed on the first stages of processing, it is possible to derive a set of canonical operators, which turn out to be derivatives of Gaussian kernels at different scales.

    The problem of applying this theory computationally is extensively treated. A scale-space theory is formulated for discrete signals, and it is demonstrated how this representation can be used as a basis for expressing a large number of visual operations. Examples are smoothed derivatives in general, as well as different types of detectors for image features, such as edges, blobs, and junctions. In fact, the resulting scheme for feature detection induced by the presented theory is very simple, both conceptually and in terms of practical implementations.

    Typically, an object contains structures at many different scales, but locally it is not unusual that some of these "stand out" and seem to be more significant than others. A problem that we give special attention to concerns how to find such locally stable scales, or rather how to generate hypotheses about interesting structures for further processing. It is shown how the scale-space theory, based on a representation called the scale-space primal sketch, allows us to extract regions of interest from an image without prior information about what the image can be expected to contain. Such regions, combined with knowledge about the scales at which they occur, constitute qualitative information, which can be used for guiding and simplifying other low-level processes.

    Experiments on different types of real and synthetic images demonstrate how the suggested approach can be used for different visual tasks, such as image segmentation, edge detection, junction detection, and focus-of-attention. This work is complemented by a mathematical treatment showing how the behaviour of different types of image structures in scale-space can be analysed theoretically.

    It is also demonstrated how the suggested scale-space framework can be used for computing direct cues to three-dimensional surface structure, using in principle only the same types of visual front-end operations that underlie the computation of image features.

    Although the treatment is concerned with the analysis of visual data, the general notion of scale-space representation is of much wider generality and arises in several contexts where measured data are to be analyzed and interpreted automatically.
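
    As one concrete instance of the feature-detection scheme induced by the theory, the following hedged sketch detects blobs as local extrema, over space and scale, of the scale-normalized Laplacian of Gaussian; the normalization by sigma**2 follows the general scale-selection principle, while the neighbourhood size and threshold are illustrative choices:

        import numpy as np
        from scipy.ndimage import gaussian_laplace, maximum_filter

        def blob_candidates(image, sigmas, threshold):
            """Blob candidates as local extrema of the scale-normalized
            Laplacian-of-Gaussian; multiplying by sigma**2 makes the response
            magnitudes comparable across scales."""
            image = image.astype(np.float64)
            responses = np.stack([s**2 * gaussian_laplace(image, s) for s in sigmas])
            mag = np.abs(responses)
            peaks = mag == maximum_filter(mag, size=3)     # 3x3x3 neighbourhood in (scale, y, x)
            return np.argwhere(peaks & (mag > threshold))  # rows: (scale index, y, x)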

  • 253.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Separable time-causal and time-recursive spatio-temporal receptive fields (2015). In: Scale Space and Variational Methods in Computer Vision: 5th International Conference, SSVM 2015, Lège-Cap Ferret, France, May 31 - June 4, 2015, Proceedings / [ed] J.-F. Aujol et al., Springer, 2015, pp. 90-102. Conference paper (Refereed)
    Abstract [en]

    We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about parameterizing the intermediate temporal scale levels, analysing the resulting temporal dynamics and transferring the theory to a discrete implementation in terms of recursive filters over time.
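
    The temporal part of this construction can be sketched directly. A minimal, hedged implementation of first-order integrators (truncated exponential filters) coupled in cascade, assuming a regularly sampled signal and zero initial state; the time constants below are placeholders rather than the paper's parameterization of the intermediate scale levels:

        import numpy as np

        def first_order_integrator(signal, mu):
            """One time-recursive stage, y[t] = y[t-1] + (x[t] - y[t-1]) / (1 + mu),
            a standard discretization of a first-order integrator with time constant mu."""
            y = np.zeros(len(signal))
            alpha, prev = 1.0 / (1.0 + mu), 0.0
            for t, x in enumerate(signal):
                prev = prev + alpha * (x - prev)
                y[t] = prev
            return y

        def integrator_cascade(signal, mus=(1.0, 2.0, 4.0)):
            """Cascade of such stages: time-causal (only past samples are accessed)
            and time-recursive (only the previous output is stored per stage)."""
            out = np.asarray(signal, dtype=np.float64)
            for mu in mus:
                out = first_order_integrator(out, mu)
            return out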

  • 254.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Spatio-temporal scale selection in video data (2017). In: Scale Space and Variational Methods in Computer Vision, Springer-Verlag Tokyo Inc., 2017, Vol. 10302, pp. 3-15. Conference paper (Refereed)
    Abstract [en]

    We present a theory and a method for simultaneous detection of local spatial and temporal scales in video data. The underlying idea is that if we process video data by spatio-temporal receptive fields at multiple spatial and temporal scales, we would like to generate hypotheses about the spatial extent and the temporal duration of the underlying spatio-temporal image structures that gave rise to the feature responses.

    For two types of spatio-temporal scale-space representations, (i) a non-causal Gaussian spatio-temporal scale space for offline analysis of pre-recorded video sequences and (ii) a time-causal and time-recursive spatio-temporal scale space for online analysis of real-time video streams, we express sufficient conditions for spatio-temporal feature detectors in terms of spatio-temporal receptive fields to deliver scale covariant and scale invariant feature responses.

    A theoretical analysis is given of the scale selection properties of six types of spatio-temporal interest point detectors, showing that five of them allow for provable scale covariance and scale invariance. Then, we describe a time-causal and time-recursive algorithm for detecting sparse spatio-temporal interest points from video streams and show that it leads to intuitively reasonable results.

  • 255.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Spatio-temporal scale selection in video data (2018). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 60, No. 4, pp. 525-562. Journal article (Refereed)
    Abstract [en]

    This work presents a theory and methodology for simultaneous detection of local spatial and temporal scales in video data. The underlying idea is that if we process video data by spatio-temporal receptive fields at multiple spatial and temporal scales, we would like to generate hypotheses about the spatial extent and the temporal duration of the underlying spatio-temporal image structures that gave rise to the feature responses.

    For two types of spatio-temporal scale-space representations, (i) a non-causal Gaussian spatio-temporal scale space for offline analysis of pre-recorded video sequences and (ii) a time-causal and time-recursive spatio-temporal scale space for online analysis of real-time video streams, we express sufficient conditions for spatio-temporal feature detectors in terms of spatio-temporal receptive fields to deliver scale covariant and scale invariant feature responses.

    We present an in-depth theoretical analysis of the scale selection properties of eight types of spatio-temporal interest point detectors in terms of either: (i)-(ii) the spatial Laplacian applied to the first- and second-order temporal derivatives, (iii)-(iv) the determinant of the spatial Hessian applied to the first- and second-order temporal derivatives, (v) the determinant of the spatio-temporal Hessian matrix, (vi) the spatio-temporal Laplacian and (vii)-(viii) the first- and second-order temporal derivatives of the determinant of the spatial Hessian matrix. It is shown that seven of these spatio-temporal feature detectors allow for provable scale covariance and scale invariance. Then, we describe a time-causal and time-recursive algorithm for detecting sparse spatio-temporal interest points from video streams and show that it leads to intuitively reasonable results.

    An experimental quantification of the accuracy of the spatio-temporal scale estimates and the amount of temporal delay obtained for these spatio-temporal interest point detectors is given, showing that: (i) the spatial and temporal scale selection properties predicted by the continuous theory are well preserved in the discrete implementation and (ii) the spatial Laplacian or the determinant of the spatial Hessian applied to the first- and second-order temporal derivatives lead to much shorter temporal delays in a time-causal implementation compared to the determinant of the spatio-temporal Hessian or the first- and second-order temporal derivatives of the determinant of the spatial Hessian matrix.
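
    To make one of these detectors concrete, the following hedged sketch computes detector type (i), the spatial Laplacian applied to the first-order temporal derivative, in the non-causal Gaussian spatio-temporal scale-space (case (i) above) for brevity; the time-causal variant is not reproduced, and the gamma = 1 scale-normalization powers used below are an assumption rather than the paper's exact choices:

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def laplacian_of_Lt(video, s, tau):
            """Scale-normalized spatial Laplacian of the first-order temporal
            derivative; `video` has shape (T, H, W), and s and tau are the
            spatial and temporal scale parameters (variances)."""
            L = gaussian_filter(video.astype(np.float64),
                                sigma=(np.sqrt(tau), np.sqrt(s), np.sqrt(s)))
            Lt = np.gradient(L, axis=0)
            Ltyy = np.gradient(np.gradient(Lt, axis=1), axis=1)
            Ltxx = np.gradient(np.gradient(Lt, axis=2), axis=2)
            return s * np.sqrt(tau) * (Ltxx + Ltyy)   # gamma = 1 normalization (assumed)

    Interest points would then be taken as local extrema of this response over space, time and both scale parameters.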

  • 256.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Temporal scale selection in time-causal scale space (2017). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 58, No. 1, pp. 57-101. Journal article (Refereed)
    Abstract [en]

    When designing and developing scale selection mechanisms for generating hypotheses about characteristic scales in signals, it is essential that the selected scale levels reflect the extent of the underlying structures in the signal.

    This paper presents a theory and in-depth theoretical analysis about the scale selection properties of methods for automatically selecting local temporal scales in time-dependent signals based on local extrema over temporal scales of scale-normalized temporal derivative responses. Specifically, this paper develops a novel theoretical framework for performing such temporal scale selection over a time-causal and time-recursive temporal domain as is necessary when processing continuous video or audio streams in real time or when modelling biological perception.

    For a recently developed time-causal and time-recursive scale-space concept defined by convolution with a scale-invariant limit kernel, we show that it is possible to transfer a large number of the desirable scale selection properties that hold for the Gaussian scale-space concept over a non-causal temporal domain to this temporal scale-space concept over a truly time-causal domain. Specifically, we show that for this temporal scale-space concept, it is possible to achieve true temporal scale invariance although the temporal scale levels have to be discrete, which is a novel theoretical construction.

    The analysis starts from a detailed comparison of different temporal scale-space concepts and their relative advantages and disadvantages, leading the focus to a class of recently extended time-causal and time-recursive temporal scale-space concepts based on first-order integrators or equivalently truncated exponential kernels coupled in cascade. Specifically, by the discrete nature of the temporal scale levels in this class of time-causal scale-space concepts, we study two special cases of distributing the intermediate temporal scale levels, by using either a uniform distribution in terms of the variance of the composed temporal scale-space kernel or a logarithmic distribution.

    In the case of a uniform distribution of the temporal scale levels, we show that scale selection based on local extrema of scale-normalized derivatives over temporal scales makes it possible to estimate the temporal duration of sparse local features defined in terms of temporal extrema of first- or second-order temporal derivative responses. For dense features modelled as a sine wave, the lack of temporal scale invariance does, however, constitute a major limitation for handling dense temporal structures of different temporal duration in a uniform manner.

    In the case of a logarithmic distribution of the temporal scale levels, specifically taken to the limit of a time-causal limit kernel with an infinitely dense distribution of the temporal scale levels towards zero temporal scale, we show that it is possible to achieve true temporal scale invariance to handle dense features modelled as a sine wave in a uniform manner over different temporal durations of the temporal structures, as well as to achieve more general temporal scale invariance for any signal over any temporal scaling transformation with a temporal scaling factor that is an integer power of the distribution parameter of the time-causal limit kernel.

    It is shown how these temporal scale selection properties developed for a pure temporal domain carry over to feature detectors defined over time-causal spatio-temporal and spectro-temporal domains.
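
    The logarithmic distribution of temporal scale levels discussed above can be sketched as follows. Assuming truncated exponential filters coupled in cascade, whose variances add under convolution, each increment between successive scale levels fixes the time constant of one stage; tau_min, tau_max and the distribution parameter c are placeholders:

        import numpy as np

        def log_distributed_time_constants(tau_min, tau_max, c):
            """Temporal scale levels tau_k = tau_min * c**(2k) up to tau_max.
            A truncated exponential filter with time constant mu has variance
            mu**2, and variances add in cascade, so stage k gets
            mu_k = sqrt(tau_k - tau_{k-1}) (with tau_{-1} = 0)."""
            taus = [tau_min]
            while taus[-1] * c**2 <= tau_max:
                taus.append(taus[-1] * c**2)
            taus = np.asarray(taus)
            mus = np.sqrt(np.diff(np.concatenate(([0.0], taus))))
            return taus, mus

        taus, mus = log_distributed_time_constants(1.0, 64.0, c=np.sqrt(2.0))

    Feeding these time constants into a cascade of first-order integrators (as sketched under entry 253 above) gives a time-causal, time-recursive temporal scale-space with logarithmically distributed scale levels.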

  • 257.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Time-causal and time-recursive receptive fields for invariance and covariance under natural image transformations (2016). Conference paper (Other academic)
    Abstract [en]

    Due to the huge variability of image information under natural image transformations, the receptive field responses of the local image operations that serve as input to higher level visual processes will in general be strongly dependent on the geometric and illumination conditions in the image formation process. To obtain robustness of a vision system, it is natural to require the receptive field families underlying the image operators to be either invariant or covariant under the relevant families of natural image transformations.

    This talk presents an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. This model inherits the theoretically attractive properties of the Gaussian scale-space model over a spatial domain in terms of (i) invariance or covariance of receptive field responses under scaling transformations and affine transformations over the spatial domain combined with (ii) non-creation of new image structures from finer to coarser scales. When complemented by velocity adaptation, the receptive field responses can be made (iii) Galilean covariant or invariant to account for unknown or variable relative motions between objects in the world and the observer. Additionally, when expressed over a logarithmic distribution of the temporal scale levels, this model allows for (iv) scale invariance and self-similarity over the temporal domain while simultaneously expressed over a time-causal and time-recursive temporal domain, which is a theoretically new type of construction.

    We propose this axiomatically derived theory as the natural extension of the Gaussian scale-space paradigm for local image operations from a spatial domain to a time-causal spatio-temporal domain, to be used as a general framework for expressing spatial and spatio-temporal image operators for a computer vision system. The theory leads to (v) predictions about spatial and spatio-temporal receptive fields with good qualitative similarity to biological receptive fields measured by cell recordings in the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex (V1). Specifically, this framework allows for (vi) computationally efficient real-time operations and leads to (vii) much better temporal dynamics (shorter temporal delays) compared to previously formulated time-causal temporal scale-space models.

    Reference:

    Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.

  • 258.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Time-causal and time-recursive spatio-temporal receptive fields (2016). In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 55, No. 1, pp. 50-88. Journal article (Refereed)
    Abstract [en]

    We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. 

    Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation in terms of recursive filters over time, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision. 

    We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain a new family of temporal scale-space kernels with better temporal characteristics compared to a more traditional approach of using a uniform distribution of the intermediate temporal scale levels. Specifically, the new family of time-causal kernels has much faster temporal response properties (shorter temporal delays) compared to the kernels obtained from a uniform distribution. When increasing the number of temporal scale levels, the temporal scale-space kernels in the new family also converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales. Thereby, the new representation allows for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.

    We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.

  • 259.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Time-causal and time-recursive spatio-temporal receptive fields (2015). Report (Other academic)
    Abstract [en]

    We present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian receptive fields over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. 

    Compared to previous spatio-temporal scale-space formulations in terms of non-enhancement of local extrema or scale invariance, these receptive fields are based on different scale-space axiomatics over time by ensuring non-creation of new local extrema or zero-crossings with increasing temporal scale. Specifically, extensions are presented about (i) parameterizing the intermediate temporal scale levels, (ii) analysing the resulting temporal dynamics, (iii) transferring the theory to a discrete implementation in terms of recursive filters over time, (iv) computing scale-normalized spatio-temporal derivative expressions for spatio-temporal feature detection and (v) computational modelling of receptive fields in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision. 

    We show that by distributing the intermediate temporal scale levels according to a logarithmic distribution, we obtain a new family of temporal scale-space kernels with better temporal characteristics compared to a more traditional approach of using a uniform distribution of the intermediate temporal scale levels. Specifically, the new family of time-causal kernels has much faster temporal response properties (shorter temporal delays) compared to the kernels obtained from a uniform distribution. When increasing the number of temporal scale levels, the temporal scale-space kernels in the new family also converge very rapidly to a limit kernel possessing true self-similar scale-invariant properties over temporal scales. Thereby, the new representation allows for true scale invariance over variations in the temporal scale, although the underlying temporal scale-space representation is based on a discretized temporal scale parameter.

    We show how scale-normalized temporal derivatives can be defined for these time-causal scale-space kernels and how the composed theory can be used for computing basic types of scale-normalized spatio-temporal derivative expressions in a computationally efficient manner.

  • 260.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Time-causal and time-recursive spatio-temporal receptive fields for computer vision and computational modelling of biological vision (2016). In: International Workshop on Geometry, PDE’s and Lie Groups in Image Analysis, Eindhoven, The Netherlands, August 24-26, 2016. Conference paper (Other academic)
    Abstract [en]

    When operating on time-dependent image information in real time, a fundamental constraint originates from the fact that image operations must be both time-causal and time-recursive.

    In this talk, we will present an improved model and theory for time-causal and time-recursive spatio-temporal receptive fields, obtained by a combination of Gaussian filters over the spatial domain and first-order integrators or equivalently truncated exponential filters coupled in cascade over the temporal domain. This receptive field family obeys scale-space axiomatics in the sense of non-enhancement of local extrema over the spatial domain and non-creation of new local extrema over time for any purely temporal signal and does in these respects guarantee theoretically well-founded treatment of spatio-temporal image structures at different spatial and temporal scales.

    By a logarithmic distribution of the temporal scale levels, in combination with the construction of a time-causal limit kernel based on an infinitely dense distribution of the temporal scale levels towards zero temporal scale, it will be shown that this family allows for temporal scale invariance even though the theory requires the temporal scale levels to be discrete. Additionally, the family obeys basic invariance or covariance properties under other classes of natural image transformations including spatial scaling transformations, rotations/affine image deformations over the spatial domain, Galilean transformations of space-time and local multiplicative intensity transformations. Thereby, this receptive field family allows for the formulation of multi-scale differential geometric image features with invariance or covariance properties under basic classes of natural image transformations over space-time.

    It is shown how this spatio-temporal scale-space concept (i) allows for efficient computation of different types of spatio-temporal features for purposes in computer vision and (ii) leads to predictions about biological receptive fields with good qualitative similarities to the results of cell recordings in the lateral geniculate nucleus (LGN) and the primary visual cortex (V1) in biological vision.

    References:

    T. Lindeberg (2016) "Time-causal and time-recursive spatio-temporal receptive fields", Journal of Mathematical Imaging and Vision, 55(1): 50-88.

    T. Lindeberg (2013) "A computational theory of visual receptive fields", Biological Cybernetics, 107(6): 589–635.

    T. Lindeberg (2013) "Invariance of visual operations at the level of receptive fields", PLOS One, 8(7): e66990.

    T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale space, affine scale space and spatio-temporal scale space", Journal of Mathematical Imaging and Vision, 40(1): 36–81.

  • 261.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Time-recursive velocity-adapted spatio-temporal scale-space filters (2001). Report (Other academic)
    Abstract [en]

    This paper presents a framework for constructing and computing velocity-adapted scale-space filters for spatio-temporal image data. Starting from basic criteria in terms of time-causality, time-recursivity, locality and adaptivity with respect to motion estimates, a family of spatio-temporal recursive filters is proposed and analysed. An important property of the proposed family of smoothing kernels is that the spatio-temporal covariance matrices of the discrete kernels obey similar transformation properties under Galilean transformations as for continuous smoothing kernels on continuous domains. Moreover, the proposed framework provides an efficient way to compute and generate non-separable scale-space representations without need for explicit external warping mechanisms or keeping extended temporal buffers of the past. The approach can thus be seen as a natural extension of recursive scale-space filters from pure temporal data to spatio-temporal domains.

    Receptive field profiles generated by the proposed theory show high qualitative similarities to receptive field profiles recorded from biological vision.

  • 262.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Time-recursive velocity-adapted spatio-temporal scale-space filters (2002). In: ECCV'02, published in Springer Lecture Notes in Computer Science, Vol. 2350, 2002, pp. 52-67. Conference paper (Refereed)
    Abstract [en]

    This paper presents a theory for constructing and computing velocity-adapted scale-space filters for spatio-temporal image data. Starting from basic criteria in terms of time-causality, time-recursivity, locality and adaptivity with respect to motion estimates, a family of spatio-temporal recursive filters is proposed and analysed. An important property of the proposed family of smoothing kernels is that the spatio-temporal covariance matrices of the discrete kernels obey similar transformation properties under Galilean transformations as for continuous smoothing kernels on continuous domains. Moreover, the proposed theory provides an efficient way to compute and generate nonseparable scale-space representations without need for explicit external warping mechanisms or keeping extended temporal buffers of the past. The approach can thus be seen as a natural extension of recursive scale-space filters from pure temporal data to spatio-temporal domains.

  • 263.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Akbarzadeh, A.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Galilean-corrected spatio-temporal interest operators (2004). Report (Other academic)
    Abstract [en]

    This paper presents a set of image operators for detecting regions in space-time where interesting events occur. To define such regions of interest, we compute a spatio-temporal second-moment matrix from a spatio-temporal scale-space representation, and diagonalize this matrix locally, using a local Galilean transformation in space-time, optionally combined with a spatial rotation, so as to make the Galilean invariant degrees of freedom explicit. From the Galilean-diagonalized descriptor so obtained, we then formulate different types of space-time interest operators, and illustrate their properties on different types of image sequences.

  • 264.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Akbarzadeh, Amir
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Galilean-diagonalized spatio-temporal interest operators (2004). In: Proc. 17th International Conference on Pattern Recognition (ICPR), 2004, pp. 57-62. Conference paper (Refereed)
    Abstract [en]

    This paper presents a set of image operators for detecting regions in space-time where interesting events occur. To define such regions of interest, we compute a spatio-temporal second-moment matrix from a spatio-temporal scale-space representation, and diagonalize this matrix locally, using a local Galilean transformation in space-time, optionally combined with a spatial rotation, so as to make the Galilean invariant degrees of freedom explicit. From the Galilean-diagonalized descriptor so obtained, we then formulate different types of space-time interest operators, and illustrate their properties on different types of image sequences.
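
    A hedged sketch of the core computation described above: the spatio-temporal second-moment matrix from isotropically smoothed video, and the local velocity that cancels its mixed space-time components, which is the essence of the Galilean diagonalization (the scale parameters are placeholders, and an invertible spatial block is assumed):

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def second_moment_matrix(video, local_sigma, integration_sigma):
            """3x3 spatio-temporal second-moment matrix field: outer products of
            the scale-space gradient (axis order t, y, x), averaged over a
            Gaussian integration window."""
            L = gaussian_filter(video.astype(np.float64), local_sigma)
            g = np.gradient(L)                       # [Lt, Ly, Lx]
            mu = np.empty(video.shape + (3, 3))
            for i in range(3):
                for j in range(3):
                    mu[..., i, j] = gaussian_filter(g[i] * g[j], integration_sigma)
            return mu

        def galilean_velocity(mu_point):
            """Velocity (vy, vx) that locally cancels the mixed space-time
            components of one 3x3 second-moment matrix, making the Galilean
            invariant degrees of freedom explicit."""
            A = mu_point[1:, 1:]     # spatial 2x2 block
            b = mu_point[1:, 0]      # mixed space-time components
            return -np.linalg.solve(A, b)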

  • 265.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Bretzner, Lars
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Förfarande och anordning för överföring av information genom rörelsedetektering, samt användning av anordningen [Method and arrangement for controlling means for three-dimensional transfer of information by motion detection] (1998). Patent (Other (popular science, debate, etc.))
    Abstract [en]

    The invention concerns a method and an arrangement for controlling means (24, 26), themselves controlled by processors, for three-dimensional transfer of information by motion detection using an image capturing device (20). Features of an object (10) are detected and transferred to line and point correspondences, which are used for controlling means (22, 26) to perform rotational and translational motion.

  • 266.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Bretzner, Lars
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Real-time scale selection in hybrid multi-scale representations (2003). In: Proc. Scale-Space’03, Springer Berlin/Heidelberg, 2003, Vol. 2695, pp. 148-163. Conference paper (Refereed)
    Abstract [en]

    Local scale information extracted from visual data in a bottom-up manner constitutes an important cue for a large number of visual tasks. This article presents a framework for how the computation of such scale descriptors can be performed in real time on a standard computer.

    The proposed scale selection framework is expressed within a novel type of multi-scale representation, referred to as hybrid multi-scale representation, which aims at integrating and providing variable trade-offs between the relative advantages of pyramids and scale-space representation, in terms of computational efficiency and computational accuracy. Starting from binomial scale-space kernels of different widths, we describe a family of pyramid representations, in which the regular pyramid concept and the regular scale-space representation constitute limiting cases. In particular, the steepness of the pyramid as well as the sampling density in the scale direction can be varied.

    It is shown how the definition of gamma-normalized derivative operators underlying the automatic scale selection mechanism can be transferred from a regular scale-space to a hybrid pyramid, and two alternative definitions are studied in detail, referred to as variance normalization and ℓp-normalization. The computational accuracy of these two schemes is evaluated, and it is shown how the choice of sub-sampling rate provides a trade-off between the computational efficiency and the accuracy of the scale descriptors. Experimental evaluations are presented for both synthetic and real data. In a simplified form, this scale selection mechanism has been running for two years in a real-time computer vision system.
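
    A hedged sketch of a single layer transition in such a hybrid representation, assuming the three-tap binomial kernel [1, 2, 1]/4 as the smoothing primitive; the parameter names are illustrative. Varying `smoothing_steps` and `subsample` moves the construction between the pyramid-like and scale-space-like limiting cases:

        import numpy as np
        from scipy.ndimage import convolve1d

        BINOMIAL_3 = np.array([1.0, 2.0, 1.0]) / 4.0   # variance 1/2 per pass and axis

        def hybrid_pyramid_step(image, smoothing_steps, subsample):
            """One level-to-level transition: repeated separable binomial
            smoothing followed by subsampling. subsample=2 with few smoothing
            steps approaches a regular pyramid; subsample=1 with many steps
            approaches a (sampled) regular scale-space."""
            out = image.astype(np.float64)
            for _ in range(smoothing_steps):
                out = convolve1d(out, BINOMIAL_3, axis=0)
                out = convolve1d(out, BINOMIAL_3, axis=1)
            return out[::subsample, ::subsample]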

  • 267.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eklundh, Jan-Olof
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Analysis of aerosol images using the scale-space primal sketch (1991). In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 4, No. 3, pp. 135-144. Journal article (Refereed)
    Abstract [en]

    We outline a method to analyze aerosol images using the scale-space representation. The pictures, which are photographs of an aerosol generated by a fuel injector, contain phenomena that by a human observer are perceived as periodic or oscillatory structures. The presence of these structures is not immediately apparent since the periodicity manifests itself at a coarse level of scale while the dominating objects in the images are small dark blobs, that is, fine scale objects. Experimentally, we illustrate that the scale-space theory provides an objective method to bring out these events. However, in this form the method still relies on a subjective observer in order to extract and verify the existence of the periodic phenomena. Then we extend the analysis by adding a recently developed image analysis concept called the scale-space primal sketch. With this tool, we are able to extract significant structures from a grey-level image automatically without any strong a priori assumptions about either the shape or the scale (size) of the primitives. Experiments demonstrate that the periodic drop clusters we perceived in the image are detected by the algorithm as significant image structures. These results provide objective evidence verifying the existence of oscillatory phenomena.

  • 268.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eklundh, Jan-Olof
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Construction of a Scale-Space Primal Sketch (1990). In: Proceedings of the British Machine Vision Conference 1990: BMVC'90 (Oxford, England), The British Machine Vision Association and Society for Pattern Recognition, 1990, pp. 97-102. Conference paper (Refereed)
    Abstract [en]

    We present a multi-scale representation of grey-level shape, called scale-space primal sketch, that makes explicit features in scale-space as well as the relations between features at different levels of scale. The representation gives a qualitative description of the image structure that allows for extraction of significant image structure --- stable scales and regions of interest --- in a solely bottom-up data-driven manner. Hence, it can be seen as preceding further processing, which can then be properly tuned. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

  • 269.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eklundh, Jan-Olof
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    On the Computation of a Scale-Space Primal Sketch (1991). In: Journal of Visual Communication and Image Representation, ISSN 1047-3203, E-ISSN 1095-9076, Vol. 2, No. 1, pp. 55-78. Journal article (Refereed)
    Abstract [en]

    Scale-space theory provides a well-founded framework for dealing with image structures that naturally occur at different scales. According to this theory one can from a given signal derive a family of signals by successively removing features when moving from fine to coarse scale. In contrast to other multiscale representations, scale-space is based on a precise mathematical definition of causality, and the behavior of structure as scale changes can be analytically described. However, the information in the scale-space embedding is only implicit. There is no explicit representation of features or the relations between features at different levels of scale. In this paper we present a theory for constructing such an explicit representation on the basis of formal scale-space theory. We treat gray-level images, but the approach is valid for any bounded function, and can therefore be used to derive properties of, e.g., spatial derivatives. Hence it is useful for studying representations based on intensity discontinuities as well. The representation is obtained in a completely data-driven manner, without relying on any specific parameters. It gives a description of the image structure that is rather coarse. However, since significant scales and regions are actually determined from the data, our approach can be seen as preceding further processing, which can then be properly tuned. An important problem in deriving the representation concerns measuring structure in such a way that the significance over scale can be established. This problem and the problem of proper parameterization of scale are given a careful analysis. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

  • 270.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eklundh, Jan-Olof
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Scale detection and region extraction from a scale-space primal sketch (1990). In: Proceedings, Third International Conference on Computer Vision, 1990, IEEE Computer Society, pp. 416-426. Conference paper (Refereed)
    Abstract [en]

    We present: (1) a multi-scale representation of gray-level shape, called a scale-space primal sketch, which makes explicit both features in scale-space and the relations between features at different levels of scale; (2) a theory for extraction of significant image structure from this representation; and (3) applications to edge detection, histogram analysis and junction classification demonstrating how the proposed method can be used for guiding later stage processing. The representation gives a qualitative description of the image structure that allows for detection of stable scales and regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues and can hence be seen as preceding further processing, which can then be properly tuned. We argue that once such information is available many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives perceptually intuitive results.

  • 271.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eklundh, Jan-Olof
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    The Scale-Space Primal Sketch: Construction and Experiments (1992). In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 10, No. 1, pp. 3-18. Journal article (Refereed)
    Abstract [en]

    We present a multi-scale representation of grey-level shape, called the scale-space primal sketch, that makes explicit features in scale-space as well as the relations between features at different levels of scale. The representation gives a qualitative description of the image structure that allows for extraction of significant image structure — stable scales and regions of interest — in a solely bottom-up data-driven manner. Hence, it can be seen as preceding further processing, which can then be properly tuned. Experiments on real imagery demonstrate that the proposed theory gives intuitively reasonable results.

  • 272.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Eriksson, Björn
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Johansson, Fredrik
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Roland, Per
    Dept. of Neuroscience, Karolinska Institute.
    Automatic matching of brain images and brain atlases using multi-scale fusion algorithms (1997). In: [ed] L. Friberg, A. Gjedde, S. Holm, N.A. Lassen, and M. Novak, 1997, pp. 419-. Conference paper (Refereed)
    Abstract [en]

    This paper presents a method for automatic matching of brain images using automatic scale selection.

  • 273.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Fagerström, Daniel
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Scale-space with causal time direction (1996). In: ECCV'96 (Cambridge, U.K.), published in Springer Lecture Notes in Computer Science, Vol. 1064, Berlin/Heidelberg: Springer, 1996, pp. 229-240. Conference paper (Refereed)
    Abstract [en]

    This article presents a theory for multi-scale representation of temporal data. Assuming that a real-time vision system should represent the incoming data at different time scales, an additional causality constraint arises compared to traditional scale-space theory—we can only use what has occurred in the past for computing representations at coarser time scales. Based on a previously developed scale-space theory in terms of noncreation of local maxima with increasing scale, a complete classification is given of the scale-space kernels that satisfy this property of non-creation of structure and respect the time direction as causal. It is shown that the cases of continuous and discrete time are inherently different.

  • 274.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Florack, Luc
    Utrecht University.
    Foveal scale-space and the linear increase of receptive field size as a function of eccentricity (1994). Report (Other academic)
    Abstract [en]

    This paper addresses the formulation of a foveal scale-space and its relation to the scaling property of receptive field sizes with eccentricity. It is shown how the notion of a fovea can be incorporated into conventional scale-space theory leading to a foveal log-polar scale-space. Natural assumptions about uniform treatment of structures over scales and finite processing capacity imply a linear increase of minimum receptive field size as a function of eccentricity. These assumptions are similar to the ones used for deriving linear scale-space theory and the Gaussian receptive field model for an idealized visual front-end.

  • 275.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Florack, Luc
    Utrecht University.
    On the decrease of resolution as a function of eccentricity for a foveal vision system (1992). Report (Other academic)
  • 276.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Gårding, Jonas
    Shape from Texture from a Multi-Scale Perspective (1993). In: Fourth International Conference on Computer Vision, 1993, Proceedings: ICCV'93 / [ed] H.-H. Nagel, IEEE conference proceedings, 1993, pp. 683-691. Conference paper (Refereed)
    Abstract [en]

    The problem of scale in shape from texture is addressed. The need for (at least) two scale parameters is emphasized; a local scale describing the amount of smoothing used for suppressing noise and irrelevant details when computing primitive texture descriptors from image data, and an integration scale describing the size of the region in space over which the statistics of the local descriptors are accumulated.

    A novel mechanism for automatic scale selection is used, based on normalized derivatives. It is used for adaptive determination of the two scale parameters in a multi-scale texture descriptor, the windowed second moment matrix, which is defined in terms of Gaussian smoothing, first order derivatives, and non-linear pointwise combinations of these. The same scale-selection method can be used for multi-scale blob detection without any tuning parameters or thresholding.

    The resulting texture description can be combined with various assumptions about surface texture in order to estimate local surface orientation. Two specific assumptions, ``weak isotropy'' and ``constant area'', are explored in more detail. Experiments on real and synthetic reference data with known geometry demonstrate the viability of the approach.
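
    A hedged sketch of the two-scale texture descriptor described above: first-order Gaussian derivatives computed at a local scale, with their outer products accumulated over a Gaussian window at an integration scale. The subsequent surface-orientation estimation under the ``weak isotropy'' or ``constant area'' assumptions is not reproduced; the parameter names are illustrative:

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def windowed_second_moment(image, local_sigma, integration_sigma):
            """Windowed second moment matrix with a local scale for the
            derivatives and an integration scale for the window accumulating
            the statistics; returns a 2x2 matrix per pixel."""
            L = gaussian_filter(image.astype(np.float64), local_sigma)
            Ly, Lx = np.gradient(L)
            mu = np.empty(image.shape + (2, 2))
            mu[..., 0, 0] = gaussian_filter(Lx * Lx, integration_sigma)
            mu[..., 1, 1] = gaussian_filter(Ly * Ly, integration_sigma)
            mixed = gaussian_filter(Lx * Ly, integration_sigma)
            mu[..., 0, 1] = mixed
            mu[..., 1, 0] = mixed
            return mu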

  • 277.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Gårding, Jonas
    Shape-adapted smoothing in estimation of 3-D depth cues from affine distortions of local 2-D brightness structure (1997). In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 15, No. 6, pp. 415-434. Journal article (Refereed)
    Abstract [en]

    This article describes a method for reducing the shape distortions due to scale-space smoothing that arise in the computation of 3-D shape cues using operators (derivatives) defined from scale-space representation. More precisely, we are concerned with a general class of methods for deriving 3-D shape cues from 2-D image data based on the estimation of locally linearized deformations of brightness patterns. This class constitutes a common framework for describing several problems in computer vision (such as shape-from-texture, shape-from-disparity-gradients, and motion estimation) and for expressing different algorithms in terms of similar types of visual front-end operations. It is explained how surface orientation estimates will be biased due to the use of rotationally symmetric smoothing in the image domain. These effects can be reduced by extending the linear scale-space concept into an affine Gaussian scale-space representation and by performing affine shape adaptation of the smoothing kernels. This improves the accuracy of the surface orientation estimates, since the image descriptors, on which the methods are based, will be relative invariant under affine transformations, and the error thus confined to the higher-order terms in the locally linearized perspective transformation. A straightforward algorithm is presented for performing shape adaptation in practice. Experiments on real and synthetic images with known orientation demonstrate that in the presence of moderately high noise levels the accuracy is improved by typically one order of magnitude.

  • 278.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Gårding, Jonas
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Shape-Adapted Smoothing in Estimation of 3-D Depth Cues from Affine Distortions of Local 2-D Brightness Structure (1994). In: Computer Vision — ECCV '94: Third European Conference on Computer Vision, Stockholm, Sweden, May 2–6, 1994, Proceedings, Volume I, 1994, pp. 389-400. Conference paper (Refereed)
    Abstract [en]

    Rotationally symmetric operations in the image domain may give rise to shape distortions. This article describes a way of reducing this effect for a general class of methods for deriving 3-D shape cues from 2-D image data, which are based on the estimation of locally linearized distortion of brightness patterns. By extending the linear scale-space concept into an affine scale-space representation and performing affine shape adaptation of the smoothing kernels, the accuracy of surface orientation estimates derived from texture and disparity cues can be improved by typically one order of magnitude. The reason for this is that the image descriptors, on which the methods are based, will be relative invariant under affine transformations, and the error will thus be confined to the higher-order terms in the locally linearized perspective mapping.

  • 279.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Li, Meng-Xiang
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Automatic generation of break points for MDL based curve classification (1995). In: Scandinavian Conference on Image Analysis: SCIA'95 / [ed] G. Borgefors, 1995, pp. 767-776. Conference paper (Refereed)
    Abstract [en]

    This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multi-scale pre-processing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either ``straight'' or ``curved''. Experiments on real world image data demonstrate the viability of the approach.

  • 280.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Li, Meng-Xiang
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Segmentation and classification of edges using minimum description length approximation and complementary junction cues (1997). In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 67, No. 1, pp. 88-98. Journal article (Refereed)
    Abstract [en]

    This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multiscale preprocessing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either “straight” or “curved.” Experiments on real world image data demonstrate the viability of the approach.
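
    To illustrate the flavour of the MDL decision with a toy criterion (not the coding scheme of the paper): compare the description lengths of a straight (degree-1) and a curved (degree-2) polynomial fit to the edge points, where each model pays a data cost for its residuals plus a per-parameter cost:

        import numpy as np

        def description_length(residuals, n_params, n_points):
            """Crude two-part code length: Gaussian data cost plus a BIC-like
            parameter penalty (a stand-in for the paper's MDL coding)."""
            rss = np.sum(residuals**2) + 1e-12
            return 0.5 * n_points * np.log(rss / n_points) + 0.5 * n_params * np.log(n_points)

        def classify_edge_segment(x, y):
            """Label a segment 'straight' or 'curved' by the shorter description."""
            costs = {}
            for label, degree in (("straight", 1), ("curved", 2)):
                coeffs = np.polyfit(x, y, degree)
                costs[label] = description_length(y - np.polyval(coeffs, x),
                                                  degree + 1, len(x))
            return min(costs, key=costs.get)

        x = np.linspace(0.0, 1.0, 50)
        print(classify_edge_segment(x, 0.3 * x + 0.01))  # straight
        print(classify_edge_segment(x, x**2))            # curved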

  • 281.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Li, Meng-Xiang
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Segmentation and classification of edges using minimum description length approximation and complementary junction cues (1995). In: Theory and Applications of Image Analysis II: Selected Papers from the 9th Scandinavian Conference on Image Analysis, Uppsala, Sweden, 1995 / [ed] Gunilla Borgefors, World Scientific, 1995. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    This article presents a method for segmenting and classifying edges using minimum description length (MDL) approximation with automatically generated break points. A scheme is proposed where junction candidates are first detected in a multi-scale pre-processing step, which generates junction candidates with associated regions of interest. These junction features are matched to edges based on spatial coincidence. For each matched pair, a tentative break point is introduced at the edge point closest to the junction. Finally, these feature combinations serve as input for an MDL approximation method which tests the validity of the break point hypotheses and classifies the resulting edge segments as either ``straight'' or ``curved''. Experiments on real world image data demonstrate the viability of the approach.

  • 282.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lidberg, Per
    Roland, Per
    Karolinska Institutet.
    Analysis of Brain Activation Patterns Using A 3-D Scale-Space Primal Sketch (1997). In: HBM'97, published in Neuroimage, volume 5, number 4, 1997, pp. 393-393. Conference paper (Refereed)
    Abstract [en]

    This paper presents a method for automatically determining the spatial extent and the significance of rCBF changes.

  • 283.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lidberg, Pär
    Roland, Per
    Analysis of brain activation patterns using a 3-D scale-space primal sketch (1999). In: Human Brain Mapping, ISSN 1065-9471, E-ISSN 1097-0193, Vol. 7, No. 3, pp. 166-94. Journal article (Refereed)
    Abstract [en]

    A fundamental problem in brain imaging concerns how to define functional areas consisting of neurons that are activated together as populations. We propose that this issue can be ideally addressed by a computer vision tool referred to as the scale-space primal sketch. This concept has the attractive properties that it allows for automatic and simultaneous extraction of the spatial extent and the significance of regions with locally high activity. In addition, a hierarchical nested tree structure of activated regions and subregions is obtained. The aim of this article is to show how the scale-space primal sketch can be used for automatic determination of the spatial extent and the significance of rCBF changes. Experiments show the result of applying this approach to functional PET data, including a preliminary comparison with two more traditional clustering techniques. Compared to previous approaches, the method overcomes the limitations of performing the analysis at a single scale or assuming specific models of the data.

  • 284.
    Lindeberg, Tony
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    ter Haar Romeny, B.
    Utrecht University.
    Linear Scale-Space I: Basic Theory1994Inngår i: Geometry-Driven Diffusion in Computer Vision, Kluwer Academic Publishers, 1994, s. 1-41Kapittel i bok, del av antologi (Annet vitenskapelig)
    Abstract [en]

    Vision deals with the problem of deriving information about the world from the light reflected from it. Although the active and task-oriented nature of vision is only implicit in this formulation, this view captures several of the essential aspects of vision. As Marr (1982) phrased it in his book Vision, vision is an information processing task, in which an internal representation of information is of utmost importance. Only by representation can information be captured and made available to decision processes. The purpose of a representation is to make certain aspects of the information content explicit, that is, immediately accessible without any need for additional processing.

    This introductory chapter deals with a fundamental aspect of early image representation---the notion of scale. As Koenderink (1984) emphasizes, the problem of scale must be faced in any imaging situation. An inherent property of objects in the world and details in images is that they only exist as meaningful entities over certain ranges of scale. A simple example of this is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or the kilometer level. At those scales it is more relevant to talk about the molecules that form the leaves of the tree, or the forest in which the tree grows. Consequently, a multi-scale representation is of crucial importance if one aims at describing the structure of the world, or more specifically the structure of projections of the three-dimensional world onto two-dimensional images.

    The need for multi-scale representation is well understood, for example, in cartography; maps are produced at different degrees of abstraction. A map of the world contains the largest countries and islands, and possibly, some of the major cities, whereas towns and smaller islands appear at first in a map of a country. In a city guide, the level of abstraction is changed considerably to include streets and buildings etc. In other words, maps constitute symbolic multi-scale representations of the world around us, although constructed manually and with very specific purposes in mind.

    To compute any type of representation from image data, it is necessary to extract information, and hence interact with the data using certain operators. Some of the most fundamental problems in low-level vision and image analysis concern: what operators to use, where to apply them, and how large they should be. If these problems are not appropriately addressed, the task of interpreting the output results can be very hard. Ultimately, the task of extracting information from real image data is severely influenced by the inherent measurement problem that real-world structures, in contrast to certain ideal mathematical entities, such as “points” or “lines”, appear in different ways depending upon the scale of observation.

    Phrasing the problem in this way shows the intimate relation to physics. Any physical observation by necessity has to be done through some finite aperture, and the result will, in general, depend on the aperture of observation. This holds for any device that registers physical entities from the real world, including a vision system based on brightness data. Whereas constant-size aperture functions may be sufficient in many (controlled) physical applications, e.g., fixed measurement devices, and the aperture functions of the basic sensors in a camera (or retina) may have to be determined a priori because of practical design constraints, it is far from clear that registering data at a fixed level of resolution is sufficient. A vision system for handling objects of different sizes and at different distances needs a way to control the scale(s) at which the world is observed.

    The goal of this chapter is to review some fundamental results concerning a framework known as scale-space that has been developed by the computer vision community for controlling the scale of observation and representing the multi-scale nature of image data. Starting from a set of basic constraints (axioms) on the first stages of visual processing, it will be shown that under reasonable conditions it is possible to substantially restrict the class of possible operations and to derive a (unique) set of weighting profiles for the aperture functions. In fact, the operators that are obtained bear qualitative similarities to receptive fields at the very earliest stages of (human) visual processing (Koenderink 1992). We shall mainly be concerned with the operations that are performed directly on raw image data by the processing modules that are collectively termed the visual front-end. The purpose of this processing is to register the information on the retina, and to make important aspects of it explicit for use in later-stage processes. If the operations are to be local, they have to preserve the topology of the retina; for this reason the processing can be termed retinotopic processing.

    Early visual operations. An obvious problem concerns what information should be extracted and what computations should be performed at these levels. Is any type of operation feasible? An axiomatic approach that has been adopted to restrict the space of possibilities is to assume that the very first stages of visual processing should be able to function without any direct knowledge of what can be expected to be in the scene. As a consequence, the first stages of visual processing should be as uncommitted as possible and make as few irreversible decisions or choices as possible.

    The Euclidean nature of the world around us and the perspective mapping onto images impose natural constraints on a visual system. Objects move rigidly, the illumination varies, the size of objects at the retina changes with the depth from the eye, view directions may change etc. Hence, it is natural to require early visual operations to be unaffected by certain primitive transformations (e.g. translations, rotations, and grey-scale transformations). In other words, the visual system should extract properties that are invariant with respect to these transformations.

    As we shall see below, these constraints lead to operations that correspond to spatio-temporal derivatives, which are then used for computing (differential) geometric descriptions of the incoming data flow. Based on the output of these operations, in turn, a large number of feature detectors can be expressed, as well as modules for computing surface shape.

    The subject of this chapter is to present a tutorial overview of the historical and current insights of linear scale-space theories as a paradigm for describing the structure of scalar images and as a basis for early vision. For other introductory texts on scale-space, see the monographs by Lindeberg (1991, 1994) and Florack (1993) as well as the overview articles by ter Haar Romeny and Florack (1993) and Lindeberg (1994).
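    As a minimal sketch of the construction this chapter reviews, the scale-space family can be sampled by Gaussian smoothing at increasing variance t (a Python illustration using scipy, not the chapter's own code):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gaussian_scale_space(image, ts):
        """Sample L(.; t) = g(.; t) * f at the scale levels in ts,
        where g is the Gaussian aperture with variance t = sigma**2."""
        return [gaussian_filter(image, sigma=np.sqrt(t)) for t in ts]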

  • 285.
    Lindeberg, Tony
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    ter Haar Romeny, B.
    Utrecht University.
    Linear Scale-Space II: Early visual operations1994Inngår i: Geometry-Driven Diffusion in Vision, Kluwer Academic Publishers, 1994, s. 43-77Kapittel i bok, del av antologi (Annet vitenskapelig)
    Abstract [en]

    Vision deals with the problem of deriving information about the world from the light reflected from it. Although the active and task-oriented nature of vision is only implicit in this formulation, this view captures several of the essential aspects of vision. As Marr (1982) phrased it in his book Vision, vision is an information processing task, in which an internal representation of information is of utmost importance. Only by representation can information be captured and made available to decision processes. The purpose of a representation is to make certain aspects of the information content explicit, that is, immediately accessible without any need for additional processing.

    This introductory chapter deals with a fundamental aspect of early image representation---the notion of scale. As Koenderink (1984) emphasizes, the problem of scale must be faced in any imaging situation. An inherent property of objects in the world and details in images is that they only exist as meaningful entities over certain ranges of scale. A simple example of this is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or the kilometer level. At those scales it is more relevant to talk about the molecules that form the leaves of the tree, or the forest in which the tree grows. Consequently, a multi-scale representation is of crucial importance if one aims at describing the structure of the world, or more specifically the structure of projections of the three-dimensional world onto two-dimensional images.

    The need for multi-scale representation is well understood, for example, in cartography; maps are produced at different degrees of abstraction. A map of the world contains the largest countries and islands, and possibly, some of the major cities, whereas towns and smaller islands appear at first in a map of a country. In a city guide, the level of abstraction is changed considerably to include streets and buildings etc. In other words, maps constitute symbolic multi-scale representations of the world around us, although constructed manually and with very specific purposes in mind.

    To compute any type of representation from image data, it is necessary to extract information, and hence interact with the data using certain operators. Some of the most fundamental problems in low-level vision and image analysis concern: what operators to use, where to apply them, and how large they should be. If these problems are not appropriately addressed, the task of interpreting the output results can be very hard. Ultimately, the task of extracting information from real image data is severely influenced by the inherent measurement problem that real-world structures, in contrast to certain ideal mathematical entities, such as “points” or “lines”, appear in different ways depending upon the scale of observation.

    Phrasing the problem in this way shows the intimate relation to physics. Any physical observation by necessity has to be done through some finite aperture, and the result will, in general, depend on the aperture of observation. This holds for any device that registers physical entities from the real world, including a vision system based on brightness data. Whereas constant-size aperture functions may be sufficient in many (controlled) physical applications, e.g., fixed measurement devices, and the aperture functions of the basic sensors in a camera (or retina) may have to be determined a priori because of practical design constraints, it is far from clear that registering data at a fixed level of resolution is sufficient. A vision system for handling objects of different sizes and at different distances needs a way to control the scale(s) at which the world is observed.

    The goal of this chapter is to review some fundamental results concerning a framework known as scale-space that has been developed by the computer vision community for controlling the scale of observation and representing the multi-scale nature of image data. Starting from a set of basic constraints (axioms) on the first stages of visual processing, it will be shown that under reasonable conditions it is possible to substantially restrict the class of possible operations and to derive a (unique) set of weighting profiles for the aperture functions. In fact, the operators that are obtained bear qualitative similarities to receptive fields at the very earliest stages of (human) visual processing (Koenderink 1992). We shall mainly be concerned with the operations that are performed directly on raw image data by the processing modules that are collectively termed the visual front-end. The purpose of this processing is to register the information on the retina, and to make important aspects of it explicit for use in later-stage processes. If the operations are to be local, they have to preserve the topology of the retina; for this reason the processing can be termed retinotopic processing.

    Early visual operations. An obvious problem concerns what information should be extracted and what computations should be performed at these levels. Is any type of operation feasible? An axiomatic approach that has been adopted to restrict the space of possibilities is to assume that the very first stages of visual processing should be able to function without any direct knowledge of what can be expected to be in the scene. As a consequence, the first stages of visual processing should be as uncommitted as possible and make as few irreversible decisions or choices as possible.

    The Euclidean nature of the world around us and the perspective mapping onto images impose natural constraints on a visual system. Objects move rigidly, the illumination varies, the size of objects at the retina changes with the depth from the eye, view directions may change etc. Hence, it is natural to require early visual operations to be unaffected by certain primitive transformations (e.g. translations, rotations, and grey-scale transformations). In other words, the visual system should extract properties that are invariant with respect to these transformations.

    As we shall see below, these constraints lead to operations that correspond to spatio-temporal derivatives, which are then used for computing (differential) geometric descriptions of the incoming data flow. Based on the output of these operations, in turn, a large number of feature detectors can be expressed, as well as modules for computing surface shape.

    The subject of this chapter is to present a tutorial overview of the historical and current insights of linear scale-space theories as a paradigm for describing the structure of scalar images and as a basis for early vision. For other introductory texts on scale-space, see the monographs by Lindeberg (1991, 1994) and Florack (1993) as well as the overview articles by ter Haar Romeny and Florack (1993) and Lindeberg (1994).
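    A minimal sketch of one such early visual operation, assuming Gaussian derivative apertures (an illustration, not the chapter's own code): the rotationally invariant gradient magnitude at scale t, computed by letting differentiation commute with Gaussian smoothing.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gradient_magnitude(image, t):
        """First-order differential invariant |grad L| at scale t."""
        s = np.sqrt(t)
        Lx = gaussian_filter(image, s, order=(0, 1))  # Gaussian derivative along x
        Ly = gaussian_filter(image, s, order=(1, 0))  # Gaussian derivative along y
        return np.hypot(Lx, Ly)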

  • 286.
    Liu, Du
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Teknisk informationsvetenskap.
    Flierl, Markus
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling (Stängd 130101). KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsteori.
    Video coding using multi-reference motion-adaptive transforms based on graphs2016Inngår i: 2016 IEEE 12th Image, Video, and Multidimensional Signal Processing Workshop, IVMSP 2016, IEEE, 2016Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The purpose of the work is to produce jointly coded frames for efficient video coding. We use motion-adaptive transforms in the temporal domain to generate the temporal subbands. The motion information is used to form graphs for transform construction. In our previous work, the motion-adaptive transform allowed only one reference pixel to become the lowband coefficient. In this paper, we extend the motion-adaptive transform such that it permits multiple references and produces multiple lowband coefficients, which can be used in the case of bidirectional or multi-hypothesis motion estimation. The multi-reference motion-adaptive transform (MRMAT) is always orthonormal; thus, the energy is preserved by the transform. We compare MRMAT and the motion-compensated orthogonal transform (MCOT) [1], while HEVC intra coding is used to encode the temporal subbands. The experimental results show that MRMAT outperforms MCOT by about 0.6 dB.
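    The orthonormality property mentioned above can be illustrated with a generic stand-in (Python/numpy; the paper's MRMAT is constructed from motion graphs, which is not reproduced here): any orthonormal temporal transform preserves the energy of a pixel trajectory.

    import numpy as np

    rng = np.random.default_rng(0)
    T, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # a random orthonormal 8x8 transform
    trajectory = rng.standard_normal(8)               # one motion trajectory over 8 frames
    coeffs = T @ trajectory                           # temporal subband coefficients
    # Parseval: energy is identical before and after the transform.
    assert np.isclose(np.sum(trajectory ** 2), np.sum(coeffs ** 2))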

  • 287.
    Liu, Jia
    et al.
    KTH, Skolan för industriell teknik och management (ITM), Industriell ekonomi och organisation (Inst.), Industriell Management.
    Li, Z.
    Hemani, Ahmed
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Elektroniksystem.
    Design of evaluation platform of machine vision for portable wireless terminals2011Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This paper presents an evaluation platform for machine vision algorithms. The platform is constructed around a DM6437 DSP processor and image input/output circuits. Image processing algorithms used for machine vision can be executed on the platform. With a data-flow graph (DFG) model of an algorithm, the algorithm architecture can be built conveniently for programming and analysis. As an example, an image segmentation algorithm has been modeled and executed on the platform. The results show that the platform is useful for algorithm analysis and can serve as a design reference for comparison with other implementation systems.

  • 288. Loianno, G.
    et al.
    Lippiello, V.
    Fischione, Carlo
    KTH, Skolan för elektro- och systemteknik (EES), Reglerteknik. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Siciliano, B.
    Visual and inertial multi-rate data fusion for motion estimation via Pareto-optimization2013Inngår i: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE , 2013, s. 3993-3999Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Motion estimation is an open research field in control and robotic applications. Sensor fusion algorithms are generally used to achieve an accurate estimate of the vehicle motion by combining heterogeneous sensor measurements with different statistical characteristics. In this paper, a new method that combines measurements provided by an inertial sensor and a vision system is presented. Compared to classical model-based techniques, the method relies on a Pareto optimization that trades off the statistical properties of the measurements. The proposed technique is evaluated in simulations, in terms of computational requirements and estimation accuracy, against a classical Kalman filter approach. It is shown that the proposed method gives improved estimation accuracy at the cost of a slightly increased computational complexity.
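    A toy scalarization can illustrate the idea of Pareto-optimal fusion weights (a hypothetical two-sensor setting; the paper's formulation is more general): for each trade-off parameter, the weight minimizing a weighted sum of the two error-variance objectives has a closed form obtained by setting the derivative of the quadratic cost to zero.

    import numpy as np

    def pareto_fusion_weight(var_vis, var_imu, lam):
        """Minimize lam * var_vis * w**2 + (1 - lam) * var_imu * (1 - w)**2
        over the fusion weight w; sweeping lam in (0, 1) traces a Pareto
        front of trade-offs between the two sensors."""
        return (1 - lam) * var_imu / (lam * var_vis + (1 - lam) * var_imu)

    # Fused estimate for one trade-off point (x_vision, x_imu hypothetical):
    w = pareto_fusion_weight(var_vis=0.04, var_imu=0.25, lam=0.5)
    # x_fused = w * x_vision + (1 - w) * x_imu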

  • 289. Lu, G.
    et al.
    He, J.
    Yan, J.
    Li, Haibo
    KTH, Skolan för datavetenskap och kommunikation (CSC), Medieteknik och interaktionsdesign, MID. Nanjing University of Posts and Telecommunications.
    Convolutional neural network for facial expression recognition2016Inngår i: Journal of Nanjing University of Posts and Telecommunications, ISSN 1673-5439, Vol. 36, nr 1, s. 16-22Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    To avoid the complex explicit feature extraction process of traditional expression recognition, a convolutional neural network (CNN) for facial expression recognition is proposed. Firstly, the facial expression image is normalized and the implicit features are extracted using trainable convolution kernels. Then, max pooling is used to reduce the dimensions of the extracted implicit features. Finally, a softmax classifier is used to classify the facial expressions of the test samples. The experiment is carried out on the CK+ facial expression database using a graphics processing unit (GPU). Experimental results show the performance and the generalization ability of the CNN for facial expression recognition.
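    The described pipeline maps onto a small network of this form (a hedged PyTorch sketch assuming 48x48 grey-scale inputs and 7 expression classes; the paper's exact configuration may differ):

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=5, padding=2),  # trainable convolution kernels
        nn.ReLU(),
        nn.MaxPool2d(2),                             # max pooling reduces dimensionality
        nn.Conv2d(32, 64, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(64 * 12 * 12, 7),                  # class logits
    )
    loss_fn = nn.CrossEntropyLoss()                  # applies the softmax internally

    logits = model(torch.randn(8, 1, 48, 48))        # a batch of normalized face images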

  • 290.
    Lundberg, Carl
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Christensen, Henrik I.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Hedström, Andreas
    KTH, Skolan för datavetenskap och kommunikation (CSC), Numerisk Analys och Datalogi, NADA. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    The use of robots in harsh and unstructured field applications2005Inngår i: 2005 IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN), NEW YORK, NY: IEEE , 2005, s. 143-150Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Robots have the potential to be a significant aid in high-risk, unstructured and stressful situations such as those experienced by police, fire brigades, rescue workers and the military. In this project we have explored the abilities of today's robot technology in the mentioned fields. This was done by studying the users, identifying scenarios where a robot could be used, and implementing a robot system for these cases. We have concluded that highly portable field robots are emerging as an available technology, but that human-robot interaction is currently a major limiting factor of today's systems. Further, we have found that operational protocols, stating how to use the robots, have to be designed in order to make robots an effective tool in harsh and unstructured field environments.

  • 291. Lundberg, I.
    et al.
    Björkman, Mårten
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Ögren, Petter
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Intrinsic camera and hand-eye calibration for a robot vision system using a point marker2015Inngår i: IEEE-RAS International Conference on Humanoid Robots, IEEE Computer Society, 2015, s. 59-66Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Accurate robot camera calibration is a requirement for vision-guided robots to perform precision assembly tasks. In this paper, we address the problem of performing intrinsic camera and hand-eye calibration on a robot vision system using a single point marker. This removes the need for bulky special-purpose calibration objects, and also facilitates on-line accuracy checking and re-calibration when needed, without altering the robot's production environment. The proposed solution provides a calibration routine that produces high-quality results on par with the robot accuracy and completes a calibration in 3 minutes without need for manual intervention. We also present a method for automatic testing of camera calibration accuracy. Results from experimental verification on the dual-arm concept robot FRIDA are presented.
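    For the hand-eye part, given paired robot and camera poses, OpenCV's solver for the classical AX = XB formulation can serve as a reference point (a sketch under that assumption; the paper's single-point-marker routine also estimates the intrinsics, which is not shown):

    import cv2

    def hand_eye(R_gripper2base, t_gripper2base, R_target2cam, t_target2cam):
        """Estimate the fixed camera-to-gripper transform from >= 3
        paired robot/camera motions (Tsai-Lenz method)."""
        return cv2.calibrateHandEye(R_gripper2base, t_gripper2base,
                                    R_target2cam, t_target2cam,
                                    method=cv2.CALIB_HAND_EYE_TSAI)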

  • 292.
    Luo, Guoliang
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Bergström, Niklas
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Ek, Carl Henrik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Representing actions with Kernels2011Inngår i: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011, s. 2028-2035Konferansepaper (Fagfellevurdert)
    Abstract [en]

    A long-standing research goal is to create robots capable of interacting with humans in dynamic environments. To realise this, a robot needs to understand and interpret the underlying meaning and intentions of a human action through a model of its sensory data. The visual domain provides a rich description of the environment, and data is readily available in most systems through inexpensive cameras. However, such data is very high-dimensional and extremely redundant, making modeling challenging. Recently there has been a significant interest in semantic modeling from visual stimuli. Even though results are encouraging, available methods are unable to perform robustly in real-world scenarios. In this work we present a system for action modeling from visual data by proposing a new and principled interpretation for representing semantic information. The representation is integrated with a real-time segmentation. The method is robust and flexible, making it applicable for modeling in a realistic interaction scenario which demands handling noisy observations and requires real-time performance. We provide extensive evaluation and show significant improvements compared to the state of the art.

  • 293.
    Maboudi Afkham, Heydar
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Animal Recognition Using Joint Visual Vocabulary2009Independent thesis Advanced level (degree of Master (One Year)), 20 poäng / 30 hpOppgave
    Abstract [en]

    This thesis presents a series of experiments on recognizing animals in complex scenes. Unlike the objects usually used for recognition tasks (cars, airplanes, ...), animals appear in a wide variety of poses and shapes in outdoor images. To perform this task, a dataset of outdoor images is needed. Among the available datasets some contain animal classes, but as discussed in this thesis they do not capture the variations needed for realistic analysis. To overcome this problem, a new extensive dataset, KTH-animals, containing realistic images of animals in complex natural environments, is introduced. Methods designed on other datasets do not perform well on the animals dataset due to its larger variations. One of the methods that showed promising results on one of those datasets was applied to KTH-animals and shown to fail to encode the large variations in this dataset.

    To familiarize the reader with computer vision concepts and the mathematical background, a chapter of this thesis is dedicated to this matter. It presents a brief review of texture descriptors and several classification methods, together with the mathematical and statistical algorithms they require.

    To analyze the images of the dataset, two different methodologies are introduced in this thesis. In the first methodology, fuzzy classifiers, the images are analyzed solely based on the animals' skin texture. To this end, an accurate manual segmentation of the images is provided. The skin texture is judged using many different features, and the results are combined with each other using fuzzy classifiers. Since the assumption of neglecting the background information is unrealistic, joint visual vocabularies are introduced.

    Joint visual vocabularies is a method for visual object categorization based on encoding the joint textural information in objects and the surrounding background, requiring no segmentation during recognition. The framework can be used together with various learning techniques and model representations. Here we use it with simple probabilistic models and with more complex representations obtained using Support Vector Machines. We show that our approach provides good recognition performance on complex problems for which some of the existing methods have difficulties.

    The achievements of this thesis are: a challenging database for animal recognition; a review of previous work and the related mathematical background; a texture feature evaluation on the KTH-animals dataset; the introduction of a method for object recognition based on joint statistics over the image; and the application of model representations of different complexity within the same classification framework, from simple probabilistic models to more complex ones based on Support Vector Machines.

  • 294.
    Maboudi Afkham, Heydar
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Improving Image Classification Performance using Joint Feature Selection2014Doktoravhandling, monografi (Annet vitenskapelig)
    Abstract [en]

    In this thesis, we focus on the problem of image classification and investigate how its performance can be systematically improved. Improving the performance of different computer vision methods has been the subject of many studies. While different studies take different approaches to achieve this improvement, in this thesis we address this problem by investigating the relevance of the statistics collected from the image.

    We propose a framework for gradually improving the quality of an already existing image descriptor. In our studies, we employ a descriptor which is composed of the responses of a series of discriminative components for summarizing each image. As we will show, this descriptor has an ideal form in which all categories become linearly separable. While reaching this form is not possible, we will argue that by replacing a small fraction of these components, it is possible to obtain a descriptor which is, on average, closer to this ideal form. To do so, we initially identify which components do not contribute to the quality of the descriptor and replace them with more robust components. As we will show, this replacement has a positive effect on the quality of the descriptor.

    While there are many ways of obtaining more robust components, we introduce a joint feature selection problem to obtain image features that retain class-discriminative properties while simultaneously generalising over within-class variations. Our approach is based on the concept of a joint feature, where several small features are combined in a spatial structure. The proposed framework automatically learns the structure of the joint constellations in a class-dependent manner, improving the generalisation and discrimination capabilities of the local descriptor while still retaining a low-dimensional representation.

    The joint feature selection problem discussed in this thesis belongs to a specific class of latent variable models that assumes each labeled sample is associated with a set of different features, with no prior knowledge of which feature is the most relevant. Deformable Part Models (DPM) can be seen as good examples of such models. These models are usually considered expensive to train and very sensitive to initialization. Here, we focus on the learning of such models by introducing a topological framework and show how it is possible both to reduce the learning complexity and to produce more robust decision boundaries. We will also argue how our framework can be used for producing robust decision boundaries without exploiting the dataset bias or relying on accurate annotations.

    To examine the hypothesis of this thesis, we evaluate different parts of our framework on several challenging datasets and demonstrate how our framework is capable of gradually improving the performance of image classification by collecting more robust statistics from the image and improving the quality of the descriptor.

  • 295.
    Madry, Marianna
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Maboudi Afkham, Heydar
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Ek, Carl Henrik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Extracting essential local object characteristics for 3D object categorization2013Inngår i: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE conference proceedings, 2013, s. 2240-2247Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Most object classes share a considerable amount of local appearance and often only a small number of features are discriminative. The traditional approach to represent an object is based on a summarization of the local characteristics by counting the number of feature occurrences. In this paper we propose the use of a recently developed technique for summarizations that, rather than looking into the quantity of features, encodes their quality to learn a description of an object. Our approach is based on extracting and aggregating only the essential characteristics of an object class for a task. We show how the proposed method significantly improves on previous work in 3D object categorization. We discuss the benefits of the method in other scenarios such as robot grasping. We provide extensive quantitative and qualitative experiments comparing our approach to the state of the art to justify the described approach.

  • 296.
    Madry, Marianna
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Song, Dan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Ek, Carl Henrik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    "Robot, bring me something to drink from": object representation for transferring task specific grasps2013Inngår i: In IEEE International Conference on Robotics and Automation (ICRA 2012), Workshop on Semantic Perception, Mapping and Exploration (SPME),  St. Paul, MN, USA, May 13, 2012, 2013Konferansepaper (Fagfellevurdert)
    Abstract [en]

    In this paper, we present an approach for task-specific object representation which facilitates transfer of grasp knowledge from a known object to a novel one. Our representation encompasses: (a) several visual object properties, (b) object functionality and (c) task constraints in order to provide a suitable goal-directed grasp. We compare various features describing complementary object attributes to evaluate the balance between the discrimination and generalization properties of the representation. The experimental setup is a scene containing multiple objects. Individual object hypotheses are first detected, categorized and then used as the input to a grasp reasoning system that encodes the task information. Our approach not only allows finding objects in a real-world scene that afford a desired task, but also generating and successfully transferring task-based grasps within and across object categories.

  • 297.
    Madry, Marianna
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Song, Dan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    From object categories to grasp transfer using probabilistic reasoning2012Inngår i: 2012 IEEE International Conference on Robotics and Automation (ICRA), IEEE Computer Society, 2012, s. 1716-1723Konferansepaper (Fagfellevurdert)
    Abstract [en]

    In this paper we address the problem of grasp generation and grasp transfer between objects using categorical knowledge. The system is built upon i) an active scene segmentation module, capable of generating object hypotheses and segmenting them from the background in real time, ii) an object categorization system using integration of 2D and 3D cues, and iii) a probabilistic grasp reasoning system. Individual object hypotheses are first generated, categorized and then used as the input to a grasp generation and transfer system that encodes task, object and action properties. The experimental evaluation compares individual 2D and 3D categorization approaches with the integrated system, and demonstrates the usefulness of the categorization in task-based grasping and grasp transfer.

  • 298.
    Mahbod, A.
    et al.
    Romania.
    Ellinger, I.
    Romania.
    Ecker, R.
    Romania.
    Smedby, Örjan
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Medicinsk avbildning.
    Wang, Chunliang
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Medicinsk avbildning.
    Breast Cancer Histological Image Classification Using Fine-Tuned Deep Network Fusion2018Inngår i: 15th International Conference on Image Analysis and Recognition, ICIAR 2018, Springer, 2018, s. 754-762Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Breast cancer is the most common cancer type in women worldwide. Histological evaluation of breast biopsies is a challenging task even for experienced pathologists. In this paper, we propose a fully automatic method to classify breast cancer histological images into four classes, namely normal, benign, in situ carcinoma and invasive carcinoma. The proposed method takes normalized hematoxylin and eosin stained images as input and gives the final prediction by fusing the output of two residual neural networks (ResNet) of different depth. These ResNets were first pre-trained on ImageNet images, and then fine-tuned on breast histological images. We found that our approach outperformed a previously published method by a large margin when applied to the BioImaging 2015 challenge dataset, yielding an accuracy of 97.22%. Moreover, the same approach provided an excellent classification performance, with an accuracy of 88.50%, when applied to the ICIAR 2018 grand challenge dataset using 5-fold cross-validation.
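    The fusion step can be sketched as follows (a hedged PyTorch example with resnet18/resnet50 as stand-ins for the two depths, averaged class probabilities as the fusion rule, and a recent torchvision assumed; the paper's exact networks may differ):

    import torch
    import torchvision.models as models

    def histology_net(ctor):
        net = ctor(weights="IMAGENET1K_V1")               # ImageNet pre-training
        net.fc = torch.nn.Linear(net.fc.in_features, 4)   # 4 histology classes
        return net

    net_a, net_b = histology_net(models.resnet18), histology_net(models.resnet50)
    net_a.eval(); net_b.eval()

    x = torch.randn(2, 3, 224, 224)                       # normalized H&E image batch
    with torch.no_grad():
        probs = (torch.softmax(net_a(x), 1) + torch.softmax(net_b(x), 1)) / 2
    prediction = probs.argmax(dim=1)                      # fused class prediction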

  • 299.
    Maki, Atsuto
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Bretzner, Lars
    Eklundh, Jan-Olof
    Local Fourier Phase and Disparity Estimates: An Analytical Study1995Inngår i: International Conference on Computer Analysis of Images and Patterns, 1995, Vol. 970, s. 868-873Konferansepaper (Fagfellevurdert)
  • 300.
    Maki, Atsuto
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Nordlund, Peter
    Eklundh, Jan-Olof
    A computational model of depth-based attention1996Inngår i: International Conference on Pattern Recognition, 1996, s. 734-739Konferansepaper (Fagfellevurdert)