Results 1 - 50 of 123
  • 1. Almansa, A.
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale selection. 2000. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 9, no. 12, pp. 2027-2042. Journal article (Refereed)
    Abstract [en]

    This work presents two mechanisms for processing fingerprint images: shape-adapted smoothing based on second moment descriptors and automatic scale selection based on normalized derivatives. The shape adaptation procedure adapts the smoothing operation to the local ridge structures, which allows interrupted ridges to be joined without destroying essential singularities such as branching points and enforces continuity of their directional fields. The scale selection procedure estimates local ridge width and adapts the amount of smoothing to the local amount of noise. In addition, a ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model, and is used for spreading the results of shape adaptation into noisy areas. The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. The result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image. We propose that these general techniques should be of interest to developers of automatic fingerprint identification systems as well as in other applications of processing related types of imagery.

  • 2. Almansa, Andrés
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Enhancement of Fingerprint Images by Shape-Adapted Scale-Space Operators. 1996. In: Gaussian Scale-Space Theory. Part I: Proceedings of PhD School on Scale-Space Theory (Copenhagen, Denmark) May 1996 / [ed] J. Sporring, M. Nielsen, L. Florack, and P. Johansen, Springer Science+Business Media B.V., 1996, pp. 21-30. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    This work presents a novel technique for preprocessing fingerprint images. The method is based on the measurements of second moment descriptors and shape adaptation of scale-space operators with automatic scale selection (Lindeberg 1994). This procedure, which has been successfully used in the context of shape-from-texture and shape from disparity gradients, has several advantages when applied to fingerprint image enhancement, as observed by (Weickert 1995). For example, it is capable of joining interrupted ridges, and enforces continuity of their directional fields.

    In this work, the above-mentioned general ideas are applied and extended in the following ways: Two methods for estimating local ridge width are explored and tuned to the problem of fingerprint enhancement. A ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model. This information is used for guiding a scale-selection mechanism, and for spreading the results of shape adaptation into noisy areas.

    The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. To a large extent, the scheme has the desirable property of joining interrupted lines without destroying essential singularities such as branching points. Thus, the result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image.

    A detailed experimental evaluation is presented, including a comparison with other techniques. We propose that the techniques presented provide mechanisms of interest to developers of automatic fingerprint identification systems.

  • 3. Björkman, Eva
    et al.
    Zagal, Juan Cristobal
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Roland, Per E.
    Evaluation of design options for the scale-space primal sketch analysis of brain activation images. 2000. In: HBM'00, published in Neuroimage, Vol. 11, no. 5, 2000, p. 656. Conference paper (Refereed)
    Abstract [en]

    A key issue in brain imaging concerns how to detect the functionally activated regions from PET and fMRI images. In earlier work, it has been shown that the scale-space primal sketch provides a useful tool for such analysis [1]. The method includes presmoothing with different filter widths and automatic estimation of the spatial extent of the activated regions (blobs).

    The purpose is to present two modifications of the scale-space primal sketch, as well as a quantitative evaluation which shows that these modifications improve the performance, measured as the separation between blob descriptors extracted from PET images and from noise images. This separation is essential for future work of associating a statistical p-value with the scale-space blob descriptors.

  • 4.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Hand-gesture recognition using multi-scale colour features, hierarchical features and particle filtering. 2002. In: Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002. Proceedings, IEEE conference proceedings, 2002, pp. 63-74. Conference paper (Refereed)
    Abstract [en]

    This paper presents algorithms and a prototype system for hand tracking and hand posture recognition. Hand postures are represented in terms of hierarchies of multi-scale colour image features at different scales, with qualitative inter-relations in terms of scale, position and orientation. In each image, detection of multi-scale colour features is performed. Hand states are then simultaneously detected and tracked using particle filtering, with an extension of layered sampling referred to as hierarchical layered sampling. Experiments are presented showing that the performance of the system is substantially improved by performing feature detection in colour space and including a prior with respect to skin colour. These components have been integrated into a real-time prototype system, applied to a test problem of controlling consumer electronics using hand gestures. In a simplified demo scenario, this system has been successfully tested by participants at two fairs during 2001.

  • 5.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lenman, S.
    Sundblad, Y.
    A Prototype System for Computer Vision Based Human Computer Interaction. 2001. Report (Other academic)
  • 6.
    Bretzner, Lars
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Feature Tracking with Automatic Selection of Spatial Scales. 1998. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 71, no. 3, pp. 385-393. Journal article (Refereed)
    Abstract [en]

    When observing a dynamic world, the size of image structures may vary over time. This article emphasizes the need for including explicit mechanisms for automatic scale selection in feature tracking algorithms in order to: (i) adapt the local scale of processing to the local image structure, and (ii) adapt to the size variations that may occur over time. The problems of corner detection and blob detection are treated in detail, and a combined framework for feature tracking is presented. The integrated tracking algorithm overcomes some of the inherent limitations of exposing fixed-scale tracking methods to image sequences in which the size variations are large. It is also shown how the stability over time of scale descriptors can be used as a part of a multi-cue similarity measure for matching. Experiments on real-world sequences are presented showing the performance of the algorithm when applied to (individual) tracking of corners and blobs.
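
    The scale selection idea mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: it assumes a synthetic Gaussian blob, a sampled Gaussian kernel, and the scale-normalized second derivative t * |L_xx| as the quantity to maximize over scale. All function names (`gaussian_kernel`, `smooth_at`, `normalized_second_derivative`, `select_scale`) are hypothetical.

```python
import math

def gaussian_kernel(t):
    """Sampled, normalised 1-D Gaussian with variance t, truncated at about 4 sigma."""
    radius = int(4 * math.sqrt(t)) + 1
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_at(signal, index, kernel):
    """Gaussian-smoothed signal value at one index (signal treated as zero outside)."""
    radius = len(kernel) // 2
    acc = 0.0
    for k, w in enumerate(kernel):
        j = index + k - radius
        if 0 <= j < len(signal):
            acc += w * signal[j]
    return acc

def normalized_second_derivative(signal, index, t):
    """Scale-normalised response t * |L_xx| using a central second difference."""
    kernel = gaussian_kernel(t)
    left, mid, right = (smooth_at(signal, index + d, kernel) for d in (-1, 0, 1))
    return t * abs(left - 2.0 * mid + right)

def select_scale(signal, index, scales):
    """Pick the scale (variance) with the strongest normalised response."""
    return max(scales, key=lambda t: normalized_second_derivative(signal, index, t))

# Assumed test signal: a Gaussian blob of variance t0 centred in the signal.
t0 = 16.0
centre = 200
signal = [math.exp(-((x - centre) ** 2) / (2.0 * t0)) for x in range(401)]
scales = [4.0, 8.0, 16.0, 32.0, 64.0, 128.0]
selected = select_scale(signal, centre, scales)
```

    For this one-dimensional Gaussian blob the continuous response is proportional to t / (t0 + t)^(3/2), which peaks at t = 2 * t0, so the selected scale grows with the size of the structure. A tracker could re-run such a selection in every frame to follow size changes over time.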

  • 7.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Feature tracking with automatic selection of spatial scales. 1998. Report (Other academic)
    Abstract [en]

    When observing a dynamic world, the size of image structures may vary over time. This article emphasizes the need for including explicit mechanisms for automatic scale selection in feature tracking algorithms in order to: (i) adapt the local scale of processing to the local image structure, and (ii) adapt to the size variations that may occur over time.

    The problems of corner detection and blob detection are treated in detail, and a combined framework for feature tracking is presented in which the image features at every time moment are detected at locally determined and automatically selected scales. A useful property of the scale selection method is that the scale levels selected in the feature detection step reflect the spatial extent of the image structures. Thereby, the integrated tracking algorithm has the ability to adapt to spatial as well as temporal size variations, and can in this way overcome some of the inherent limitations of exposing fixed-scale tracking methods to image sequences in which the size variations are large.

    In the composed tracking procedure, the scale information is used for two additional major purposes: (i) for defining local regions of interest for searching for matching candidates as well as setting the window size for correlation when evaluating matching candidates, and (ii) stability over time of the scale and significance descriptors produced by the scale selection procedure are used for formulating a multi-cue similarity measure for matching.

    Experiments on real-world sequences are presented showing the performance of the algorithm when applied to (individual) tracking of corners and blobs. Specifically, comparisons with fixed-scale tracking methods are included as well as illustrations of the increase in performance obtained by using multiple cues in the feature matching step.

  • 8.
    Bretzner, Lars
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    On the handling of spatial and temporal scales in feature tracking. 1997. In: Scale-Space Theory in Computer Vision: First International Conference, Scale-Space'97 Utrecht, The Netherlands, July 2–4, 1997 Proceedings, Springer Berlin/Heidelberg, 1997, pp. 128-139. Conference paper (Refereed)
  • 9.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Qualitative Multi-Scale Feature Hierarchies for Object Tracking. 2000. In: Journal of Visual Communication and Image Representation, ISSN 1047-3203, E-ISSN 1095-9076, Vol. 11, pp. 115-129. Journal article (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually, and to use the qualitative feature relations for resolving ambiguous matches and for introducing feature hypotheses whenever image features are mismatched or lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semi-rigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 10.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Qualitative multiscale feature hierarchies for object tracking. 2000. Report (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a hierarchical view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually and to use the qualitative feature relations for avoiding mismatches, for resolving ambiguous matches, and for introducing feature hypotheses whenever image features are lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semirigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 11.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Qualitative multi-scale feature hierarchies for object tracking. 1999. In: Proc Scale-Space Theories in Computer Vision Med, Elsevier, 1999, pp. 117-128. Conference paper (Refereed)
    Abstract [en]

    This paper shows how the performance of feature trackers can be improved by building a view-based object representation consisting of qualitative relations between image structures at different scales. The idea is to track all image features individually, and to use the qualitative feature relations for resolving ambiguous matches and for introducing feature hypotheses whenever image features are mismatched or lost. Compared to more traditional work on view-based object tracking, this methodology has the ability to handle semi-rigid objects and partial occlusions. Compared to trackers based on three-dimensional object models, this approach is much simpler and of a more generic nature. A hands-on example is presented showing how an integrated application system can be constructed from conceptually very simple operations.

  • 12.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Structure and Motion Estimation using Sparse Point and Line Correspondences in Multiple Affine Views. 1999. Report (Other academic)
    Abstract [en]

    This paper addresses the problem of computing three-dimensional structure and motion from an unknown rigid configuration of points and lines viewed by an affine projection model. An algebraic structure, analogous to the trilinear tensor for three perspective cameras, is defined for configurations of three centered affine cameras. This centered affine trifocal tensor contains 12 non-zero coefficients and involves linear relations between point correspondences and trilinear relations between line correspondences. It is shown how the affine trifocal tensor relates to the perspective trilinear tensor, and how three-dimensional motion can be computed from this tensor in a straightforward manner. A factorization approach is developed to handle point features and line features simultaneously in image sequences, and degenerate feature configurations are analysed. This theory is applied to a specific problem in human-computer interaction of capturing three-dimensional rotations from gestures of a human hand. This application to quantitative gesture analyses illustrates the usefulness of the affine trifocal tensor in a situation where sufficient information is not available to compute the perspective trilinear tensor, while the geometry requires point correspondences as well as line correspondences over at least three views.

  • 13.
    Bretzner, Lars
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Use your hand as a 3-D mouse or relative orientation from extended sequences of sparse point and line correspondences using the affine trifocal tensor. 1998. In: Computer Vision — ECCV'98: 5th European Conference on Computer Vision Freiburg, Germany, June, 2–6, 1998 Proceedings, Volume I, Springer Berlin/Heidelberg, 1998, Vol. 1406, pp. 141-157. Conference paper (Refereed)
    Abstract [en]

    This paper addresses the problem of computing three-dimensional structure and motion from an unknown rigid configuration of points and lines viewed by an affine projection model. An algebraic structure, analogous to the trilinear tensor for three perspective cameras, is defined for configurations of three centered affine cameras. This centered affine trifocal tensor contains 12 coefficients and involves linear relations between point correspondences and trilinear relations between line correspondences. It is shown how the affine trifocal tensor relates to the perspective trilinear tensor, and how three-dimensional motion can be computed from this tensor in a straightforward manner. A factorization approach is also developed to handle point features and line features simultaneously in image sequences.

    This theory is applied to a specific problem of human-computer interaction of capturing three-dimensional rotations from gestures of a human hand. A qualitative model is presented, in which three fingers are represented by their position and orientation, and it is shown how three point correspondences (blobs at the finger tips) and three line correspondences (ridge features at the fingers) allow the affine trifocal tensor to be determined, from which the rotation is computed. Besides the obvious application, this test problem illustrates the usefulness of the affine trifocal tensor in a situation where sufficient information is not available to compute the perspective trilinear tensor, while the geometry requires point correspondences as well as line correspondences over at least three views.

  • 14.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    On Scale and Resolution in the Analysis of Local Image Structure. 1990. In: Proc. 1st European Conf. on Computer Vision, 1990, Vol. 427, pp. 3-12. Conference paper (Refereed)
    Abstract [en]

    Focus-of-attention is extremely important in human visual perception. If computer vision systems are to perform tasks in a complex, dynamic world they will have to be able to control processing in a way that is analogous to visual attention in humans.

    In this paper we will investigate problems in connection with foveation, that is examining selected regions of the world at high resolution. We will especially consider the problem of finding and classifying junctions from this aspect. We will show that foveation as simulated by controlled, active zooming in conjunction with scale-space techniques allows robust detection and classification of junctions.

  • 15.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Scale and Resolution in Active Analysis of Local Image Structure. 1990. In: Image and Vision Computing, Vol. 8, pp. 289-296. Journal article (Refereed)
    Abstract [en]

    Focus-of-attention is extremely important in human visual perception. If computer vision systems are to perform tasks in a complex, dynamic world they will have to be able to control processing in a way that is analogous to visual attention in humans. Problems connected to foveation (examination of selected regions of the world at high resolution) are examined. In particular, the problem of finding and classifying junctions from this aspect is considered. It is shown that foveation as simulated by controlled, active zooming in conjunction with scale-space techniques allows for robust detection and classification of junctions.

  • 16.
    Brunnström, Kjell
    et al.
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Eklundh, Jan-Olof
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Active detection and classification of junctions by foveation with a head-eye system guided by the scale-space primal sketch. 1992. In: Computer Vision — ECCV'92: Second European Conference on Computer Vision Santa Margherita Ligure, Italy, May 19–22, 1992 Proceedings / [ed] Giulio Sandini, Springer Berlin/Heidelberg, 1992, pp. 701-709. Conference paper (Refereed)
    Abstract [en]

    We consider how junction detection and classification can be performed in an active visual system. This is to exemplify that feature detection and classification in general can be done by both simple and robust methods, if the vision system is allowed to look at the world rather than at prerecorded images. We address issues on how to attract the attention to salient local image structures, as well as on how to characterize those.

  • 17.
    Ekeberg, Örjan
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Fransén, Erik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Hellgren Kotaleski, Jeanette
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Herman, Pawel
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Kumar, Arvind
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lansner, Anders
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Computational Brain Science at CST, CSC, KTH. 2016. Other (Other academic)
    Abstract [en]

    Mission and Vision - Computational Brain Science Lab at CST, CSC, KTH

    The scientific mission of the Computational Brain Science Lab at CSC is to be at the forefront of mathematical modelling, quantitative analysis and mechanistic understanding of brain function. We perform research on (i) computational modelling of biological brain function and on (ii) developing theory, algorithms and software for building computer systems that can perform brain-like functions. Our research answers scientific questions and develops methods in these fields. We integrate results from our science-driven brain research into our work on brain-like algorithms and likewise use theoretical results about artificial brain-like functions as hypotheses for biological brain research.

    Our research on biological brain function includes sensory perception (vision, hearing, olfaction, pain), cognition (action selection, memory, learning) and motor control at different levels of biological detail (molecular, cellular, network) and mathematical/functional description. Methods development for investigating biological brain function and its dynamics as well as dysfunction comprises biomechanical simulation engines for locomotion and voice, machine learning methods for analysing functional brain images, craniofacial morphology and neuronal multi-scale simulations. Projects are conducted in close collaborations with Karolinska Institutet and Karolinska Hospital in Sweden as well as other laboratories in Europe, U.S., Japan and India.

    Our research on brain-like computing concerns methods development for perceptual systems that extract information from sensory signals (images, video and audio), analysis of functional brain images and EEG data, learning for autonomous agents as well as development of computational architectures (both software and hardware) for neural information processing. Our brain-inspired approach to computing also applies more generically to other computer science problems such as pattern recognition, data analysis and intelligent systems. Recent industrial collaborations include analysis of patient brain data with MentisCura and the startup company 13 Lab bought by Facebook.

    Our long term vision is to contribute to (i) deeper understanding of the computational mechanisms underlying biological brain function and (ii) better theories, methods and algorithms for perceptual and intelligent systems that perform artificial brain-like functions by (iii) performing interdisciplinary and cross-fertilizing research on both biological and artificial brain-like functions. 

    On one hand, biological brains provide existence proofs for guiding our research on artificial perceptual and intelligent systems. On the other hand, applying Richard Feynman’s famous statement “What I cannot create I do not understand” to brain science implies that we can only claim to fully understand the computational mechanisms underlying biological brain function if we can build and implement corresponding computational mechanisms on a computerized system that performs similar brain-like functions.

  • 18.
    Friberg, Anders
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Lindeberg, Tony
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Hellwagner, Martin
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Helgason, Pétur
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Salomão, Gláucia Laís
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Elovsson, Anders
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Lemaitre, Guillaume
    Institute for Research and Coordination in Acoustics and Music, Paris, France.
    Ternström, Sten
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Tal, musik och hörsel, TMH.
    Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. 2018. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no. 3, pp. 1467-1483. Journal article (Refereed)
    Abstract [en]

    Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories, phonation, supraglottal myoelastic vibrations, and turbulence from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.

  • 19.
    Gårding, Jonas
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    CanApp: The Candela Application Library. 1989. Report (Other academic)
    Abstract [en]

    This paper describes CanApp, the Candela Application Library. CanApp is a software package for image processing and image analysis. Most of the subroutines in CanApp are available both as stand-alone programs and C subroutines.

    CanApp currently comprises some 50 programs and 75 subroutines, and these numbers are expected to grow continuously as a result of joint efforts of the members of the CVAP group at the Royal Institute of Technology in Stockholm.

    CanApp is currently installed and running under UNIX on Sun workstations.

  • 20.
    Gårding, Jonas
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct computation of shape cues using scale-adapted spatial derivative operators. 1996. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 17, no. 2, pp. 163-191. Journal article (Refereed)
    Abstract [en]

    This paper addresses the problem of computing cues to the three-dimensional structure of surfaces in the world directly from the local structure of the brightness pattern of either a single monocular image or a binocular image pair. It is shown that starting from Gaussian derivatives of order up to two at a range of scales in scale-space, local estimates of (i) surface orientation from monocular texture foreshortening, (ii) surface orientation from monocular texture gradients, and (iii) surface orientation from the binocular disparity gradient can be computed without iteration or search, and by using essentially the same basic mechanism. The methodology is based on a multi-scale descriptor of image structure called the windowed second moment matrix, which is computed with adaptive selection of both scale levels and spatial positions. Notably, this descriptor comprises two scale parameters: a local scale parameter describing the amount of smoothing used in derivative computations, and an integration scale parameter determining over how large a region in space the statistics of regional descriptors is accumulated. Experimental results for both synthetic and natural images are presented, and the relation with models of biological vision is briefly discussed.
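
    As a rough illustration of the two scale parameters described in this abstract, the following sketch computes a Gaussian-windowed second moment matrix at one image point. It is an assumption-laden toy and not the paper's method: scales are fixed by hand rather than selected adaptively, derivatives are central differences of a Gaussian-smoothed image, and the names `smooth`, `second_moment_matrix` and `dominant_orientation` are hypothetical.

```python
import math

def gaussian_kernel(t):
    """Sampled, normalised 1-D Gaussian with variance t, truncated at about 4 sigma."""
    radius = int(4 * math.sqrt(t)) + 1
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rows(img, kernel):
    """Convolve every row with the kernel (image treated as zero outside)."""
    r = len(kernel) // 2
    width = len(img[0])
    return [[sum(w * row[x + k - r] for k, w in enumerate(kernel)
                 if 0 <= x + k - r < width)
             for x in range(width)] for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def smooth(img, t):
    """Separable Gaussian smoothing with variance t (the local scale)."""
    kernel = gaussian_kernel(t)
    return transpose(smooth_rows(transpose(smooth_rows(img, kernel)), kernel))

def second_moment_matrix(img, cx, cy, local_t, integration_t):
    """Entries (m11, m12, m22) of the gradient outer product, Gaussian-weighted
    around (cx, cy); the window variance is the integration scale.
    The window is left unnormalised, which does not affect orientation."""
    smoothed = smooth(img, local_t)
    m11 = m12 = m22 = 0.0
    for y in range(1, len(smoothed) - 1):
        for x in range(1, len(smoothed[0]) - 1):
            lx = 0.5 * (smoothed[y][x + 1] - smoothed[y][x - 1])
            ly = 0.5 * (smoothed[y + 1][x] - smoothed[y - 1][x])
            w = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * integration_t))
            m11 += w * lx * lx
            m12 += w * lx * ly
            m22 += w * ly * ly
    return m11, m12, m22

def dominant_orientation(m11, m12, m22):
    """Angle (radians) of the eigenvector with the largest eigenvalue."""
    return 0.5 * math.atan2(2.0 * m12, m11 - m22)

# Assumed test image: a sinusoidal pattern varying along a 30 degree direction.
theta = math.radians(30.0)
omega = 0.5
size = 41
image = [[math.sin(omega * (x * math.cos(theta) + y * math.sin(theta)))
          for x in range(size)] for y in range(size)]
m11, m12, m22 = second_moment_matrix(image, 20, 20, local_t=1.0, integration_t=16.0)
estimated_degrees = math.degrees(dominant_orientation(m11, m12, m22))
```

    On this synthetic pattern the dominant orientation comes out close to the 30 degree direction along which the brightness varies. The small deviation stems from the finite-difference derivatives, not from the descriptor itself.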

  • 21.
    Gårding, Jonas
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct estimation of local surface shape in a fixating binocular vision system. 1994. In: Computer Vision - ECCV '94: Third European Conference on Computer Vision, Stockholm, Sweden, May 2–6, 1994, Proceedings, Volume I, Springer Berlin/Heidelberg, 1994, pp. 365-376. Conference paper (Peer-reviewed)
    Abstract [en]

    This paper addresses the problem of computing cues to the three-dimensional structure of surfaces in the world directly from the local structure of the brightness pattern of a binocular image pair. The geometric information content of the gradient of binocular disparity is analyzed for the general case of a fixating vision system with symmetric or asymmetric vergence, and with either known or unknown viewing geometry. A computationally inexpensive technique which exploits this analysis is proposed. This technique allows a local estimate of surface orientation to be computed directly from the local statistics of the left and right image brightness gradients, without iterations or search. The viability of the approach is demonstrated with experimental results for both synthetic and natural gray-level images.

  • 22.
    Jansson, Ylva
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Lindeberg, Tony
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Beräkningsvetenskap och beräkningsteknik (CST).
    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields. 2018. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 60, no. 9, pp. 1369-1398. Journal article (Peer-reviewed)
    Abstract [en]

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatiotemporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.

  • 23. Laptev, I.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    A distance measure and a feature likelihood map concept for scale-invariant model matching. 2003. Report (Peer-reviewed)
    Abstract [en]

    This paper presents two approaches for evaluating multi-scale feature-based object models. Within the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, the maximisation of the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation. The idea of the second approach is to avoid an explicit feature extraction step and to evaluate models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, which is a function normalised to the interval [0, 1], and that approximates the likelihood of image features at all points in scale-space. To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for performing simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by pyramid implementations of the proposed concepts.

  • 24.
    Laptev, Ivan
    IRISA/INRIA.
    Caputo, Barbara
    Schüldt, Christian
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Local velocity-adapted motion events for spatio-temporal recognition. 2007. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 108, no. 3, pp. 207-229. Journal article (Peer-reviewed)
    Abstract [en]

    In this paper, we address the problem of motion recognition using event-based local motion representations. We assume that similar patterns of motion contain similar events with consistent motion across image sequences. Using this assumption, we formulate the problem of motion recognition as a matching of corresponding events in image sequences. To enable the matching, we present and evaluate a set of motion descriptors that exploit the spatial and the temporal coherence of motion measurements between corresponding events in image sequences. As the motion measurements may depend on the relative motion of the camera, we also present a mechanism for local velocity adaptation of events and evaluate its influence when recognizing image sequences subjected to different camera motions. When recognizing motion patterns, we compare the performance of a nearest neighbor (NN) classifier with the performance of a support vector machine (SVM). We also compare event-based motion representations to motion representations in terms of global histograms. A systematic experimental evaluation on a large video database with human actions demonstrates that (i) local spatio-temporal image descriptors can be defined to carry important information of space-time events for subsequent recognition, and that (ii) local velocity adaptation is an important mechanism in situations when the relative motion between the camera and the interesting events in the scene is unknown. The particular advantage of event-based representations and velocity adaptation is further emphasized when recognizing human actions in unconstrained scenes with complex and non-stationary backgrounds.

  • 25.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    A Distance Measure and a Feature Likelihood Map Concept for Scale-Invariant Model Matching. 2003. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 52, no. 2, pp. 97-120. Journal article (Peer-reviewed)
    Abstract [en]

    This paper presents two approaches for evaluating multi-scale feature-based object models. Within the first approach, a scale-invariant distance measure is proposed for comparing two image representations in terms of multi-scale features. Based on this measure, the maximisation of the likelihood of parameterised feature models allows for simultaneous model selection and parameter estimation.

    The idea of the second approach is to avoid an explicit feature extraction step and to evaluate models using a function defined directly from the image data. For this purpose, we propose the concept of a feature likelihood map, which is a function normalised to the interval [0, 1], and that approximates the likelihood of image features at all points in scale-space.

    To illustrate the applicability of both methods, we consider the area of hand gesture analysis and show how the proposed evaluation schemes can be integrated within a particle filtering approach for performing simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by pyramid implementations of the proposed concepts.

  • 26.
    Laptev, Ivan
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    A multi-scale feature likelihood map for direct evaluation of object hypotheses. 2001. In: Proc Scale-Space and Morphology in Computer Vision, Springer Berlin/Heidelberg, 2001, Vol. 2106, pp. 98-110. Conference paper (Peer-reviewed)
    Abstract [en]

    This paper develops and investigates a new approach for evaluating feature based object hypotheses in a direct way. The idea is to compute a feature likelihood map (FLM), which is a function normalized to the interval [0, 1], and which approximates the likelihood of image features at all points in scale-space. In our case, the FLM is defined from Gaussian derivative operators and in such a way that it assumes its strongest responses near the centers of symmetric blob-like or elongated ridge-like structures and at scales that reflect the size of these structures in the image domain. While the FLM inherits several advantages of feature based image representations, it also (i) avoids the need for explicit search when matching features in object models to image data, and (ii) eliminates the need for thresholds present in most traditional feature based approaches. In an application presented in this paper, the FLM is applied to simultaneous tracking and recognition of hand models based on particle filtering. The experiments demonstrate the feasibility of the approach, and that real time performance can be obtained by a pyramid implementation of the proposed concept.

  • 27.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Interest point detection and scale selection in space-time. 2003. In: Scale Space Methods in Computer Vision: 4th International Conference, Scale Space 2003, Isle of Skye, UK, June 10–12, 2003, Proceedings, Springer Berlin/Heidelberg, 2003, Vol. 2695, pp. 372-387. Conference paper (Peer-reviewed)
    Abstract [en]

    Several types of interest point detectors have been proposed for spatial images. This paper investigates how this notion can be generalised to the detection of interesting events in space-time data. Moreover, we develop a mechanism for spatio-temporal scale selection and detect events at scales corresponding to their extent in both space and time. To detect spatio-temporal events, we build on the idea of the Harris and Förstner interest point operators and detect regions in space-time where the image structures have significant local variations in both space and time. In this way, events that correspond to curved space-time structures are emphasised, while structures with locally constant motion are disregarded. To construct this operator, we start from a multi-scale windowed second moment matrix in space-time, and combine the determinant and the trace in a similar way as for the spatial Harris operator. All space-time maxima of this operator are then adapted to characteristic scales by maximising a scale-normalised space-time Laplacian operator over both spatial scales and temporal scales. The motivation for performing temporal scale selection as a complement to previous approaches of spatial scale selection is to be able to robustly capture spatio-temporal events of different temporal extent. It is shown that the resulting approach is truly scale invariant with respect to both spatial scales and temporal scales. The proposed concept is tested on synthetic and real image sequences. It is shown that the operator responds to distinct and stable points in space-time that often correspond to interesting events. The potential applications of the method are discussed.

  • 28.
    Laptev, Ivan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Local descriptors for spatio-temporal recognition. 2006. In: Spatial Coherence For Visual Motion Analysis: First International Workshop, SCVMA 2004, Prague, Czech Republic, May 15, 2004. Revised Papers / [ed] MacLean, WJ, Springer Berlin/Heidelberg, 2006, Vol. 3667, pp. 91-103. Conference paper (Peer-reviewed)
    Abstract [en]

    This paper presents and investigates a set of local space-time descriptors for representing and recognizing motion patterns in video. Following the idea of local features in the spatial domain, we use the notion of space-time interest points and represent video data in terms of local space-time events. To describe such events, we define several types of image descriptors over local spatio-temporal neighborhoods and evaluate these descriptors in the context of recognizing human activities. In particular, we compare motion representations in terms of spatio-temporal jets, position dependent histograms, position independent histograms, and principal component analysis computed for either spatio-temporal gradients or optic flow. An experimental evaluation on a video database with human actions shows that high classification performance can be achieved, and that there is a clear advantage of using local position dependent histograms, consistent with previously reported findings regarding spatial recognition.

  • 29.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    On Space-Time Interest Points. 2003. Report (Other academic)
    Abstract [en]

    Local image features or interest points provide compact and abstract representations of patterns in an image. In this paper, we extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features capture interesting events in video and can be used for a compact representation and for interpretation of video data.

    To detect spatio-temporal events, we build on the idea of the Harris and Förstner interest point operators and detect local structures in space-time where the image values have significant local variations in both space and time. We estimate the spatio-temporal extents of the detected events by maximizing a normalized spatio-temporal Laplacian operator over spatial and temporal scales. To represent the detected events, we then compute local, spatio-temporal, scale-invariant N-jets and classify each event with respect to its jet descriptor. For the problem of human motion analysis, we illustrate how video representation in terms of local space-time features allows for detection of walking people in scenes with occlusions and dynamic cluttered backgrounds.

  • 30.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Space-time interest points. 2003. In: Proceedings of Ninth IEEE International Conference on Computer Vision, 2003: ICCV'03, IEEE conference proceedings, 2003, pp. 432-439. Conference paper (Peer-reviewed)
    Abstract [en]

    Local image features or interest points provide compact and abstract representations of patterns in an image. We propose to extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features often reflect interesting events that can be used for a compact representation of video data as well as for its interpretation. To detect spatio-temporal events, we build on the idea of the Harris and Förstner interest point operators and detect local structures in space-time where the image values have significant local variations in both space and time. We then estimate the spatio-temporal extents of the detected events and compute their scale-invariant spatio-temporal descriptors. Using such descriptors, we classify events and construct a video representation in terms of labeled space-time points. For the problem of human motion analysis, we illustrate how the proposed method allows for detection of walking people in scenes with occlusions and dynamic backgrounds.

  • 31.
    Laptev, Ivan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features. 2001. Report (Peer-reviewed)
    Abstract [en]

    This paper presents an approach for simultaneous tracking and recognition of hierarchical object representations in terms of multiscale image features. A scale-invariant dissimilarity measure is proposed for comparing scale-space features at different positions and scales. Based on this measure, the likelihood of hierarchical, parameterized models can be evaluated in such a way that maximization of the measure over different models and their parameters allows for both model selection and parameter estimation. Then, within the framework of particle filtering, we consider the area of hand gesture analysis, and present a method for simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. In this way, qualitative hand states and quantitative hand motions can be captured, and be used for controlling different types of computerised equipment.

  • 32.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Tracking of multi-state hand models using particle filtering and a hierarchy of multi-scale image features. 2001. In: Scale-Space and Morphology in Computer Vision: Third International Conference, Scale-Space 2001, Vancouver, Canada, July 7–8, 2001, Proceedings, Springer Berlin/Heidelberg, 2001, Vol. 2106, pp. 63-74. Conference paper (Peer-reviewed)
    Abstract [en]

    This paper presents an approach for simultaneous tracking and recognition of hierarchical object representations in terms of multiscale image features. A scale-invariant dissimilarity measure is proposed for comparing scale-space features at different positions and scales. Based on this measure, the likelihood of hierarchical, parameterized models can be evaluated in such a way that maximization of the measure over different models and their parameters allows for both model selection and parameter estimation. Then, within the framework of particle filtering, we consider the area of hand gesture analysis, and present a method for simultaneous tracking and recognition of hand models under variations in the position, orientation, size and posture of the hand. In this way, qualitative hand states and quantitative hand motions can be captured, and be used for controlling different types of computerised equipment.

  • 33.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity adaptation of space-time interest points. 2004. In: Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004 / [ed] Kittler, J; Petrou, M; Nixon, M, IEEE conference proceedings, 2004, pp. 52-56. Conference paper (Peer-reviewed)
    Abstract [en]

    The notion of local features in space-time has recently been proposed to capture and describe local events in video. When computing space-time descriptors, however, the result may strongly depend on the relative motion between the object and the camera. To compensate for this variation, we present a method that automatically adapts the features to the local velocity of the image pattern and, hence, results in a video representation that is stable with respect to different amounts of camera motion. Experimentally, we show that the use of velocity adaptation substantially increases the repeatability of interest points as well as the stability of their associated descriptors. Moreover, for an application to human action recognition, we demonstrate how velocity-adapted features enable recognition of human actions in situations with unknown camera motion and complex, nonstationary backgrounds.

  • 34.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity adaptation of spatio-temporal receptive fields for direct recognition of activities: an experimental study. 2004. In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 22, no. 2, pp. 105-116. Journal article (Peer-reviewed)
    Abstract [en]

    This article presents an experimental study of the influence of velocity adaptation when recognizing spatio-temporal patterns using a histogram-based statistical framework. The basic idea consists of adapting the shapes of the filter kernels to the local direction of motion, so as to allow the computation of image descriptors that are invariant to the relative motion in the image plane between the camera and the objects or events that are studied. Based on a framework of recursive spatio-temporal scale-space, we first outline how a straightforward mechanism for local velocity adaptation can be expressed. Then, for a test problem of recognizing activities, we present an experimental evaluation, which shows the advantages of using velocity-adapted spatio-temporal receptive fields, compared to directional derivatives or regular partial derivatives for which the filter kernels have not been adapted to the local image motion.

  • 35.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Velocity-adapted spatio-temporal receptive fields for direct recognition of activities. 2002. In: Proc. ECCV’02 Workshop on Statistical Methods in Video Processing, 2002, pp. 61-66. Conference paper (Peer-reviewed)
    Abstract [en]

    This article presents an experimental study of the influence of velocity adaptation when recognizing spatio-temporal patterns using a histogram-based statistical framework. The basic idea consists of adapting the shapes of the filter kernels to the local direction of motion, so as to allow the computation of image descriptors that are invariant to the relative motion in the image plane between the camera and the objects or events that are studied. Based on a framework of recursive spatio-temporal scale-space, we first outline how a straightforward mechanism for local velocity adaptation can be expressed. Then, for a test problem of recognizing activities, we present an experimental evaluation, which shows the advantages of using velocity-adapted spatio-temporal receptive fields, compared to directional derivatives or regular partial derivatives for which the filter kernels have not been adapted to the local image motion.

  • 36.
    Laptev, Ivan
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Mayer, H.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Eckstein, W.
    Steger, C.
    Baumgartner, A.
    Automatic extraction of roads from aerial images based on scale space and snakes. 2000. In: Machine Vision and Applications, ISSN 0932-8092, E-ISSN 1432-1769, Vol. 12, no. 1, pp. 23-31. Journal article (Peer-reviewed)
    Abstract [en]

    We propose a new approach for automatic road extraction from aerial imagery with a model and a strategy mainly based on the multi-scale detection of roads in combination with geometry-constrained edge extraction using snakes. A main advantage of our approach is that it allows, for the first time, a bridging of shadows and partially occluded areas using the heavily disturbed evidence in the image. Additionally, it has only a few parameters to be adjusted. The road network is constructed after extracting crossings with varying shape and topology. We show the feasibility of the approach not only by presenting reasonable results but also by evaluating them quantitatively based on ground truth.

  • 37.
    Linde, Oskar
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Composed Complex-Cue Histograms: An Investigation of the Information Content in Receptive Field Based Image Descriptors for Object Recognition. 2012. In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 116, no. 4, pp. 538-560. Journal article (Peer-reviewed)
    Abstract [en]

    Recent work has shown that effective methods for recognizing objects and spatio-temporal events can be constructed based on histograms of receptive field like image operations.

    This paper presents the results of an extensive study of the performance of different types of receptive field like image descriptors for histogram-based object recognition, based on different combinations of image cues in terms of Gaussian derivatives or differential invariants applied to either intensity information, colour-opponent channels or both. A rich set of composed complex-cue image descriptors is introduced and evaluated with respect to the problems of (i) recognizing previously seen object instances from previously unseen views, and (ii) classifying previously unseen objects into visual categories.

    It is shown that there exist novel histogram descriptors with significantly better recognition performance compared to previously used histogram features within the same class. Specifically, the experiments show that it is possible to obtain more discriminative features by combining lower-dimensional scale-space features into composed complex-cue histograms. Furthermore, different types of image descriptors have different relative advantages with respect to the problems of object instance recognition vs. object category classification. These conclusions are obtained from extensive experimental evaluations on two mutually independent data sets.

    For the task of recognizing specific object instances, combined histograms of spatial and spatio-chromatic derivatives are highly discriminative, and several image descriptors in terms of rotationally invariant (intensity and spatio-chromatic) differential invariants up to order two lead to very high recognition rates.

    For the task of category classification, primary information is contained in both first- and second-order derivatives, where second-order partial derivatives constitute the most discriminative cue.

    Dimensionality reduction by principal component analysis and variance normalization prior to training and recognition can in many cases lead to a significant increase in recognition or classification performance. Surprisingly high recognition rates can even be obtained with binary histograms that reveal the polarity of local scale-space features, and which can be expected to be particularly robust to illumination variations.

    An overall conclusion from this study is that compared to previously used lower-dimensional histograms, the use of composed complex-cue histograms of higher dimensionality reveals the co-variation of multiple cues and enables much better recognition performance, both with regard to the problems of recognizing previously seen objects from novel views and for classifying previously unseen objects into visual categories.

  • 38.
    Linde, Oskar
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Lindeberg, Tony
    KTH, Tidigare Institutioner (före 2005), Numerisk analys och datalogi, NADA.
    Object recognition using composed receptive field histograms of higher dimensionality. 2004. In: Proceedings of the 17th International Conference on Pattern Recognition / [ed] Kittler, J; Petrou, M; Nixon, M, IEEE conference proceedings, 2004, pp. 1-6. Conference paper (Peer-reviewed)
    Abstract [en]

    Recent work has shown that effective methods for recognising objects or spatio-temporal events can be constructed based on receptive field responses summarised into histograms or other histogram-like image descriptors. This paper presents a set of composed histogram features of higher dimensionality, which give significantly better recognition performance compared to the histogram descriptors of lower dimensionality that were used in the original papers by Swain & Ballard (1991) or Schiele & Crowley (2000). The use of histograms of higher dimensionality is made possible by a sparse representation for efficient computation and handling of higher-dimensional histograms. Results of extensive experiments are reported, showing how the performance of histogram-based recognition schemes depends upon different combinations of cues, in terms of Gaussian derivatives or differential invariants applied to either intensity information, chromatic information or both. It is shown that there exist composed higher-dimensional histogram descriptors with much better performance for recognising known objects than previously used histogram features. Experiments are also reported of classifying unknown objects into visual categories.

  • 39.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A computational theory of visual receptive fields. 2013. In: Biological Cybernetics, ISSN 0340-1200, E-ISSN 1432-0770, Vol. 107, no. 6, pp. 589-635. Journal article (Peer-reviewed)
    Abstract [en]

    A receptive field constitutes a region in the visual field where a visual cell or a visual operator responds to visual stimuli. This paper presents a theory for what types of receptive field profiles can be regarded as natural for an idealized vision system, given a set of structural requirements on the first stages of visual processing that reflect symmetry properties of the surrounding world.

    These symmetry properties include (i) covariance properties under scale changes, affine image deformations, and Galilean transformations of space–time as occur for real-world image data as well as specific requirements of (ii) temporal causality implying that the future cannot be accessed and (iii) a time-recursive updating mechanism of a limited temporal buffer of the past as is necessary for a genuine real-time system. Fundamental structural requirements are also imposed to ensure (iv) mutual consistency and a proper handling of internal representations at different spatial and temporal scales.

    It is shown how a set of families of idealized receptive field profiles can be derived by necessity regarding spatial, spatio-chromatic, and spatio-temporal receptive fields in terms of Gaussian kernels, Gaussian derivatives, or closely related operators. Such image filters have been successfully used as a basis for expressing a large number of visual operations in computer vision, regarding feature detection, feature classification, motion estimation, object recognition, spatio-temporal recognition, and shape estimation. Hence, the associated so-called scale-space theory constitutes a framework for expressing visual operations that is both theoretically well-founded and general.

    There are very close similarities between receptive field profiles predicted from this scale-space theory and receptive field profiles found by cell recordings in biological vision. Among the family of receptive field profiles derived by necessity from the assumptions, idealized models with very good qualitative agreement are obtained for (i) spatial on-center/off-surround and off-center/on-surround receptive fields in the fovea and the LGN, (ii) simple cells with spatial directional preference in V1, (iii) spatio-chromatic double-opponent neurons in V1, (iv) space–time separable spatio-temporal receptive fields in the LGN and V1, and (v) non-separable space–time tilted receptive fields in V1, all within the same unified theory. In addition, the paper presents a more general framework for relating and interpreting these receptive fields conceptually and possibly predicting new receptive field profiles as well as for pre-wiring covariance under scaling, affine, and Galilean transformations into the representations of visual stimuli.

    This paper describes the basic structure of the necessity results concerning receptive field profiles regarding the mathematical foundation of the theory and outlines how the proposed theory could be used in further studies and modelling of biological vision. It is also shown how receptive field responses can be interpreted physically, as the superposition of relative variations of surface structure and illumination variations, given a logarithmic brightness scale, and how receptive field measurements will be invariant under multiplicative illumination variations and exposure control mechanisms.
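
    The invariance under multiplicative illumination variations can be illustrated with a minimal sketch. It assumes a hypothetical 1-D brightness profile and uses a plain central difference as a stand-in for the Gaussian derivative receptive fields of the paper; the names `log_derivative`, `brightness` and `rescaled` are illustrative only. Because log(c·f) = log(c) + log(f), any derivative-like operator applied to the logarithmic brightness discards the constant factor c.

```python
import math

def log_derivative(signal):
    """Central differences of the log-brightness signal (interior samples only)."""
    log_signal = [math.log(v) for v in signal]
    return [(log_signal[i + 1] - log_signal[i - 1]) / 2.0
            for i in range(1, len(log_signal) - 1)]

# Hypothetical 1-D brightness profile (strictly positive intensities).
brightness = [10.0, 12.0, 18.0, 30.0, 31.0, 29.0, 15.0]
# The same profile under 3.5 times stronger illumination (or longer exposure).
rescaled = [3.5 * v for v in brightness]

response_a = log_derivative(brightness)
response_b = log_derivative(rescaled)

# Largest deviation between the two responses; zero up to rounding error.
max_difference = max(abs(a - b) for a, b in zip(response_a, response_b))
print(max_difference)
```

    The same argument applies to any linear operator whose coefficients sum to zero, which is why the sketch does not depend on the particular difference scheme chosen here.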

  • 40.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A framework for invariant visual operations based on receptive field responses. 2013. In: SSVM 2013: Fourth International Conference on Scale Space and Variational Methods in Computer Vision, June 2-6, Schloss Seggau, Graz region, Austria: Invited keynote address / [ed] Arjan Kuijper, 2013. Conference paper (Other academic)
    Abstract [en]

    The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This talk presents a unified theory for achieving basic invariance properties of visual operations already at the level of receptive fields.

    This generalized framework for invariant receptive field responses comprises:

    • local scaling transformations caused by objects of different size and at different distances to the observer,
    • locally linearized image deformations caused by variations in the viewing direction in relation to the object,
    • locally linearized relative motions between the object and the observer and
    • local multiplicative intensity transformations caused by illumination variations.

    The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 can be seen as close to ideal with respect to what is motivated by the idealized requirements.

    By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination.

    The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment.

    References:

    • T. Lindeberg (2011) "Generalized Gaussian scale-space axiomatics comprising linear scale-space, affine scale-space and spatio-temporal scale-space". Journal of Mathematical Imaging and Vision, volume 40, number 1, pages 36-81, May 2011.
    • T. Lindeberg (2013) "Invariance of visual operations at the level of receptive fields". PLoS ONE 8(7): e66990, doi:10.1371/journal.pone.0066990, preprint available from arXiv:1210.0754.
    • T. Lindeberg (2013) "Generalized axiomatic scale-space theory", Advances in Imaging and Electron Physics, (P. Hawkes, ed.), Elsevier, volume 178, pages 1-96, Academic Press: Elsevier Inc., doi: 10.1016/B978-0-12-407701-0.00001-7
  • 41.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    A scale selection principle for estimating image deformations. 1998. In: Image and Vision Computing, ISSN 0262-8856, E-ISSN 1872-8138, Vol. 16, pp. 961-977. Journal article (Peer-reviewed)
    Abstract [en]

    A basic functionality of a vision system concerns the ability to compute deformation fields between different images of the same physical structure. This article advocates the need for incorporating explicit mechanisms for scale selection in this context, in algorithms for computing descriptors such as optic flow and for performing stereo matching. A basic reason why such a mechanism is essential is the fact that in a coarse-to-fine propagation of disparity or flow information, it is not necessarily the case that the most accurate estimates are obtained at the finest scales. The existence of interfering structures at fine scales may make it impossible to accurately match the image data at fine scales. The approach taken here is to select deformation estimates from the scales that minimize the (suitably normalized) uncertainty over scales. A specific implementation of this idea is presented for a region-based differential flow estimation scheme. It is shown that the integrated scale selection and flow estimation algorithm has the qualitative properties of leading to the selection of coarser scales for larger image structures and increasing noise levels, whereas it leads to the selection of finer scales in the neighbourhood of flow field discontinuities.

  • 42.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Automatic scale selection as a pre-processing stage for interpreting the visual world. 1999. In: Proc. Fundamental Structural Properties in Image and Pattern Analysis FSPIPA'99 (Budapest, Hungary), September 6-7, 1999, Österreichischen Computer Gesellschaft, 1999, Vol. 130, pp. 9-23. Conference paper (Peer-reviewed)
    Abstract [en]

    This paper reviews a systematic methodology for formulating mechanisms for automatic scale selection when performing feature detection in scale-space. An important property of the proposed approach is that the notion of scale is included already in the definition of image features.

  • 43.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Automatic Scale Selection as a Pre-Processing Stage to Interpreting Real-World Data. 1996. In: Proceedings Eighth IEEE International Conference on Tools with Artificial Intelligence (Toulouse, France): Invited keynote address, 1996, p. 490. Conference paper (Other academic)
    Abstract [en]

    We perceive objects in the world as meaningful entities only over certain ranges of scale. A simple example is the concept of a branch of a tree, which makes sense only at a scale from, say, a few centimeters to at most a few meters. It is meaningless to discuss the tree concept at the nanometer or kilometer level. At those scales, it is more relevant to talk about the molecules that form the leaves of the tree, and the forest in which the tree grows, respectively.

    The fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a so-called scale-space representation. Traditional scale-space theory building on this work, however, does not address the problem of how to select locally appropriate scales for further analysis.

    After a brief review of the main ideas behind a scale-space representation, I will in this talk describe a recently developed systematic methodology for generating hypotheses about interesting scale levels in image data, based on a general principle stating that local extrema over scales of different combinations of normalized derivatives are likely candidates to correspond to interesting image structures. Specifically, it will be shown how this idea can be used for formulating feature detectors which automatically adapt their local scales of processing to the local image structure.
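
    The principle of selecting scales at extrema over scale of normalized derivatives can be sketched for one idealized case. The example assumes a 2-D Gaussian blob of variance `t0` as input and uses the scale-normalized Laplacian t·(L_xx + L_yy) as the normalized derivative combination; both choices, the closed-form response, and the function names are assumptions made for illustration and are not taken from the abstract. For this input the response at the blob centre is available in closed form, so no image needs to be filtered.

```python
import math

def normalized_laplacian_at_center(t0, t):
    """Magnitude of the scale-normalized Laplacian t * |L_xx + L_yy| at the
    centre of a 2-D Gaussian blob of variance t0, observed at scale t.

    Smoothing a Gaussian of variance t0 with a Gaussian of variance t gives a
    Gaussian of variance t0 + t, whose Laplacian at the origin equals
    -1 / (pi * (t0 + t)**2).
    """
    return t / (math.pi * (t0 + t) ** 2)

def select_scale(t0, scales):
    """Return the candidate scale at which the normalized response is maximal."""
    return max(scales, key=lambda t: normalized_laplacian_at_center(t0, t))

# Geometric grid of candidate scales (variances); the spacing is arbitrary.
scales = [0.25 * 1.05 ** k for k in range(200)]

selected_small = select_scale(4.0, scales)   # blob of variance 4
selected_large = select_scale(16.0, scales)  # blob of variance 16
print(selected_small, selected_large)
```

    Up to the resolution of the scale grid, the selected scale follows the size of the blob, which is the adaptive behaviour the talk describes. Without the normalizing factor t, the response would decrease monotonically with scale and no interior maximum would exist.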

    Support for the proposed methodology will be presented in terms of a general study of the scale selection method under rescalings of the input data, as well as a more detailed analysis of how the scale selection method performs when integrated with various types of feature detection modules and then applied to characteristic image patterns. Moreover, it will be illustrated by a rich set of experiments how this scale selection approach applies to various types of feature detection problems in early vision.

    In many computer vision applications, the poor performance of the low-level vision modules constitutes a major bottleneck. It will be argued that the inclusion of mechanisms for automatic scale selection is essential if we are to construct vision systems to analyse complex unknown environments.

  • 44.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Corner detection. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Peer-reviewed)
  • 45.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention. 1993. In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 11, no. 3, pp. 283-318. Journal article (Peer-reviewed)
    Abstract [en]

    This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes. The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.

  • 46.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Direct estimation of affine image deformations using visual front-end operations with automatic scale selection. 1995. In: Proc. 5th International Conference on Computer Vision: ICCV'95 (Boston, MA), IEEE Computer Society, 1995, pp. 134-141. Conference paper (Peer-reviewed)
    Abstract [en]

    This article deals with the problem of estimating deformations of brightness patterns using visual front-end operations. Estimating such deformations constitutes an important subtask in several computer vision problems relating to image correspondence and shape estimation. The following subjects are treated: The problem of decomposing affine flow fields into simpler components is analysed in detail. A canonical parametrization is presented based on singular value decomposition, which naturally separates the rotationally invariant components of the flow field from the rotationally variant ones. A novel mechanism is presented for automatic selection of scale levels when estimating local affine deformations. This mechanism is expressed within a multi-scale framework where disparity estimates are computed in a hierarchical coarse-to-fine manner and corrected using iterative techniques. Then, deformation estimates are selected from the scales that minimize a certain normalized residual over scales. Finally, the descriptors so obtained serve as initial data for computing refined estimates of the local deformations.
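
    The idea of separating rotationally invariant from rotationally variant components of an affine flow field can be illustrated with a small sketch. Note the assumption: the code uses the classical split of the 2x2 flow matrix into divergence, curl and deformation, not the SVD-based canonical parametrization of the article, whose details are not given in the abstract. The function names and the sample matrix are hypothetical.

```python
import math

def decompose(a, b, c, d):
    """Split the 2x2 matrix [[a, b], [c, d]] of a first-order flow field into
    divergence, curl, and the magnitude and orientation of the deformation."""
    divergence = a + d
    curl = c - b
    shear_1, shear_2 = a - d, b + c
    magnitude = math.hypot(shear_1, shear_2)
    orientation = 0.5 * math.atan2(shear_2, shear_1)
    return divergence, curl, magnitude, orientation

def rotate_frame(a, b, c, d, theta):
    """Express the same flow matrix A in a frame rotated by theta,
    i.e. compute R A R^T for the rotation matrix R."""
    cs, sn = math.cos(theta), math.sin(theta)
    # M = R A
    m11, m12 = cs * a - sn * c, cs * b - sn * d
    m21, m22 = sn * a + cs * c, sn * b + cs * d
    # M R^T
    return (m11 * cs - m12 * sn, m11 * sn + m12 * cs,
            m21 * cs - m22 * sn, m21 * sn + m22 * cs)

flow = (0.3, -0.2, 0.5, 0.1)   # hypothetical affine flow matrix entries
theta = 0.7                    # rotation of the coordinate frame, in radians

original = decompose(*flow)
rotated = decompose(*rotate_frame(*flow, theta))
print(original)
print(rotated)
```

    In this split, divergence, curl and deformation magnitude are unchanged by the rotation, while the deformation orientation follows the frame. Only the orientation therefore needs to be treated as a rotationally variant quantity.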

  • 47.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Discrete Derivative Approximations with Scale-Space Properties: A Basis for Low-Level Feature Extraction. 1993. In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 3, no. 4, pp. 349-376. Journal article (Peer-reviewed)
    Abstract [en]

    This article shows how discrete derivative approximations can be defined so that scale-space properties hold exactly also in the discrete domain. Starting from a set of natural requirements on the first processing stages of a visual system, the visual front end, it gives an axiomatic derivation of how a multiscale representation of derivative approximations can be constructed from a discrete signal, so that it possesses an algebraic structure similar to that possessed by the derivatives of the traditional scale-space representation in the continuous domain. A family of kernels is derived that constitute discrete analogues to the continuous Gaussian derivatives. The representation has theoretical advantages over other discretizations of the scale-space theory in the sense that operators that commute before discretization commute after discretization. Some computational implications of this are that derivative approximations can be computed directly from smoothed data and that this will give exactly the same result as convolution with the corresponding derivative approximation kernel. Moreover, a number of normalization conditions are automatically satisfied. The proposed methodology leads to a scheme of computations of multiscale low-level feature extraction that is conceptually very simple and consists of four basic steps: (i) large support convolution smoothing, (ii) small support difference computations, (iii) point operations for computing differential geometric entities, and (iv) nearest-neighbour operations for feature detection. Applications demonstrate how the proposed scheme can be used for edge detection and junction detection based on derivatives up to order three.
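
    The commutation property stated above can be checked numerically with a minimal sketch. It assumes a repeated binomial filter as a stand-in smoothing kernel (the article derives its own kernel family, which is not reproduced here) and a central difference as the small-support difference operator; the signal values are arbitrary. Full convolution is used so that no boundary handling interferes with the comparison.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Stand-in smoothing kernel: a binomial filter applied four times in cascade.
binomial = [0.25, 0.5, 0.25]
smoothing = binomial
for _ in range(3):
    smoothing = convolve(smoothing, binomial)

# Central difference (f(x + 1) - f(x - 1)) / 2 written as a convolution kernel.
difference = [0.5, 0.0, -0.5]

signal = [0.0, 1.0, 4.0, 9.0, 7.0, 3.0, 8.0, 2.0, 2.0, 5.0]

# (a) smooth first, then take differences of the smoothed data
smooth_then_diff = convolve(convolve(signal, smoothing), difference)
# (b) take differences first, then smooth
diff_then_smooth = convolve(convolve(signal, difference), smoothing)
# (c) convolve once with the combined derivative approximation kernel
derivative_kernel = convolve(smoothing, difference)
single_pass = convolve(signal, derivative_kernel)

max_gap = max(max(abs(p - q), abs(p - r))
              for p, q, r in zip(smooth_then_diff, diff_then_smooth, single_pass))
print(max_gap)
```

    All three routes agree up to floating-point rounding because discrete convolution is commutative and associative. Route (a) is the cheap one in practice: one large-support smoothing pass followed by small-support differences.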

  • 48.
    Lindeberg, Tony
    KTH, Tidigare Institutioner, Numerisk analys och datalogi, NADA.
    Discrete Scale-Space Theory and the Scale-Space Primal Sketch. 1991. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    This thesis, within the subfield of computer science known as computer vision, deals with the use of scale-space analysis in early low-level processing of visual information. The main contributions comprise the following five subjects:

    • The formulation of a scale-space theory for discrete signals. Previously, the scale-space concept has been expressed for continuous signals only. We propose that the canonical way to construct a scale-space for discrete signals is by convolution with a kernel called the discrete analogue of the Gaussian kernel, or equivalently by solving a semi-discretized version of the diffusion equation. Both the one-dimensional and two-dimensional cases are covered. An extensive analysis of discrete smoothing kernels is carried out for one-dimensional signals and the discrete scale-space properties of the most common discretizations to the continuous theory are analysed.

    • A representation, called the scale-space primal sketch, which gives a formal description of the hierarchical relations between structures at different levels of scale. It is aimed at making information in the scale-space representation explicit. We give a theory for its construction and an algorithm for computing it.

    • A theory for extracting significant image structures and determining the scales of these structures from this representation in a solely bottom-up data-driven way.

    • Examples demonstrating how such qualitative information extracted from the scale-space primal sketch can be used for guiding and simplifying other early visual processes. Applications are given to edge detection, histogram analysis and classification based on local features. Among other possible applications one can mention perceptual grouping, texture analysis, stereo matching, model matching and motion.

    • A detailed theoretical analysis of the evolution properties of critical points and blobs in scale-space, comprising drift velocity estimates under scale-space smoothing, a classification of the possible types of generic events at bifurcation situations and estimates of how the number of local extrema in a signal can be expected to decrease as a function of the scale parameter. For two-dimensional signals the generic bifurcation events are annihilations and creations of extremum-saddle point pairs. Interpreted in terms of blobs, these transitions correspond to annihilations, merges, splits and creations.

    Experiments on different types of real imagery demonstrate that the proposed theory gives perceptually intuitive results.
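
    The first contribution listed above, the discrete analogue of the Gaussian kernel, can be sketched in one dimension. The abstract does not state the kernel explicitly; the form used here, T(n; t) = exp(-t) I_n(t) with I_n the modified Bessel functions of integer order, is the one commonly given in the scale-space literature and is an assumption with respect to this abstract. The truncation radius, the number of series terms, and the function names are illustrative choices.

```python
import math

def bessel_i(n, t, terms=40):
    """Modified Bessel function I_n(t) of integer order, from its power series."""
    n = abs(n)  # I_{-n}(t) = I_n(t) for integer n
    return sum((t / 2.0) ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def discrete_gaussian(t, radius=20):
    """Kernel T(n; t) = exp(-t) * I_n(t), sampled for n = -radius..radius."""
    return [math.exp(-t) * bessel_i(n, t) for n in range(-radius, radius + 1)]

def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

kernel_s = discrete_gaussian(1.0)
kernel_t = discrete_gaussian(1.5)
kernel_sum = discrete_gaussian(2.5)

# Cascade smoothing: T(.; 1.0) * T(.; 1.5) should reproduce T(.; 2.5).
cascaded = convolve(kernel_s, kernel_t)   # centre sample sits at index 40
semigroup_gap = max(abs(cascaded[40 + n] - kernel_sum[20 + n])
                    for n in range(-10, 11))
print(sum(kernel_sum), semigroup_gap)
```

    The kernel sums to one and obeys the cascade (semi-group) property, so smoothing to scale 1.0 and then by 1.5 is the same as smoothing once to scale 2.5. The equivalent formulation mentioned in the abstract, the semi-discretized diffusion equation, corresponds to dT/dt = (T(n+1) - 2T(n) + T(n-1)) / 2, which this kernel satisfies.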

  • 49.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Edge detection. 2001. In: Encyclopaedia of Mathematics / [ed] Michiel Hazewinkel, Springer, 2001. Chapter in book, part of anthology (Peer-reviewed)
    Abstract [en]

    Edge detection

    An early processing stage in image processing and computer vision, aimed at detecting and characterizing discontinuities in the image domain.

    The importance of edge detection for early machine vision is usually motivated by the observation that under rather general assumptions about the image formation process, a discontinuity in image brightness can be assumed to correspond to a discontinuity in either depth, surface orientation, reflectance, or illumination. In this respect, edges in the image domain constitute a strong link to physical properties of the world. A representation of image information in terms of edges is also compact in the sense that the two-dimensional image pattern is represented by a set of one-dimensional curves. For these reasons, edges have been used as main features in a large number of computer vision algorithms.

    A non-trivial aspect of edge-based analysis of image data, however, concerns what should be meant by a discontinuity in image brightness. Real-world image data are inherently discrete, and for a function defined on a discrete domain, there is no natural notion of "discontinuity", and there is no inherent way to judge what are the edges in a given discrete image.

    An early approach to edge detection involved the convolution of the image $f$ by a Gaussian kernel $g(\cdot; t)$, followed by the detection of zero-crossings in the Laplacian response [a1] (cf. also Scale-space theory). However, such edge curves, satisfying

    $$\nabla^2 L = L_{xx} + L_{yy} = 0, \qquad L = g(\cdot; t) * f,$$

    give rise to false edges and have poor localization at curved edges.

    A more refined approach is the notion of non-maximum suppression [a2], [a3], [a4], where edges are defined as points at which the gradient magnitude assumes a local maximum in the gradient direction. In differential-geometric terms, such edge points can be characterized as points at which [a5]:

    i) the second-order directional derivative in the gradient direction is zero; and

    ii) the third-order directional derivative in the gradient direction is negative.

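    In terms of partial derivatives, for a two-dimensional image $L$ this edge definition can be written as

    $$L_x^2 L_{xx} + 2 L_x L_y L_{xy} + L_y^2 L_{yy} = 0,$$
    $$L_x^3 L_{xxx} + 3 L_x^2 L_y L_{xxy} + 3 L_x L_y^2 L_{xyy} + L_y^3 L_{yyy} < 0.$$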

    Again, the computation of discrete derivative approximations is preceded by smoothing the image with a Gaussian kernel, and the choice of different standard deviations of the Gaussian kernel gives rise to edges at different scales (see Scale-space theory or [a5]). While other choices of linear smoothing kernels have also been advocated, their shapes can often be well approximated by Gaussians [a3], [a6], [a7].
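
    The differential edge definition above (second-order directional derivative in the gradient direction equal to zero, third-order one negative) can be tried out on an analytic example. The sketch assumes the standard expansion of these directional derivatives in partial derivatives, multiplied by powers of the gradient magnitude so that no division is needed, and a hypothetical test image: a vertical step edge at x = 0 smoothed by a unit-variance Gaussian. The function names and the tolerance are illustrative.

```python
import math

def edge_measures(Lx, Ly, Lxx, Lxy, Lyy, Lxxx, Lxxy, Lxyy, Lyyy):
    """Second- and third-order directional derivatives in the gradient
    direction, scaled by |grad L|^2 and |grad L|^3 respectively (the scaling
    is positive wherever the gradient is non-zero, so signs are preserved)."""
    Lvv = Lx * Lx * Lxx + 2.0 * Lx * Ly * Lxy + Ly * Ly * Lyy
    Lvvv = (Lx ** 3 * Lxxx + 3.0 * Lx * Lx * Ly * Lxxy
            + 3.0 * Lx * Ly * Ly * Lxyy + Ly ** 3 * Lyyy)
    return Lvv, Lvvv

def smoothed_step_derivatives(x):
    """Partial derivatives of the test image L(x, y) = Phi(x), the standard
    normal distribution function, which models a smoothed step edge at x = 0."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    # L_x = phi, L_xx = -x * phi, L_xxx = (x^2 - 1) * phi; y-derivatives vanish.
    return (phi, 0.0, -x * phi, 0.0, 0.0, (x * x - 1.0) * phi, 0.0, 0.0, 0.0)

def is_edge_point(derivatives, tolerance=1e-12):
    """True where the second-order measure vanishes and the third is negative."""
    Lvv, Lvvv = edge_measures(*derivatives)
    return abs(Lvv) <= tolerance and Lvvv < 0.0

on_edge = is_edge_point(smoothed_step_derivatives(0.0))
off_edge = is_edge_point(smoothed_step_derivatives(1.0))
print(on_edge, off_edge)
```

    On sampled images the zero condition is not tested with a tolerance as done here; instead, sign changes of the second-order measure between neighbouring pixels are located, with the third-order sign condition used to discard gradient minima.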

    Other approaches to edge detection involve the thresholding of edge strength measures, the computation of intensity derivatives from local least squares fitting, and functional minimization (see also [a8]).

    A subject which has been given large attention during the 1990s is the replacement of the linear smoothing operation by a non-linear smoothing step, with the