kth.sePublications
Change search
Refine search result
1234567 1 - 50 of 785
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Rows per page
  • 5
  • 10
  • 20
  • 50
  • 100
  • 250
Sort
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
Select
The maximal number of hits you can export is 250. When you want to export more records please use the Create feeds function.
  • 1. Abbeloos, W.
    et al.
    Caccamo, Sergio
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
    Ataer-Cansizoglu, E.
    Taguchi, Y.
    Feng, C.
    Lee, T. -Y
    Detecting and Grouping Identical Objects for Region Proposal and Classification2017In: 2017 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, IEEE Computer Society, 2017, Vol. 2017, p. 501-502, article id 8014810Conference paper (Refereed)
    Abstract [en]

    Often multiple instances of an object occur in the same scene, for example in a warehouse. Unsupervised multi-instance object discovery algorithms are able to detect and identify such objects. We use such an algorithm to provide object proposals to a convolutional neural network (CNN) based classifier. This results in fewer regions to evaluate, compared to traditional region proposal algorithms. Additionally, it enables using the joint probability of multiple instances of an object, resulting in improved classification accuracy. The proposed technique can also split a single class into multiple sub-classes corresponding to the different object types, enabling hierarchical classification.

  • 2. Abeywardena, D.
    et al.
    Wang, Zhan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Dissanayake, G.
    Waslander, S. L.
    Kodagoda, S.
    Model-aided state estimation for quadrotor micro air vehicles amidst wind disturbances2014Conference paper (Refereed)
    Abstract [en]

    This paper extends the recently developed Model-Aided Visual-Inertial Fusion (MA-VIF) technique for quadrotor Micro Air Vehicles (MAV) to deal with wind disturbances. The wind effects are explicitly modelled in the quadrotor dynamic equations excluding the unobservable wind velocity component. This is achieved by a nonlinear observability of the dynamic system with wind effects. We show that using the developed model, the vehicle pose and two components of the wind velocity vector can be simultaneously estimated with a monocular camera and an inertial measurement unit. We also show that the MA-VIF is reasonably tolerant to wind disturbances, even without explicit modelling of wind effects and explain the reasons for this behaviour. Experimental results using a Vicon motion capture system are presented to demonstrate the effectiveness of the proposed method and validate our claims.

  • 3.
    Abraham, Johannes
    et al.
    KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Health Informatics and Logistics.
    Romano, Robin
    KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Health Informatics and Logistics.
    Automatisk kvalitetssäkring av information för järnvägsanläggningar: Automatic quality assurance of information for railway infrastructure2019Independent thesis Basic level (university diploma), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    With increased expectations for the expansion of the future railway, this entails an increased load on the current railway network. The result of the expansion can be an increasing number of cancellations and delays. By taking advantage of technological innovations such as digitalization and automation, the existing system and work  processes can be developed for more efficient management.   The Swedish Transport Administration sets requirements for Building Information Modeling (BIM) in procurements. The planning of signal installations within the railway takes place in Sweco using the CAD program Promis.e. From the program, lists containing the information of the objects (BIS-lists) can be retrieved. The  Swedish Transport Administration requires that the attributes must consist of a  certain format or have specific values. In this thesis project, methods for automatic quality assurance of infrastructure information and the implementation of the method for rail projects were examined. The investigated methods include the  calculation program Excel, the query programming language SQL and the process of ETL.  After analyzing the methods, the ETL process was chosen. The result was that a  program was created to automatically select the type of BIS list that would be  reviewed and to verify that the examined attributes contained allowed values. In  order to investigate whether the cost of the programs would benefit the company in addition to the quality assurance, an economic analysis was carried out. According to the calculations, the choice of method could also be justified from an economic  perspective.

    Download full text (pdf)
    Examensarbete
  • 4.
    Adler, Jonas
    KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics (Div.).
    Learned Iterative Reconstruction2023In: Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging: Mathematical Imaging and Vision, Springer Nature , 2023, p. 751-771Chapter in book (Other academic)
    Abstract [en]

    Learned iterative reconstruction methods have recently emerged as a powerful tool to solve inverse problems. These deep learning techniques for image reconstruction achieve remarkable speed and accuracy by combining hard knowledge about the physics of the image formation process, represented by the forward operator, with soft knowledge about how the reconstructions should look like, represented by deep neural networks. A diverse set of such methods have been proposed, and this chapter seeks to give an overview of their similarities and differences, as well as discussing some of the commonly used methods to improve their performance.

  • 5.
    Adler, Jonas
    et al.
    KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics (Div.). Elekta Instrument AB, Stockholm, Sweden.
    Öktem, Ozan
    KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics (Div.).
    Learned Primal-Dual Reconstruction2018In: IEEE Transactions on Medical Imaging, ISSN 0278-0062, E-ISSN 1558-254X, Vol. 37, no 6, p. 1322-1332Article in journal (Refereed)
    Abstract [en]

    We propose the Learned Primal-Dual algorithm for tomographic reconstruction. The algorithm accounts for a (possibly non-linear) forward operator in a deep neural network by unrolling a proximal primal-dual optimization method, but where the proximal operators have been replaced with convolutional neural networks. The algorithm is trained end-to-end, working directly from raw measured data and it does not depend on any initial reconstruction such as filtered back-projection (FBP). We compare performance of the proposed method on low dose computed tomography reconstruction against FBP, total variation (TV), and deep learning based post-processing of FBP. For the Shepp-Logan phantom we obtain >6 dB peak signal to noise ratio improvement against all compared methods. For human phantoms the corresponding improvement is 6.6 dB over TV and 2.2 dB over learned post-processing along with a substantial improvement in the structural similarity index. Finally, our algorithm involves only ten forward-back-projection computations, making the method feasible for time critical clinical applications.

  • 6.
    Aghazadeh, Omid
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Data Driven Visual Recognition2014Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one which models categories and one which is not category based. We are interested in data driven solutions for both kinds of problems.

    In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that based on a few reference exemplars, our methods are able to detect novelties in ego-motions of people, and changes in the static environments surrounding them.

    In the category level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines such as the same model initialized with less subtly designed procedures. A nonparametric large margin classifier is introduced and demonstrated to have a multitude of advantages in comparison to its competitors: better training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model.

    We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performances. Based on data-describing measures which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and correlate significantly to the observed recognition performances. Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate.

    We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods that are designed to tackle different visual recognition problems, but share one common characteristic: being data driven.

    Download full text (pdf)
    Thesis
  • 7.
    Aghazadeh, Omid
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Azizpour, Hossein
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Sullivan, Josephine
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Mixture component identification and learning for visual recognition2012In: Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part VI, Springer, 2012, p. 115-128Conference paper (Refereed)
    Abstract [en]

    The non-linear decision boundary between object and background classes - due to large intra-class variations - needs to be modelled by any classifier wishing to achieve good results. While a mixture of linear classifiers is capable of modelling this non-linearity, learning this mixture from weakly annotated data is non-trivial and is the paper's focus. Our approach is to identify the modes in the distribution of our positive examples by clustering, and to utilize this clustering in a latent SVM formulation to learn the mixture model. The clustering relies on a robust measure of visual similarity which suppresses uninformative clutter by using a novel representation based on the exemplar SVM. This subtle clustering of the data leads to learning better mixture models, as is demonstrated via extensive evaluations on Pascal VOC 2007. The final classifier, using a HOG representation of the global image patch, achieves performance comparable to the state-of-the-art while being more efficient at detection time.

  • 8.
    Aghazadeh, Omid
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Large Scale, Large Margin Classification using Indefinite Similarity MeasurensManuscript (preprint) (Other academic)
  • 9.
    Aghazadeh, Omid
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Properties of Datasets Predict the Performance of Classifiers2013Manuscript (preprint) (Other academic)
  • 10.
    Aghazadeh, Omid
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Properties of Datasets Predict the Performance of Classifiers2013In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, British Machine Vision Association, BMVA , 2013Conference paper (Refereed)
    Abstract [en]

    It has been shown that the performance of classifiers depends not only on the number of training samples, but also on the quality of the training set [10, 12]. The purpose of this paper is to 1) provide quantitative measures that determine the quality of the training set and 2) provide the relation between the test performance and the proposed measures. The measures are derived from pairwise affinities between training exemplars of the positive class and they have a generative nature. We show that the performance of the state of the art methods, on the test set, can be reasonably predicted based on the values of the proposed measures on the training set. These measures open up a wide range of applications to the recognition community enabling us to analyze the behavior of the learning algorithms w.r.t the properties of the training data. This will in turn enable us to devise rules for the automatic selection of training data that maximize the quantified quality of the training set and thereby improve recognition performance.

  • 11.
    Aghazadeh, Omid
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Sullivan, Josephine
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Multi view registration for novelty/background separation2012In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, IEEE Computer Society, 2012, p. 757-764Conference paper (Refereed)
    Abstract [en]

    We propose a system for the automatic segmentation of novelties from the background in scenarios where multiple images of the same environment are available e.g. obtained by wearable visual cameras. Our method finds the pixels in a query image corresponding to the underlying background environment by comparing it to reference images of the same scene. This is achieved despite the fact that all the images may have different viewpoints, significantly different illumination conditions and contain different objects cars, people, bicycles, etc. occluding the background. We estimate the probability of each pixel, in the query image, belonging to the background by computing its appearance inconsistency to the multiple reference images. We then, produce multiple segmentations of the query image using an iterated graph cuts algorithm, initializing from these estimated probabilities and consecutively combine these segmentations to come up with a final segmentation of the background. Detection of the background in turn highlights the novel pixels. We demonstrate the effectiveness of our approach on a challenging outdoors data set.

  • 12.
    Agrawal, Alekh
    et al.
    Microsoft Research.
    Kragic, Danica
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Wu, Cathy
    Massachusetts Institute of Technology.
    et al.,
    The Second Annual Conference on Learning for Dynamics and Control: Editorial2020In: Proceedings of Machine Learning Research, ML Research Press , 2020, Vol. 120Conference paper (Refereed)
  • 13.
    Ahlberg, Sofie
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control).
    Dimarogonas, Dimos V.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, ACCESS Linnaeus Centre.
    Mixed-Initiative Control Synthesis: Estimating an Unknown Task Based on Human Control Input2020In: Proceedings of the 3rd IFAC Workshop on Cyber-Physical & Human Systems,, 2020Conference paper (Refereed)
    Abstract [en]

    In this paper we consider a mobile platform controlled by two entities; an autonomousagent and a human user. The human aims for the mobile platform to complete a task, whichwe will denote as the human task, and will impose a control input accordingly, while not beingaware of any other tasks the system should or must execute. The autonomous agent will in turnplan its control input taking in consideration all safety requirements which must be met, sometask which should be completed as much as possible (denoted as the robot task), as well aswhat it believes the human task is based on previous human control input. A framework for theautonomous agent and a mixed initiative controller are designed to guarantee the satisfaction ofthe safety requirements while both the human and robot tasks are violated as little as possible.The framework includes an estimation algorithm of the human task which will improve witheach cycle, eventually converging to a task which is similar to the actual human task. Hence, theautonomous agent will eventually be able to find the optimal plan considering all tasks and thehuman will have no need to interfere again. The process is illustrated with a simulated example

    Download full text (pdf)
    fulltext
  • 14.
    Al Hakim, Ezeddin
    KTH, School of Electrical Engineering and Computer Science (EECS).
    3D YOLO: End-to-End 3D Object Detection Using Point Clouds2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    For safe and reliable driving, it is essential that an autonomous vehicle can accurately perceive the surrounding environment. Modern sensor technologies used for perception, such as LiDAR and RADAR, deliver a large set of 3D measurement points known as a point cloud. There is a huge need to interpret the point cloud data to detect other road users, such as vehicles and pedestrians.

    Many research studies have proposed image-based models for 2D object detection. This thesis takes it a step further and aims to develop a LiDAR-based 3D object detection model that operates in real-time, with emphasis on autonomous driving scenarios. We propose 3D YOLO, an extension of YOLO (You Only Look Once), which is one of the fastest state-of-the-art 2D object detectors for images. The proposed model takes point cloud data as input and outputs 3D bounding boxes with class scores in real-time. Most of the existing 3D object detectors use hand-crafted features, while our model follows the end-to-end learning fashion, which removes manual feature engineering.

    3D YOLO pipeline consists of two networks: (a) Feature Learning Network, an artificial neural network that transforms the input point cloud to a new feature space; (b) 3DNet, a novel convolutional neural network architecture based on YOLO that learns the shape description of the objects.

    Our experiments on the KITTI dataset shows that the 3D YOLO has high accuracy and outperforms the state-of-the-art LiDAR-based models in efficiency. This makes it a suitable candidate for deployment in autonomous vehicles.

    Download full text (pdf)
    fulltext
  • 15.
    Alexanderson, Simon
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    O'Sullivan, Carol
    Neff, Michael
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Mimebot—Investigating the Expressibility of Non-Verbal Communication Across Agent Embodiments2017In: ACM Transactions on Applied Perception, ISSN 1544-3558, E-ISSN 1544-3965, Vol. 14, no 4, article id 24Article in journal (Refereed)
    Abstract [en]

    Unlike their human counterparts, artificial agents such as robots and game characters may be deployed with a large variety of face and body configurations. Some have articulated bodies but lack facial features, and others may be talking heads ending at the neck. Generally, they have many fewer degrees of freedom than humans through which they must express themselves, and there will inevitably be a filtering effect when mapping human motion onto the agent. In this article, we investigate filtering effects on three types of embodiments: (a) an agent with a body but no facial features, (b) an agent with a head only, and (c) an agent with a body and a face. We performed a full performance capture of a mime actor enacting short interactions varying the non-verbal expression along five dimensions (e.g., level of frustration and level of certainty) for each of the three embodiments. We performed a crowd-sourced evaluation experiment comparing the video of the actor to the video of an animated robot for the different embodiments and dimensions. Our findings suggest that the face is especially important to pinpoint emotional reactions but is also most volatile to filtering effects. The body motion, on the other hand, had more diverse interpretations but tended to preserve the interpretation after mapping and thus proved to be more resilient to filtering.

    Download full text (pdf)
    fulltext
  • 16.
    Aliabad, Fahime Arabi
    et al.
    Yazd Univ, Fac Nat Resources & Desert Studies, Dept Arid Land Management, Yazd 8915818411, Iran..
    Malamiri, Hamid Reza Ghafarian
    Yazd Univ, Dept Geog, Yazd 8915818411, Iran.;Delft Univ Technol, Dept Geosci & Engn, NL-2628 CD Delft, Netherlands..
    Shojaei, Saeed
    Univ Tehran, Fac Nat Resources, Dept Arid & Mt Reg Reclamat, Tehran 1417935840, Iran..
    Sarsangi, Alireza
    Univ Tehran, Fac Geog, Dept Remote Sensing & GIS, Tehran 1417935840, Iran..
    Ferreira, Carla Sofia Santos
    Stockholm Univ, Bolin Ctr Climate Res, Dept Phys Geog, S-10691 Stockholm, Sweden.;Polytech Inst Coimbra, Agr Sch Coimbra, Res Ctr Nat Resources Environm & Soc CERNAS, P-3045601 Coimbra, Portugal..
    Kalantari, Zahra
    KTH, School of Architecture and the Built Environment (ABE), Sustainable development, Environmental science and Engineering, Water and Environmental Engineering. Stockholm Univ, Bolin Ctr Climate Res, Dept Phys Geog, S-10691 Stockholm, Sweden..
    Investigating the Ability to Identify New Constructions in Urban Areas Using Images from Unmanned Aerial Vehicles, Google Earth, and Sentinel-22022In: Remote Sensing, E-ISSN 2072-4292, Vol. 14, no 13, article id 3227Article in journal (Refereed)
    Abstract [en]

    One of the main problems in developing countries is unplanned urban growth and land use change. Timely identification of new constructions can be a good solution to mitigate some environmental and social problems. This study examined the possibility of identifying new constructions in urban areas using images from unmanned aerial vehicles (UAV), Google Earth and Sentinel-2. The accuracy of the land cover map obtained using these images was investigated using pixel-based processing methods (maximum likelihood, minimum distance, Mahalanobis, spectral angle mapping (SAM)) and object-based methods (Bayes, support vector machine (SVM), K-nearest-neighbor (KNN), decision tree, random forest). The use of DSM to increase the accuracy of classification of UAV images and the use of NDVI to identify vegetation in Sentinel-2 images were also investigated. The object-based KNN method was found to have the greatest accuracy in classifying UAV images (kappa coefficient = 0.93), and the use of DSM increased the classification accuracy by 4%. Evaluations of the accuracy of Google Earth images showed that KNN was also the best method for preparing a land cover map using these images (kappa coefficient = 0.83). The KNN and SVM methods showed the highest accuracy in preparing land cover maps using Sentinel-2 images (kappa coefficient = 0.87 and 0.85, respectively). The accuracy of classification was not increased when using NDVI due to the small percentage of vegetation cover in the study area. On examining the advantages and disadvantages of the different methods, a novel method for identifying new rural constructions was devised. This method uses only one UAV imaging per year to determine the exact position of urban areas with no constructions and then examines spectral changes in related Sentinel-2 pixels that might indicate new constructions in these areas. On-site observations confirmed the accuracy of this method.

  • 17. Almansa, A.
    et al.
    Lindeberg, Tony
    KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
    Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale selection2000In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 9, no 12, p. 2027-2042Article in journal (Refereed)
    Abstract [en]

    This work presents two mechanisms for processing fingerprint images; shape-adapted smoothing based on second moment descriptors and automatic scale selection based on normalized derivatives. The shape adaptation procedure adapts the smoothing operation to the local ridge structures, which allows interrupted ridges to be joined without destroying essential singularities such as branching points and enforces continuity of their directional fields. The Scale selection procedure estimates local ridge width and adapts the amount of smoothing to the local amount of noise. In addition, a ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model, and is used for spreading the results of shape adaptation into noisy areas. The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. The result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a Smoothed grey-level version of the input image. We propose that these general techniques should be of interest to developers of automatic fingerprint identification systems as well as in other applications of processing related types of imagery.

    Download full text (pdf)
    fulltext
  • 18. Almansa, Andrés
    et al.
    Lindeberg, Tony
    KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
    Enhancement of Fingerprint Images by Shape-Adapted Scale-Space Operators1996In: Gaussian Scale-Space Theory. Part I: Proceedings of PhD School on Scale-Space Theory (Copenhagen, Denmark) May 1996 / [ed] J. Sporring, M. Nielsen, L. Florack, and P. Johansen, Springer Science+Business Media B.V., 1996, p. 21-30Chapter in book (Refereed)
    Abstract [en]

    This work presents a novel technique for preprocessing fingerprint images. The method is based on the measurements of second moment descriptors and shape adaptation of scale-space operators with automatic scale selection (Lindeberg 1994). This procedure, which has been successfully used in the context of shape-from-texture and shape from disparity gradients, has several advantages when applied to fingerprint image enhancement, as observed by (Weickert 1995). For example, it is capable of joining interrupted ridges, and enforces continuity of their directional fields.

    In this work, these abovementioned general ideas are applied and extended in the following ways: Two methods for estimating local ridge width are explored and tuned to the problem of fingerprint enhancement. A ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model. This information is used for guiding a scale-selection mechanism, and for spreading the results of shape adaptation into noisy areas.

    The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. To a large extent, the scheme has the desirable property of joining interrupted lines without destroying essential singularities such as branching points. Thus, the result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image.

    A detailed experimental evaluation is presented, including a comparison with other techniques. We propose that the techniques presented provide mechanisms of interest to developers of automatic fingerprint identification systems.

    Download full text (pdf)
    fulltext
  • 19.
    Ambrus, Rares
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS. KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
    Unsupervised construction of 4D semantic maps in a long-term autonomy scenario2017Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Robots are operating for longer times and collecting much more data than just a few years ago. In this setting we are interested in exploring ways of modeling the environment, segmenting out areas of interest and keeping track of the segmentations over time, with the purpose of building 4D models (i.e. space and time) of the relevant parts of the environment.

    Our approach relies on repeatedly observing the environment and creating local maps at specific locations. The first question we address is how to choose where to build these local maps. Traditionally, an operator defines a set of waypoints on a pre-built map of the environment which the robot visits autonomously. Instead, we propose a method to automatically extract semantically meaningful regions from a point cloud representation of the environment. The resulting segmentation is purely geometric, and in the context of mobile robots operating in human environments, the semantic label associated with each segment (i.e. kitchen, office) can be of interest for a variety of applications. We therefore also look at how to obtain per-pixel semantic labels given the geometric segmentation, by fusing probabilistic distributions over scene and object types in a Conditional Random Field.

    For most robotic systems, the elements of interest in the environment are the ones which exhibit some dynamic properties (such as people, chairs, cups, etc.), and the ability to detect and segment such elements provides a very useful initial segmentation of the scene. We propose a method to iteratively build a static map from observations of the same scene acquired at different points in time. Dynamic elements are obtained by computing the difference between the static map and new observations. We address the problem of clustering together dynamic elements which correspond to the same physical object, observed at different points in time and in significantly different circumstances. To address some of the inherent limitations in the sensors used, we autonomously plan, navigate around and obtain additional views of the segmented dynamic elements. We look at methods of fusing the additional data and we show that both a combined point cloud model and a fused mesh representation can be used to more robustly recognize the dynamic object in future observations. In the case of the mesh representation, we also show how a Convolutional Neural Network can be trained for recognition by using mesh renderings.

    Finally, we present a number of methods to analyse the data acquired by the mobile robot autonomously and over extended time periods. First, we look at how the dynamic segmentations can be used to derive a probabilistic prior which can be used in the mapping process to further improve and reinforce the segmentation accuracy. We also investigate how to leverage spatial-temporal constraints in order to cluster dynamic elements observed at different points in time and under different circumstances. We show that by making a few simple assumptions we can increase the clustering accuracy even when the object appearance varies significantly between observations. The result of the clustering is a spatial-temporal footprint of the dynamic object, defining an area where the object is likely to be observed spatially as well as a set of time stamps corresponding to when the object was previously observed. Using this data, predictive models can be created and used to infer future times when the object is more likely to be observed. In an object search scenario, this model can be used to decrease the search time when looking for specific objects.

    Download full text (pdf)
    Rares_Ambrus_PhD_Thesis
  • 20.
    Ambrus, Rares
    et al.
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Bore, Nils
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Folkesson, John
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS. KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
    Jensfelt, Patric
    KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS. KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
    Autonomous meshing, texturing and recognition of objectmodels with a mobile robot2017Conference paper (Refereed)
    Abstract [en]

    We present a system for creating object modelsfrom RGB-D views acquired autonomously by a mobile robot.We create high-quality textured meshes of the objects byapproximating the underlying geometry with a Poisson surface.Our system employs two optimization steps, first registering theviews spatially based on image features, and second aligningthe RGB images to maximize photometric consistency withrespect to the reconstructed mesh. We show that the resultingmodels can be used robustly for recognition by training aConvolutional Neural Network (CNN) on images rendered fromthe reconstructed meshes. We perform experiments on datacollected autonomously by a mobile robot both in controlledand uncontrolled scenarios. We compare quantitatively andqualitatively to previous work to validate our approach.

    Download full text (pdf)
    fulltext
  • 21.
    Ambrus, Rares
    et al.
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Claici, Sebastian
    Wendt, Axel
    Automatic Room Segmentation From Unstructured 3-D Data of Indoor Environments2017In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 2, no 2, p. 749-756Article in journal (Refereed)
    Abstract [en]

    We present an automatic approach for the task of reconstructing a 2-D floor plan from unstructured point clouds of building interiors. Our approach emphasizes accurate and robust detection of building structural elements and, unlike previous approaches, does not require prior knowledge of scanning device poses. The reconstruction task is formulated as a multiclass labeling problem that we approach using energy minimization. We use intuitive priors to define the costs for the energy minimization problem and rely on accurate wall and opening detection algorithms to ensure robustness. We provide detailed experimental evaluation results, both qualitative and quantitative, against state-of-the-art methods and labeled ground-truth data.

  • 22.
    Ambrus, Rares
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Ekekrantz, Johan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Folkesson, John
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Jensfelt, Patric
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Unsupervised learning of spatial-temporal models of objects in a long-term autonomy scenario2015In: 2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), IEEE , 2015, p. 5678-5685Conference paper (Refereed)
    Abstract [en]

    We present a novel method for clustering segmented dynamic parts of indoor RGB-D scenes across repeated observations by performing an analysis of their spatial-temporal distributions. We segment areas of interest in the scene using scene differencing for change detection. We extend the Meta-Room method and evaluate the performance on a complex dataset acquired autonomously by a mobile robot over a period of 30 days. We use an initial clustering method to group the segmented parts based on appearance and shape, and we further combine the clusters we obtain by analyzing their spatial-temporal behaviors. We show that using the spatial-temporal information further increases the matching accuracy.

  • 23.
    Ambrus, Rares
    et al.
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Folkesson, John
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Jensfelt, Patric
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL. KTH, School of Computer Science and Communication (CSC), Centres, Centre for Autonomous Systems, CAS.
    Unsupervised object segmentation through change detection in a long term autonomy scenario2016In: IEEE-RAS International Conference on Humanoid Robots, IEEE, 2016, p. 1181-1187Conference paper (Refereed)
    Abstract [en]

    In this work we address the problem of dynamic object segmentation in office environments. We make no prior assumptions on what is dynamic and static, and our reasoning is based on change detection between sparse and non-uniform observations of the scene. We model the static part of the environment, and we focus on improving the accuracy and quality of the segmented dynamic objects over long periods of time. We address the issue of adapting the static structure over time and incorporating new elements, for which we train and use a classifier whose output gives an indication of the dynamic nature of the segmented elements. We show that the proposed algorithms improve the accuracy and the rate of detection of dynamic objects by comparing with a labelled dataset.

  • 24.
    Antonova, Rika
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS.
    Transfer-Aware Kernels, Priors and Latent Spaces from Simulation to Real Robots2020Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Consider challenging sim-to-real cases lacking high-fidelity simulators and allowing only 10-20 hardware trials. This work shows that even imprecise simulation can be beneficial if used to build transfer-aware representations.

    First, the thesis introduces an informed kernel that embeds the space of simulated trajectories into a lower-dimensional space of latent paths. It uses a sequential variational autoencoder (sVAE) to handle large-scale training from simulated data. Its modular design enables quick adaptation when used for Bayesian optimization (BO) on hardware. The thesis and the included publications demonstrate that this approach works for different areas of robotics: locomotion and manipulation. Furthermore, a variant of BO that ensures recovery from negative transfer when using corrupted kernels is introduced. An application to task-oriented grasping validates its performance on hardware.

    For the case of parametric learning, simulators can serve as priors or regularizers. This work describes how to use simulation to regularize a VAE's decoder to bind the VAE's latent space to simulator parameter posterior. With that, training on a small number of real trajectories can quickly shift the posterior to reflect reality. The included publication demonstrates that this approach can also help reinforcement learning (RL) quickly overcome the sim-to-real gap on a manipulation task on hardware.

    A longer-term vision is to shape latent spaces without needing to mandate a particular simulation scenario. A first step is to learn general relations that hold on sequences of states from a set of related domains. This work introduces a unifying mathematical formulation for learning independent analytic relations. Relations are learned from source domains, then used to help structure the latent space when learning on target domains. This formulation enables a more general, flexible and principled way of shaping the latent space. It formalizes the notion of learning independent relations, without imposing restrictive simplifying assumptions or requiring domain-specific information. This work presents mathematical properties, concrete algorithms and experimental validation of successful learning and transfer of latent relations.

    Download full text (pdf)
    phd_thesis_rika_antonova
  • 25.
    Arndt, Karol
    et al.
    Aalto Univ, Espoo, Finland..
    Hazara, Murtaza
    Aalto Univ, Espoo, Finland.;Katholieke Univ Leuven, Dept Mech Engn, Leuven, Belgium.;Flanders Make, Robot Core Lab, Lommel, Belgium..
    Ghadirzadeh, Ali
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. Aalto Univ, Espoo, Finland.
    Kyrki, Ville
    Aalto Univ, Espoo, Finland..
    Meta Reinforcement Learning for Sim-to-real Domain Adaptation2020In: 2020 IEEE International Conference On Robotics And Automation (ICRA), IEEE , 2020, p. 2725-2731Conference paper (Refereed)
    Abstract [en]

    Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing domain adaptation in simulation and analyzing the structure of the latent space during adaptation. We then deploy this policy on a KUKA LBR 4+ robot and evaluate its performance on a task of hitting a hockey puck to a target. Our method shows more consistent and stable domain adaptation than the baseline, resulting in better overall performance.

  • 26.
    Arnekvist, Isac
    KTH, School of Computer Science and Communication (CSC).
    Reinforcement learning for robotic manipulation2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Reinforcement learning was recently successfully used for real-world robotic manipulation tasks, without the need for human demonstration, usinga normalized advantage function-algorithm (NAF). Limitations on the shape of the advantage function however poses doubts to what kind of policies can be learned using this method. For similar tasks, convolutional neural networks have been used for pose estimation from images taken with fixed position cameras. For some applications however, this might not be a valid assumption. It was also shown that the quality of policies for robotic tasks severely deteriorates from small camera offsets. This thesis investigates the use of NAF for a pushing task with clear multimodal properties. The results are compared with using a deterministic policy with minimal constraints on the Q-function surface. Methods for pose estimation using convolutional neural networks are further investigated, especially with regards to randomly placed cameras with unknown offsets. By defining the coordinate frame of objects with respect to some visible feature, it is hypothesized that relative pose estimation can be accomplished even when the camera is not fixed and the offset is unknown. NAF is successfully implemented to solve a simple reaching task on a real robotic system where data collection is distributed over several robots, and learning is done on a separate server. Using NAF to learn a pushing task fails to converge to a good policy, both on the real robots and in simulation. Deep deterministic policy gradient (DDPG) is instead used in simulation and successfully learns to solve the task. The learned policy is then applied on the real robots and accomplishes to solve the task in the real setting as well. Pose estimation from fixed position camera images is learned and the policy is still able to solve the task using these estimates. By defining a coordinate frame from an object visible to the camera, in this case the robot arm, a neural network learns to regress the pushable objects pose in this frame without the assumption of a fixed camera. However, the precision of the predictions were too inaccurate to be used for solving the pushing task. Further modifications to this approach could however show to be a feasible solution to randomly placed cameras with unknown poses.

    Download full text (pdf)
    fulltext
  • 27.
    Arnekvist, Isac
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Kragic, Danica
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Stork, Johannes A.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. Center for Applied Autonomous Sensor Systems, Örebro University, Sweden.
    Vpe: Variational policy embedding for transfer reinforcement learning2019In: 2019 International Conference on Robotics And Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), 2019, p. 36-42Conference paper (Refereed)
    Abstract [en]

    Reinforcement Learning methods are capable of solving complex problems, but resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but on the other hand suffer from a reality-gap when applied in real-world settings. This raises the need of efficient adaptation of policies acting in new environments. We consider the problem of transferring knowledge within a family of similar Markov decision processes. We assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that can adapt given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space, and master policy found by our method enables policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation, and for simulation-to-real transfer on a pushing task.

    Download full text (pdf)
    fulltext
  • 28.
    Arriola-Rios, Veronica E.
    et al.
    Univ Nacl Autonoma Mexico, UNAM, Fac Sci, Dept Math, Mexico City, DF, Mexico..
    Guler, Puren
    Örebro Univ, Ctr Appl Autonomous Sensor Syst, Autonomous Mobile Manipulat Lab, Örebro, Sweden..
    Ficuciello, Fanny
    Univ Naples Federico II, PRISMA Lab, Dept Elect Engn & Informat Technol, Naples, Italy..
    Kragic, Danica
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Centre for Autonomous Systems, CAS.
    Siciliano, Bruno
    Univ Naples Federico II, PRISMA Lab, Dept Elect Engn & Informat Technol, Naples, Italy..
    Wyatt, Jeremy L.
    Univ Birmingham, Sch Comp Sci, Birmingham, W Midlands, England..
    Modeling of Deformable Objects for Robotic Manipulation: A Tutorial and Review2020In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 7, article id 82Article, review/survey (Refereed)
    Abstract [en]

    Manipulation of deformable objects has given rise to an important set of open problems in the field of robotics. Application areas include robotic surgery, household robotics, manufacturing, logistics, and agriculture, to name a few. Related research problems span modeling and estimation of an object's shape, estimation of an object's material properties, such as elasticity and plasticity, object tracking and state estimation during manipulation, and manipulation planning and control. In this survey article, we start by providing a tutorial on foundational aspects of models of shape and shape dynamics. We then use this as the basis for a review of existing work on learning and estimation of these models and on motion planning and control to achieve desired deformations. We also discuss potential future lines of work.

  • 29. Aslund, Magnus
    et al.
    Fredenberg, Erik
    KTH, School of Engineering Sciences (SCI), Physics, Physics of Medical Imaging.
    Telman, M.
    Danielsson, Mats
    KTH, School of Engineering Sciences (SCI), Physics, Physics of Medical Imaging.
    Detectors for the future of X-ray imaging2010In: Radiation Protection Dosimetry, ISSN 0144-8420, E-ISSN 1742-3406, Vol. 139, no 1-3, p. 327-333Article in journal (Refereed)
    Abstract [en]

    In recent decades, developments in detectors for X-ray imaging have improved dose efficiency. This has been accomplished with for example, structured scintillators such as columnar CsI, or with direct detectors where the X rays are converted to electric charge carriers in a semiconductor. Scattered radiation remains a major noise source, and fairly inefficient anti-scatter grids are still a gold standard. Hence, any future development should include improved scatter rejection. In recent years, photon-counting detectors have generated significant interest by several companies as well as academic research groups. This method eliminates electronic noise, which is an advantage in low-dose applications. Moreover, energy-sensitive photon-counting detectors allow for further improvements by optimising the signal-to-quantum-noise ratio, anatomical background subtraction or quantitative analysis of object constituents. This paper reviews state-of-the-art photon-counting detectors, scatter control and their application in diagnostic X-ray medical imaging. In particular, spectral imaging with photon-counting detectors, pitfalls such as charge sharing and high rates and various proposals for mitigation are discussed.

    Download full text (pdf)
    fulltext
  • 30.
    Astaraki, Mehdi
    et al.
    KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Medical Imaging. Karolinska Inst, Dept Oncol Pathol, Stockholm, Sweden..
    Yang, Guang
    Royal Brompton Hosp, Cardiovasc Res Ctr, London, England.;Imperial Coll London, Natl Heart & Lung Inst, London, England..
    Zakko, Yousuf
    Karolinska Univ Hosp, Dept Radiol Imaging & Funct, Solna, Sweden..
    Toma-Dasu, Iuliana
    Karolinska Inst, Dept Oncol Pathol, Stockholm, Sweden.;Stockholm Univ, Dept Phys, Stockholm, Sweden..
    Smedby, Örjan
    KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Medical Imaging.
    Wang, Chunliang
    KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Medical Imaging.
    A Comparative Study of Radiomics and Deep-Learning Based Methods for Pulmonary Nodule Malignancy Prediction in Low Dose CT Images2021In: Frontiers in Oncology, E-ISSN 2234-943X, Vol. 11, article id 737368Article in journal (Refereed)
    Abstract [en]

    ObjectivesBoth radiomics and deep learning methods have shown great promise in predicting lesion malignancy in various image-based oncology studies. However, it is still unclear which method to choose for a specific clinical problem given the access to the same amount of training data. In this study, we try to compare the performance of a series of carefully selected conventional radiomics methods, end-to-end deep learning models, and deep-feature based radiomics pipelines for pulmonary nodule malignancy prediction on an open database that consists of 1297 manually delineated lung nodules. MethodsConventional radiomics analysis was conducted by extracting standard handcrafted features from target nodule images. Several end-to-end deep classifier networks, including VGG, ResNet, DenseNet, and EfficientNet were employed to identify lung nodule malignancy as well. In addition to the baseline implementations, we also investigated the importance of feature selection and class balancing, as well as separating the features learned in the nodule target region and the background/context region. By pooling the radiomics and deep features together in a hybrid feature set, we investigated the compatibility of these two sets with respect to malignancy prediction. ResultsThe best baseline conventional radiomics model, deep learning model, and deep-feature based radiomics model achieved AUROC values (mean +/- standard deviations) of 0.792 +/- 0.025, 0.801 +/- 0.018, and 0.817 +/- 0.032, respectively through 5-fold cross-validation analyses. However, after trying out several optimization techniques, such as feature selection and data balancing, as well as adding context features, the corresponding best radiomics, end-to-end deep learning, and deep-feature based models achieved AUROC values of 0.921 +/- 0.010, 0.824 +/- 0.021, and 0.936 +/- 0.011, respectively. We achieved the best prediction accuracy from the hybrid feature set (AUROC: 0.938 +/- 0.010). ConclusionThe end-to-end deep-learning model outperforms conventional radiomics out of the box without much fine-tuning. On the other hand, fine-tuning the models lead to significant improvements in the prediction performance where the conventional and deep-feature based radiomics models achieved comparable results. The hybrid radiomics method seems to be the most promising model for lung nodule malignancy prediction in this comparative study.

  • 31.
    Aviles, Marcos
    et al.
    GMV, Spain.
    Siozios, Kostas
    School of ECE, National Technical University of Athens, Greece.
    Diamantopoulos, Dionysios
    School of ECE, National Technical University of Athens, Greece.
    Nalpantidis, Lazaros
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Kostavelis, Ioannis
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Boukas, Evangelos
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Soudris, Dimitrios
    School of ECE, National Technical University of Athens, Greece.
    Gasteratos, Antonios
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    A co-design methodology for implementing computer vision algorithms for rover navigation onto reconfigurable hardware2011In: Proceedings of the FPL2011 Workshop on Computer Vision on Low-Power Reconfigurable Architectures, 2011, p. 9-10Conference paper (Other academic)
    Abstract [en]

    Vision-based robotics applications have been widely studied in the last years. However, up to now solutions that have been proposed were affecting mostly software level. The SPARTAN project focuses in the tight and optimal implementation of computer vision algorithms targeting to rover navigation. For evaluation purposes, these algorithms will be implemented with a co-design methodology onto a Virtex-6 FPGA device.

  • 32.
    Axelsson, Nils
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Modelling Adaptive Presentations in Human-Robot Interaction using Behaviour Trees2019In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue: Proceedings of the Conference / [ed] Satoshi Nakamura, Stroudsburg, PA: Association for Computational Linguistics (ACL) , 2019, p. 345-352Conference paper (Refereed)
    Abstract [en]

    In dialogue, speakers continuously adapt their speech to accommodate the listener, based on the feedback they receive. In this paper, we explore the modelling of such behaviours in the context of a robot presenting a painting. A Behaviour Tree is used to organise the behaviour on different levels, and allow the robot to adapt its behaviour in real-time; the tree organises engagement, joint attention, turn-taking, feedback and incremental speech processing. An initial implementation of the model is presented, and the system is evaluated in a user study, where the adaptive robot presenter is compared to a non-adaptive version. The adaptive version is found to be more engaging by the users, although no effects are found on the retention of the presented material.

  • 33.
    Axelsson, Nils
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Using knowledge graphs and behaviour trees for feedback-aware presentation agents2020In: Proceedings of Intelligent Virtual Agents 2020, Association for Computing Machinery (ACM) , 2020Conference paper (Refereed)
    Abstract [en]

    In this paper, we address the problem of how an interactive agent (such as a robot) can present information to an audience and adaptthe presentation according to the feedback it receives. We extend a previous behaviour tree-based model to generate the presentation from a knowledge graph (Wikidata), which allows the agent to handle feedback incrementally, and adapt accordingly. Our main contribution is using this knowledge graph not just for generating the system’s dialogue, but also as the structure through which short-term user modelling happens. In an experiment using simulated users and third-party observers, we show that referring expressions generated by the system are rated more highly when they adapt to the type of feedback given by the user, and when they are based on previously grounded information as opposed to new information.

  • 34.
    Azizpour, Hossein
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Laptev, I.
    Object detection using strongly-supervised deformable part models2012In: Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I / [ed] Andrew Fitzgibbon, Svetlana Lazebnik, Pietro Perona, Yoichi Sato, Cordelia Schmid, Springer, 2012, no PART 1, p. 836-849Conference paper (Refereed)
    Abstract [en]

    Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of non-convex cost function. This paper investigates limitations of such an initialization and extends earlier methods using additional supervision. We explore strong supervision in terms of annotated object parts and use it to (i) improve model initialization, (ii) optimize model structure, and (iii) handle partial occlusions. Our method is able to deal with sub-optimal and incomplete annotations of object parts and is shown to benefit from semi-supervised learning setups where part-level annotation is provided for a fraction of positive examples only. Experimental results are reported for the detection of six animal classes in PASCAL VOC 2007 and 2010 datasets. We demonstrate significant improvements in detection performance compared to the LSVM [1] and the Poselet [3] object detectors.

  • 35.
    Azizpour, Hossein
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Razavian, Ali Sharif
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Sullivan, Josephine
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Maki, Atsuto
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    From Generic to Specific Deep Representations for Visual Recognition2015In: Proceedings of CVPR 2015, IEEE conference proceedings, 2015Conference paper (Refereed)
    Abstract [en]

    Evidence is mounting that ConvNets are the best representation learning method for recognition. In the common scenario, a ConvNet is trained on a large labeled dataset and the feed-forward units activation, at a certain layer of the network, is used as a generic representation of an input image. Recent studies have shown this form of representation to be astoundingly effective for a wide range of recognition tasks. This paper thoroughly investigates the transferability of such representations w.r.t. several factors. It includes parameters for training the network such as its architecture and parameters of feature extraction. We further show that different visual recognition tasks can be categorically ordered based on their distance from the source task. We then show interesting results indicating a clear correlation between the performance of tasks and their distance from the source task conditioned on proposed factors. Furthermore, by optimizing these factors, we achieve stateof-the-art performances on 16 visual recognition tasks.

    Download full text (pdf)
    fulltext
  • 36.
    Azizpour, Hossein
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Sharif Razavian, Ali
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Sullivan, Josephine
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Maki, Atsuto
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlssom, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Factors of Transferability for a Generic ConvNet Representation2016In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no 9, p. 1790-1802, article id 7328311Article in journal (Refereed)
    Abstract [en]

    Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward units activation of the trained network, at a certain layer of the network, is used as a generic representation of an input image for a task with relatively smaller training set (target). Recent studies have shown this form of representation transfer to be suitable for a wide range of target visual recognition tasks. This paper introduces and investigates several factors affecting the transferability of such representations. It includes parameters for training of the source ConvNet such as its architecture, distribution of the training data, etc. and also the parameters of feature extraction such as layer of the trained ConvNet, dimensionality reduction, etc. Then, by optimizing these factors, we show that significant improvements can be achieved on various (17) visual recognition tasks. We further show that these visual recognition tasks can be categorically ordered based on their similarity to the source task such that a correlation between the performance of tasks and their similarity to the source task w.r.t. the proposed factors is observed.

  • 37.
    Baisero, Andrea
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Pokorny, Florian T.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Kragic, Danica
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Ek, Carl Henrik
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    The Path Kernel2013In: ICPRAM 2013 - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods, 2013, p. 50-57Conference paper (Refereed)
    Abstract [en]

    Kernel methods have been used very successfully to classify data in various application domains. Traditionally, kernels have been constructed mainly for vectorial data defined on a specific vector space. Much less work has been addressing the development of kernel functions for non-vectorial data. In this paper, we present a new kernel for encoding sequential data. We present our results comparing the proposed kernel to the state of the art, showing a significant improvement in classification and a much improved robustness and interpretability.

  • 38.
    Baldassarre, Federico
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Structured Representations for Explainable Deep Learning2023Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Deep learning has revolutionized scientific research and is being used to take decisions in increasingly complex scenarios. With growing power comes a growing demand for transparency and interpretability. The field of Explainable AI aims to provide explanations for the predictions of AI systems. The state of the art of AI explainability, however, is far from satisfactory. For example, in Computer Vision, the most prominent post-hoc explanation methods produce pixel-wise heatmaps over the input domain, which are meant to visualize the importance of individual pixels of an image or video. We argue that such dense attribution maps are poorly interpretable to non-expert users because of the domain in which explanations are formed - we may recognize shapes in a heatmap but they are just blobs of pixels. In fact, the input domain is closer to the raw data of digital cameras than to the interpretable structures that humans use to communicate, e.g. objects or concepts. In this thesis, we propose to move beyond dense feature attributions by adopting structured internal representations as a more interpretable explanation domain. Conceptually, our approach splits a Deep Learning model in two: the perception step that takes as input dense representations and the reasoning step that learns to perform the task at hand. At the interface between the two are structured representations that correspond to well-defined objects, entities, and concepts. These representations serve as the interpretable domain for explaining the predictions of the model, allowing us to move towards more meaningful and informative explanations. The proposed approach introduces several challenges, such as how to obtain structured representations, how to use them for downstream tasks, and how to evaluate the resulting explanations. The works included in this thesis address these questions, validating the approach and providing concrete contributions to the field. For the perception step, we investigate how to obtain structured representations from dense representations, whether by manually designing them using domain knowledge or by learning them from data without supervision. For the reasoning step, we investigate how to use structured representations for downstream tasks, from Biology to Computer Vision, and how to evaluate the learned representations. For the explanation step, we investigate how to explain the predictions of models that operate in a structured domain, and how to evaluate the resulting explanations. Overall, we hope that this work inspires further research in Explainable AI and helps bridge the gap between high-performing Deep Learning models and the need for transparency and interpretability in real-world applications.

    Download full text (pdf)
    kappa
  • 39.
    Baldassarre, Federico
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Azizpour, Hossein
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Explainability Techniques for Graph Convolutional Networks2019Conference paper (Refereed)
    Abstract [en]

    Graph Networks are used to make decisions in potentially complex scenarios but it is usually not obvious how or why they made them. In this work, we study the explainability of Graph Network decisions using two main classes of techniques, gradient-based and decomposition-based, on a toy dataset and a chemistry task. Our study sets the ground for future development as well as application to real-world problems.

    Download full text (pdf)
    fulltext
  • 40.
    Baldassarre, Federico
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Azizpour, Hossein
    Towards Self-Supervised Learning of Global and Object-Centric Representations2022Conference paper (Refereed)
    Abstract [en]

    Self-supervision allows learning meaningful representations of natural images, which usually contain one central object. How well does it transfer to multi-entity scenes? We discuss key aspects of learning structured object-centric representations with self-supervision and validate our insights through several experiments on the CLEVR dataset. Regarding the architecture, we confirm the importance of competition for attention-based object discovery, where each image patch is exclusively attended by one object. For training, we show that contrastive losses equipped with matching can be applied directly in a latent space, avoiding pixel-based reconstruction. However, such an optimization objective is sensitive to false negatives (recurring objects) and false positives (matching errors). Careful consideration is thus required around data augmentation and negative sample selection.

  • 41.
    Baldassarre, Federico
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Debard, Quentin
    Pontiveros, Gonzalo Fiz
    Wijaya, Tri Kurniawan
    Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors2022Conference paper (Refereed)
    Abstract [en]

    The proliferation of DeepFake technology is a rising challenge in today’s society, owing to more powerful and accessible generation methods. To counter this, the research community has developed detectors of ever-increasing accuracy. However, the ability to explain the decisions of such models to users lags behind performance and is considered an accessory in large-scale benchmarks, despite being a crucial requirement for the correct deployment of automated tools for moderation and censorship. We attribute the issue to the reliance on qualitative comparisons and the lack of established metrics. We describe a simple set of metrics to evaluate the visual quality and informativeness of explanations of video DeepFake classifiers from a human-centric perspective. With these metrics, we compare common approaches to improve explanation quality and discuss their effect on both classification and explanation performance on the recent DFDC and DFD datasets.

  • 42.
    Baldassarre, Federico
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    El-Nouby, Alaaeldin
    Jégou, Hervé
    Variable Rate Allocation for Vector-Quantized Autoencoders2023In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Institute of Electrical and Electronics Engineers (IEEE) , 2023Conference paper (Refereed)
    Abstract [en]

    Vector-quantized autoencoders have recently gained interest in image compression, generation and self-supervised learning. However, as a neural compression method, they lack the possibility to allocate a variable number of bits to each image location, e.g. according to the semantic content or local saliency. In this paper, we address this limitation in a simple yet effective way. We adopt a product quantizer (PQ) that produces a set of discrete codes for each image patch rather than a single index. This PQ-autoencoder is trained end-to-end with a structured dropout that selectively masks a variable number of codes at each location. These mechanisms force the decoder to reconstruct the original image based on partial information and allow us to control the local rate. The resulting model can compress images on a wide range of operating points of the rate-distortion curve and can be paired with any external method for saliency estimation to control the compression rate at a local level. We demonstrate the effectiveness of our approach on the popular Kodak and ImageNet datasets by measuring both distortion and perceptual quality metrics.

  • 43.
    Barbosa, Fernando S.
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Lacerda, Bruno
    Univ Oxford, Oxford Robot Inst, Oxford, England..
    Duckworth, Paul
    Univ Oxford, Oxford Robot Inst, Oxford, England..
    Tumova, Jana
    KTH, School of Electrical Engineering and Computer Science (EECS), Centres, ACCESS Linnaeus Centre. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
    Hawes, Nick
    Univ Oxford, Oxford Robot Inst, Oxford, England..
    Risk-Aware Motion Planning in Partially Known Environments2021In: 2021 60th IEEE  conference on decision and control (CDC), Institute of Electrical and Electronics Engineers (IEEE) , 2021, p. 5220-5226Conference paper (Refereed)
    Abstract [en]

    Recent trends envisage robots being deployed in areas deemed dangerous to humans, such as buildings with gas and radiation leaks. In such situations, the model of the underlying hazardous process might be unknown to the agent a priori, giving rise to the problem of planning for safe behaviour in partially known environments. We employ Gaussian process regression to create a probabilistic model of the hazardous process from local noisy samples. The result of this regression is then used by a risk metric, such as the Conditional Value-at-Risk, to reason about the safety at a certain state. The outcome is a risk function that can be employed in optimal motion planning problems. We demonstrate the use of the proposed function in two approaches. First is a sampling-based motion planning algorithm with an event-based trigger for online replanning. Second is an adaptation to the incremental Gaussian Process motion planner (iGPMP2), allowing it to quickly react and adapt to the environment. Both algorithms are evaluated in representative simulation scenarios, where they demonstrate the ability of avoiding high-risk areas.

  • 44. Barekatain, Mohammadamin
    et al.
    Martí Rabadán, Miquel
    KTH, School of Computer Science and Communication (CSC). Polytechnic University of Catalonia, Barcelona.
    Shih, Hsueh-Fu
    Murray, Samuel
    KTH, School of Computer Science and Communication (CSC), Robotics, perception and learning, RPL.
    Nakayama, Kotaro
    Matsuo, Yutaka
    Prendinger, Helmut
    Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection2017In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Institute of Electrical and Electronics Engineers (IEEE) , 2017, p. 2153-2160Conference paper (Refereed)
    Abstract [en]

    Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.

  • 45. Baroffio, L.
    et al.
    Cesana, M.
    Redondi, A.
    Tagliasacchi, M.
    Ascenso, J.
    Monteiro, P.
    Eriksson, Emil
    KTH, School of Electrical Engineering (EES), Communication Networks.
    Dan, G.
    Fodor, Viktoria
    KTH, School of Electrical Engineering (EES), Communication Networks.
    GreenEyes: Networked energy-aware visual analysis2015In: 2015 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2015, IEEE conference proceedings, 2015Conference paper (Refereed)
    Abstract [en]

    The GreenEyes project aims at developing a comprehensive set of new methodologies, practical algorithms and protocols, to empower wireless sensor networks with vision capabilities. The key tenet of this research is that most visual analysis tasks can be carried out based on a succinct representation of the image, which entails both global and local features, while it disregards the underlying pixel-level representation. Specifically, GreenEyes will pursue the following goals: i) energy-constrained extraction of visual features; ii) rate-efficiency modelling and coding of visual feature; iii) networking streams of visual features. This will have a significant impact on several scenarios including, e.g., smart cities and environmental monitoring.

  • 46. Baudoin, Y.
    et al.
    Doroftei, D.
    De Cubber, G.
    Berrabah, S. A.
    Pinzon, C.
    Warlet, F.
    Gancet, J.
    Motard, E.
    Ilzkovitz, M.
    Nalpantidis, Lazaros
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Gasteratos, Antonios
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    View-finder: Robotics assistance to fire-fighting services and crisis management2009In: Safety, Security & Rescue Robotics (SSRR), 2009 IEEE International Workshop on, 2009, p. 1-6Conference paper (Refereed)
    Abstract [en]

    In the event of an emergency due to a fire or other crisis, a necessary but time consuming pre-requisite, that could delay the real rescue operation, is to establish whether the ground or area can be entered safely by human emergency workers. The objective of the VIEW-FINDER project is to develop robots which have the primary task of gathering data. The robots are equipped with sensors that detect the presence of chemicals and, in parallel, image data is collected and forwarded to an advanced Control station (COC). The robots will be equipped with a wide array of chemical sensors, on-board cameras, Laser and other sensors to enhance scene understanding and reconstruction. At the Base Station (BS) the data is processed and combined with geographical information originating from a web of sources; thus providing the personnel leading the operation with in-situ processed data that can improve decision making. This paper will focus on the Crisis Management Information System that has been developed for improving a Disaster Management Action Plan and for linking the Control Station with a out-site Crisis Management Centre, and on the software tools implemented on the mobile robot gathering data in the outdoor area of the crisis.

  • 47.
    Bauer, Stefan
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control).
    Redmond, Stephen J.
    University College Dublin, University College Dublin.
    et al.,
    Real Robot Challenge: A Robotics Competition in the Cloud2022In: Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, ML Research Press , 2022, p. 190-204Conference paper (Refereed)
    Abstract [en]

    Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at the MPI-IS1 and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

  • 48.
    Bechlioulis, Charalampos P.
    et al.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece..
    Heshmati-alamdari, Shahab
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Karras, George C.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece..
    Kyriakopoulos, Kostas J.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece..
    Robust Image-Based Visual Servoing With Prescribed Performance Under Field of View Constraints2019In: IEEE Transactions on robotics, ISSN 1552-3098, E-ISSN 1941-0468, Vol. 35, no 4, p. 1063-1070Article in journal (Refereed)
    Abstract [en]

    In this paper, we propose a visual servoing scheme that imposes predefined performance specifications on the image feature coordinate errors and satisfies the visibility constraints that inherently arise owing to the camera's limited field of view, despite the inevitable calibration and depth measurement errors. Its efficiency is demonstrated via comparative experimental and simulation studies.

  • 49.
    Bekiroglu, Yasemin
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Detry, Renaud
    Kragic, Danica
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Grasp Stability from Vision and Touch2012Conference paper (Refereed)
  • 50.
    Bergholm, Fredrik
    et al.
    KTH, School of Computer Science and Communication (CSC), Numerical Analysis and Computer Science, NADA.
    Adler, Jeremy
    Parmryd, Ingela
    Analysis of Bias in the Apparent Correlation Coefficient Between Image Pairs Corrupted by Severe Noise2010In: Journal of Mathematical Imaging and Vision, ISSN 0924-9907, E-ISSN 1573-7683, Vol. 37, no 3, p. 204-219Article in journal (Refereed)
    Abstract [en]

    The correlation coefficient r is a measure of similarity used to compare regions of interest in image pairs. In fluorescence microscopy there is a basic tradeoff between the degree of image noise and the frequency with which images can be acquired and therefore the ability to follow dynamic events. The correlation coefficient r is commonly used in fluorescence microscopy for colocalization measurements, when the relative distributions of two fluorophores are of interest. Unfortunately, r is known to be biased understating the true correlation when noise is present. A better measure of correlation is needed. This article analyses the expected value of r and comes up with a procedure for evaluating the bias of r, expected value formulas. A Taylor series of so-called invariant factors is analyzed in detail. These formulas indicate ways to correct r and thereby obtain a corrected value free from the influence of noise that is on average accurate (unbiased). One possible correction is the attenuated corrected correlation coefficient R, introduced heuristically by Spearman (in Am. J. Psychol. 15:72-101, 1904). An ideal correction formula in terms of expected values is derived. For large samples R tends towards the ideal correction formula and the true noise-free correlation. Correlation measurements using simulation based on the types of noise found in fluorescence microscopy images illustrate both the power of the method and the variance of R. We conclude that the correction formula is valid and is particularly useful for making correct analyses from very noisy datasets.

1234567 1 - 50 of 785
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf