kth.se Publications: search results 1-50 of 805
  • 1. Abbeloos, W.
    et al.
    Caccamo, Sergio
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL.
    Ataer-Cansizoglu, E.
    Taguchi, Y.
    Feng, C.
    Lee, T. -Y
    Detecting and Grouping Identical Objects for Region Proposal and Classification (2017). In: 2017 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, IEEE Computer Society, 2017, Vol. 2017, pp. 501-502, article id 8014810. Conference paper (Refereed)
    Abstract [en]

    Often multiple instances of an object occur in the same scene, for example in a warehouse. Unsupervised multi-instance object discovery algorithms are able to detect and identify such objects. We use such an algorithm to provide object proposals to a convolutional neural network (CNN) based classifier. This results in fewer regions to evaluate, compared to traditional region proposal algorithms. Additionally, it enables using the joint probability of multiple instances of an object, resulting in improved classification accuracy. The proposed technique can also split a single class into multiple sub-classes corresponding to the different object types, enabling hierarchical classification.
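
    The joint-probability idea mentioned above can be sketched as follows, assuming the grouped detections are conditionally independent views of the same object type; the combination rule and the toy scores are illustrative, not necessarily the paper's exact formulation.

    ```python
    import numpy as np

    def joint_class_probs(instance_probs):
        """Combine per-instance class posteriors for a group of identical objects.

        Treating the instances as independent observations of one object type,
        the joint log-probability of each class is the sum of per-instance
        log-probabilities; renormalizing yields a sharper group-level posterior.
        """
        log_p = np.log(np.asarray(instance_probs) + 1e-12).sum(axis=0)
        log_p -= log_p.max()  # subtract max for numerical stability
        p = np.exp(log_p)
        return p / p.sum()

    # Three detections of the same (unknown) object, per-class CNN scores:
    probs = [[0.5, 0.3, 0.2], [0.6, 0.25, 0.15], [0.55, 0.3, 0.15]]
    print(joint_class_probs(probs))  # group posterior concentrates on class 0
    ```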

  • 2. Abeywardena, D.
    et al.
    Wang, Zhan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Dissanayake, G.
    Waslander, S. L.
    Kodagoda, S.
    Model-aided state estimation for quadrotor micro air vehicles amidst wind disturbances (2014). Conference paper (Refereed)
    Abstract [en]

    This paper extends the recently developed Model-Aided Visual-Inertial Fusion (MA-VIF) technique for quadrotor Micro Air Vehicles (MAV) to deal with wind disturbances. The wind effects are explicitly modelled in the quadrotor dynamic equations, excluding the unobservable wind velocity component. This is achieved through a nonlinear observability analysis of the dynamic system with wind effects. We show that, using the developed model, the vehicle pose and two components of the wind velocity vector can be simultaneously estimated with a monocular camera and an inertial measurement unit. We also show that MA-VIF is reasonably tolerant to wind disturbances even without explicit modelling of wind effects, and explain the reasons for this behaviour. Experimental results using a Vicon motion capture system are presented to demonstrate the effectiveness of the proposed method and validate our claims.

  • 3.
    Abraham, Johannes
    et al.
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Hälsoinformatik och logistik.
    Romano, Robin
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Hälsoinformatik och logistik.
    Automatisk kvalitetssäkring av information för järnvägsanläggningar: Automatic quality assurance of information for railway infrastructure (2019). Independent thesis, first cycle (university diploma), 10 credits / 15 HE credits. Student thesis (Degree project)
    Abstract [sv]

    The railway industry currently faces major challenges, with planned infrastructure projects and maintenance of the existing railway. Growing expectations on the expansion of the future railway bring an increased risk of strain on the current network, and the downside of the expansion can be more cancelled journeys and delays. By taking advantage of technical innovations such as digitalization and automation, existing systems and work processes can be developed for more efficient management. Trafikverket (the Swedish Transport Administration) sets requirements on Building Information Models (BIM) in its procurements. At Sweco, design of signalling installations is carried out with the CAD program Promis.e, from which track information lists (BIS lists) containing information about object attributes can be exported. Trafikverket requires these attributes to follow a certain format or take specific values. This degree project investigates methods for automatically verifying whether objects from the design tool have permitted values, and implements one such method. The methods examined comprise the spreadsheet program Excel, the query language Structured Query Language (SQL), and the Extract, Transform and Load (ETL) process. After analysing the methods, the ETL process was chosen. The result was a program that automatically selects which type of BIS list is to be reviewed and verifies whether the attributes contain permitted values. To examine whether the cost of the programs would benefit the company beyond the quality assurance itself, an economic analysis was carried out. According to the calculations, the choice to automate the review could also be justified from an economic perspective.

    Download full text (pdf)
    Degree project
  • 4.
    Adler, Jonas
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Matematik (Avd.).
    Learned Iterative Reconstruction (2023). In: Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging: Mathematical Imaging and Vision, Springer Nature, 2023, pp. 751-771. Book chapter, part of anthology (Other academic)
    Abstract [en]

    Learned iterative reconstruction methods have recently emerged as a powerful tool for solving inverse problems. These deep learning techniques for image reconstruction achieve remarkable speed and accuracy by combining hard knowledge about the physics of the image formation process, represented by the forward operator, with soft knowledge about what the reconstructions should look like, represented by deep neural networks. A diverse set of such methods has been proposed, and this chapter seeks to give an overview of their similarities and differences, as well as discussing some of the commonly used methods to improve their performance.
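
    As a worked illustration of this combination, a common learned iterative scheme (a sketch of the general family surveyed, not necessarily the chapter's exact formulation) interleaves the physics-based gradient of the data fit with a trained update network:

    ```latex
    % Learned gradient-style iteration: A is the forward operator, g the data,
    % and \Lambda_{\theta_k} a trained CNN acting as the update rule.
    f_{k+1} = \Lambda_{\theta_k}\!\left(f_k,\;
              \nabla_f \tfrac{1}{2}\|A f_k - g\|_2^2\right),
    \qquad k = 0, \dots, K-1.
    ```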

  • 5.
    Adler, Jonas
    et al.
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Matematik (Avd.). Elekta Instrument AB, Stockholm, Sweden.
    Öktem, Ozan
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Matematik (Avd.).
    Learned Primal-Dual Reconstruction (2018). In: IEEE Transactions on Medical Imaging, ISSN 0278-0062, E-ISSN 1558-254X, Vol. 37, no. 6, pp. 1322-1332. Journal article (Refereed)
    Abstract [en]

    We propose the Learned Primal-Dual algorithm for tomographic reconstruction. The algorithm accounts for a (possibly non-linear) forward operator in a deep neural network by unrolling a proximal primal-dual optimization method, but where the proximal operators have been replaced with convolutional neural networks. The algorithm is trained end-to-end, working directly from raw measured data and it does not depend on any initial reconstruction such as filtered back-projection (FBP). We compare performance of the proposed method on low dose computed tomography reconstruction against FBP, total variation (TV), and deep learning based post-processing of FBP. For the Shepp-Logan phantom we obtain >6 dB peak signal to noise ratio improvement against all compared methods. For human phantoms the corresponding improvement is 6.6 dB over TV and 2.2 dB over learned post-processing along with a substantial improvement in the structural similarity index. Finally, our algorithm involves only ten forward-back-projection computations, making the method feasible for time critical clinical applications.
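
    Schematically, the unrolled iteration can be written as follows (component-selection details of the paper's algorithm are omitted in this sketch):

    ```latex
    % T is the forward operator, T^* its adjoint, g the measured data;
    % \Gamma_{\theta_i^d} and \Lambda_{\theta_i^p} are CNNs replacing the
    % proximal operators of a primal-dual optimization scheme.
    h_{i+1} = \Gamma_{\theta_i^{d}}\big(h_i,\; T(f_i),\; g\big), \qquad
    f_{i+1} = \Lambda_{\theta_i^{p}}\big(f_i,\; T^{*}(h_{i+1})\big),
    \qquad i = 0, \dots, I-1.
    ```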

  • 6.
    Aghazadeh, Omid
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Data Driven Visual Recognition (2014). Doctoral thesis, compilation (Other academic)
    Abstract [en]

    This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one which models categories and one which is not category based. We are interested in data driven solutions for both kinds of problems.

    In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that based on a few reference exemplars, our methods are able to detect novelties in ego-motions of people, and changes in the static environments surrounding them.

    In the category-level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines such as the same model initialized with less subtly designed procedures. A nonparametric large margin classifier is introduced and demonstrated to have a multitude of advantages in comparison to its competitors: better training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model.

    We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performances. Based on data-describing measures which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and correlate significantly to the observed recognition performances. Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate.

    We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods that are designed to tackle different visual recognition problems, but share one common characteristic: being data driven.

    Download full text (pdf)
    Thesis
  • 7.
    Aghazadeh, Omid
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Azizpour, Hossein
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Sullivan, Josephine
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Mixture component identification and learning for visual recognition (2012). In: Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part VI, Springer, 2012, pp. 115-128. Conference paper (Refereed)
    Abstract [en]

    The non-linear decision boundary between object and background classes - due to large intra-class variations - needs to be modelled by any classifier wishing to achieve good results. While a mixture of linear classifiers is capable of modelling this non-linearity, learning this mixture from weakly annotated data is non-trivial and is the paper's focus. Our approach is to identify the modes in the distribution of our positive examples by clustering, and to utilize this clustering in a latent SVM formulation to learn the mixture model. The clustering relies on a robust measure of visual similarity which suppresses uninformative clutter by using a novel representation based on the exemplar SVM. This subtle clustering of the data leads to learning better mixture models, as is demonstrated via extensive evaluations on Pascal VOC 2007. The final classifier, using a HOG representation of the global image patch, achieves performance comparable to the state-of-the-art while being more efficient at detection time.
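
    A toy sketch of the overall recipe (cluster the positives to identify mixture components, train one linear classifier per component, score with the max over components); plain k-means on synthetic 2-D data stands in for the paper's exemplar-SVM-based similarity:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    # Synthetic positives with two visual "modes", plus background negatives.
    pos = np.vstack([rng.normal(+2, 0.5, (50, 2)), rng.normal(-2, 0.5, (50, 2))])
    neg = rng.normal(0, 1.0, (100, 2))

    # 1) Identify mixture components by clustering the positive examples.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pos)

    # 2) Train one linear classifier per component (component vs. background).
    components = []
    for k in range(2):
        X = np.vstack([pos[labels == k], neg])
        y = np.r_[np.ones((labels == k).sum()), np.zeros(len(neg))]
        components.append(LinearSVC(C=1.0).fit(X, y))

    # 3) Score test points with the max over components (mixture of linear SVMs).
    x = np.array([[2.1, 1.9], [-1.8, -2.2], [0.0, 0.1]])
    print(np.max([c.decision_function(x) for c in components], axis=0))
    ```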

  • 8.
    Aghazadeh, Omid
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Large Scale, Large Margin Classification using Indefinite Similarity Measures. Manuscript (preprint) (Other academic)
  • 9.
    Aghazadeh, Omid
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Properties of Datasets Predict the Performance of Classifiers (2013). Manuscript (preprint) (Other academic)
  • 10.
    Aghazadeh, Omid
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Properties of Datasets Predict the Performance of Classifiers (2013). In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, British Machine Vision Association, BMVA, 2013. Conference paper (Refereed)
    Abstract [en]

    It has been shown that the performance of classifiers depends not only on the number of training samples, but also on the quality of the training set [10, 12]. The purpose of this paper is to 1) provide quantitative measures that determine the quality of the training set and 2) provide the relation between the test performance and the proposed measures. The measures are derived from pairwise affinities between training exemplars of the positive class, and they have a generative nature. We show that the performance of state-of-the-art methods on the test set can be reasonably predicted based on the values of the proposed measures on the training set. These measures open up a wide range of applications to the recognition community, enabling us to analyze the behavior of the learning algorithms w.r.t. the properties of the training data. This will in turn enable us to devise rules for the automatic selection of training data that maximize the quantified quality of the training set and thereby improve recognition performance.
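
    A minimal sketch of this kind of measure: aggregates of pairwise affinities among the positive training exemplars (the Gaussian affinity and the two summary statistics are illustrative stand-ins, not the paper's exact definitions):

    ```python
    import numpy as np

    def affinity_measures(X, sigma=1.0):
        """Summarize a positive training set by pairwise-affinity statistics."""
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
        A = np.exp(-d2 / (2 * sigma ** 2))                   # Gaussian affinities
        off_diag = A[~np.eye(len(X), dtype=bool)]
        return {"mean_affinity": off_diag.mean(),   # high => compact class
                "affinity_spread": off_diag.std()}  # high => multi-modal class

    X_easy = np.random.default_rng(0).normal(0, 0.3, (100, 16))
    X_hard = np.random.default_rng(1).normal(0, 2.0, (100, 16))
    print(affinity_measures(X_easy))
    print(affinity_measures(X_hard))
    ```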

  • 11.
    Aghazadeh, Omid
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Sullivan, Josephine
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Multi view registration for novelty/background separation (2012). In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, IEEE Computer Society, 2012, pp. 757-764. Conference paper (Refereed)
    Abstract [en]

    We propose a system for the automatic segmentation of novelties from the background in scenarios where multiple images of the same environment are available, e.g. obtained by wearable cameras. Our method finds the pixels in a query image corresponding to the underlying background environment by comparing it to reference images of the same scene. This is achieved despite the fact that all the images may have different viewpoints, significantly different illumination conditions, and contain different objects (cars, people, bicycles, etc.) occluding the background. We estimate the probability of each pixel in the query image belonging to the background by computing its appearance inconsistency with the multiple reference images. We then produce multiple segmentations of the query image using an iterated graph cuts algorithm, initializing from these estimated probabilities, and combine these segmentations to arrive at a final segmentation of the background. Detection of the background in turn highlights the novel pixels. We demonstrate the effectiveness of our approach on a challenging outdoor data set.

  • 12.
    Agrawal, Alekh
    et al.
    Microsoft Research.
    Kragic, Danica
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Wu, Cathy
    Massachusetts Institute of Technology.
    et al.
    The Second Annual Conference on Learning for Dynamics and Control: Editorial (2020). In: Proceedings of Machine Learning Research, ML Research Press, 2020, Vol. 120. Conference paper (Refereed)
  • 13.
    Ahlberg, Sofie
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Reglerteknik.
    Dimarogonas, Dimos V.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Reglerteknik. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, ACCESS Linnaeus Centre.
    Mixed-Initiative Control Synthesis: Estimating an Unknown Task Based on Human Control Input (2020). In: Proceedings of the 3rd IFAC Workshop on Cyber-Physical & Human Systems, 2020. Conference paper (Refereed)
    Abstract [en]

    In this paper we consider a mobile platform controlled by two entities: an autonomous agent and a human user. The human aims for the mobile platform to complete a task, which we will denote as the human task, and will impose a control input accordingly, while not being aware of any other tasks the system should or must execute. The autonomous agent will in turn plan its control input taking into consideration all safety requirements which must be met, some task which should be completed as much as possible (denoted as the robot task), as well as what it believes the human task is based on previous human control input. A framework for the autonomous agent and a mixed-initiative controller are designed to guarantee the satisfaction of the safety requirements while both the human and robot tasks are violated as little as possible. The framework includes an estimation algorithm for the human task which improves with each cycle, eventually converging to a task which is similar to the actual human task. Hence, the autonomous agent will eventually be able to find the optimal plan considering all tasks, and the human will have no need to interfere again. The process is illustrated with a simulated example.
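
    Mixed-initiative controllers of this kind are often written as a state-dependent blend of the autonomous and human inputs; the following schematic form is an assumption for illustration, not necessarily the paper's exact controller:

    ```latex
    % u_r: autonomous agent's input, u_h: human input. The gain \kappa(x)
    % vanishes near the boundary of the safe set, so the safety requirements
    % stay satisfied regardless of what the human commands.
    u(x, t) = u_r(x, t) + \kappa(x)\, u_h(t), \qquad 0 \le \kappa(x) \le 1.
    ```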

    Download full text (pdf)
    fulltext
  • 14.
    Al Hakim, Ezeddin
    KTH, Skolan för elektroteknik och datavetenskap (EECS).
    3D YOLO: End-to-End 3D Object Detection Using Point Clouds (2018). Independent thesis, second cycle (Master's), 20 credits / 30 HE credits. Student thesis (Degree project)
    Abstract [sv]

    For autonomous vehicles to perceive their surroundings well, modern sensors such as LiDAR and RADAR are used. These generate a large number of 3-dimensional data points known as point clouds. In the development of autonomous vehicles there is a great need to interpret LiDAR data and classify other road users. A large number of studies have addressed 2D object detection, which analyses images to detect vehicles, but we are interested in 3D object detection using only LiDAR data. We therefore introduce the model 3D YOLO, which builds on YOLO (You Only Look Once), one of the fastest state-of-the-art models for 2D object detection in images. 3D YOLO takes a point cloud as input and produces 3D boxes that mark the different objects and indicate each object's category. We trained and evaluated the model on the public KITTI training data. Our results show that 3D YOLO is faster than today's state-of-the-art LiDAR-based models while achieving high accuracy, which makes it a good candidate for use in autonomous vehicles.

    Download full text (pdf)
    fulltext
  • 15.
    Alexanderson, Simon
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Tal, musik och hörsel, TMH.
    O'Sullivan, Carol
    Neff, Michael
    Beskow, Jonas
    KTH, Skolan för datavetenskap och kommunikation (CSC), Tal, musik och hörsel, TMH.
    Mimebot—Investigating the Expressibility of Non-Verbal Communication Across Agent Embodiments (2017). In: ACM Transactions on Applied Perception, ISSN 1544-3558, E-ISSN 1544-3965, Vol. 14, no. 4, article id 24. Journal article (Refereed)
    Abstract [en]

    Unlike their human counterparts, artificial agents such as robots and game characters may be deployed with a large variety of face and body configurations. Some have articulated bodies but lack facial features, and others may be talking heads ending at the neck. Generally, they have many fewer degrees of freedom than humans through which they must express themselves, and there will inevitably be a filtering effect when mapping human motion onto the agent. In this article, we investigate filtering effects on three types of embodiments: (a) an agent with a body but no facial features, (b) an agent with a head only, and (c) an agent with a body and a face. We performed a full performance capture of a mime actor enacting short interactions varying the non-verbal expression along five dimensions (e.g., level of frustration and level of certainty) for each of the three embodiments. We performed a crowd-sourced evaluation experiment comparing the video of the actor to the video of an animated robot for the different embodiments and dimensions. Our findings suggest that the face is especially important to pinpoint emotional reactions but is also most volatile to filtering effects. The body motion, on the other hand, had more diverse interpretations but tended to preserve the interpretation after mapping and thus proved to be more resilient to filtering.

    Download full text (pdf)
    fulltext
  • 16.
    Aliabad, Fahime Arabi
    et al.
    Yazd Univ, Fac Nat Resources & Desert Studies, Dept Arid Land Management, Yazd 8915818411, Iran..
    Malamiri, Hamid Reza Ghafarian
    Yazd Univ, Dept Geog, Yazd 8915818411, Iran.;Delft Univ Technol, Dept Geosci & Engn, NL-2628 CD Delft, Netherlands..
    Shojaei, Saeed
    Univ Tehran, Fac Nat Resources, Dept Arid & Mt Reg Reclamat, Tehran 1417935840, Iran..
    Sarsangi, Alireza
    Univ Tehran, Fac Geog, Dept Remote Sensing & GIS, Tehran 1417935840, Iran..
    Ferreira, Carla Sofia Santos
    Stockholm Univ, Bolin Ctr Climate Res, Dept Phys Geog, S-10691 Stockholm, Sweden.;Polytech Inst Coimbra, Agr Sch Coimbra, Res Ctr Nat Resources Environm & Soc CERNAS, P-3045601 Coimbra, Portugal..
    Kalantari, Zahra
    KTH, Skolan för arkitektur och samhällsbyggnad (ABE), Hållbar utveckling, miljövetenskap och teknik, Vatten- och miljöteknik. Stockholm Univ, Bolin Ctr Climate Res, Dept Phys Geog, S-10691 Stockholm, Sweden..
    Investigating the Ability to Identify New Constructions in Urban Areas Using Images from Unmanned Aerial Vehicles, Google Earth, and Sentinel-2 (2022). In: Remote Sensing, E-ISSN 2072-4292, Vol. 14, no. 13, article id 3227. Journal article (Refereed)
    Abstract [en]

    One of the main problems in developing countries is unplanned urban growth and land use change. Timely identification of new constructions can be a good solution to mitigate some environmental and social problems. This study examined the possibility of identifying new constructions in urban areas using images from unmanned aerial vehicles (UAV), Google Earth and Sentinel-2. The accuracy of the land cover map obtained using these images was investigated using pixel-based processing methods (maximum likelihood, minimum distance, Mahalanobis, spectral angle mapping (SAM)) and object-based methods (Bayes, support vector machine (SVM), K-nearest-neighbor (KNN), decision tree, random forest). The use of DSM to increase the accuracy of classification of UAV images and the use of NDVI to identify vegetation in Sentinel-2 images were also investigated. The object-based KNN method was found to have the greatest accuracy in classifying UAV images (kappa coefficient = 0.93), and the use of DSM increased the classification accuracy by 4%. Evaluations of the accuracy of Google Earth images showed that KNN was also the best method for preparing a land cover map using these images (kappa coefficient = 0.83). The KNN and SVM methods showed the highest accuracy in preparing land cover maps using Sentinel-2 images (kappa coefficient = 0.87 and 0.85, respectively). The accuracy of classification was not increased when using NDVI due to the small percentage of vegetation cover in the study area. On examining the advantages and disadvantages of the different methods, a novel method for identifying new rural constructions was devised. This method uses only one UAV imaging per year to determine the exact position of urban areas with no constructions and then examines spectral changes in related Sentinel-2 pixels that might indicate new constructions in these areas. On-site observations confirmed the accuracy of this method.
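
    The kappa coefficients quoted above measure chance-corrected agreement between the classified map and the ground truth; a minimal example with scikit-learn (synthetic labels, not the study's data):

    ```python
    from sklearn.metrics import cohen_kappa_score

    # Ground-truth vs. predicted land-cover classes for a handful of pixels
    # (0 = built-up, 1 = bare soil, 2 = vegetation) -- illustrative only.
    y_true = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
    y_pred = [0, 0, 1, 2, 2, 2, 0, 1, 2, 1]
    print(cohen_kappa_score(y_true, y_pred))  # 1.0 = perfect, 0.0 = chance level
    ```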

  • 17.
    Al-Khamisi, Ardoan
    et al.
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Hälsoinformatik och logistik.
    El Khoury, Christian
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Hälsoinformatik och logistik.
    AI i rekryteringsprocessen: En studie om användningen av AI för CV-analys (AI in the recruitment process: A study of the use of AI for CV analysis) (2024). Independent thesis, first cycle (university diploma), 10 credits / 15 HE credits. Student thesis (Degree project)
    Abstract [sv]

    The study investigates which methods are most suitable for recruitment processes by examining three existing Artificial Intelligence (AI) tools and a prototype developed in-house. Previous studies have shown that AI can improve the recruitment process by increasing efficiency and reducing bias, but also that there are limits to how well AI can assess candidates' competencies. The goal is to determine the most effective AI solutions for matching qualified candidates to senior positions, and to identify opportunities for improvement in the speed, accuracy and quality of the recruitment process. The focus of this work is on analysing existing AI solutions in parallel with developing and testing a prototype. The prototype was designed to address shortcomings identified in the existing methods, such as keyword matching between the Curriculum Vitae (CV) and the job advertisement, a method limited in how well it can identify candidates' actual competencies and relevance for the job, which is explored in this study. The results show that AI currently plays a limited but growing role in recruitment processes, pointing to significant potential for AI to offer new solutions that can lead to fairer and more efficient recruitment processes in the future.

    Download full text (pdf)
    AI i rekryteringsprocessen
  • 18. Almansa, A.
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale selection (2000). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 9, no. 12, pp. 2027-2042. Journal article (Refereed)
    Abstract [en]

    This work presents two mechanisms for processing fingerprint images: shape-adapted smoothing based on second moment descriptors, and automatic scale selection based on normalized derivatives. The shape adaptation procedure adapts the smoothing operation to the local ridge structures, which allows interrupted ridges to be joined without destroying essential singularities such as branching points, and enforces continuity of their directional fields. The scale selection procedure estimates local ridge width and adapts the amount of smoothing to the local amount of noise. In addition, a ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model, and is used for spreading the results of shape adaptation into noisy areas. The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. The result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image. We propose that these general techniques should be of interest to developers of automatic fingerprint identification systems as well as in other applications of processing related types of imagery.
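
    The two mechanisms rest on standard scale-space quantities: a second moment matrix drives the shape adaptation, and gamma-normalized derivatives drive the scale selection. A sketch in common scale-space notation (not the paper's full derivation):

    ```latex
    % Second moment descriptor: L is the image smoothed at local scale t,
    % g(\cdot; s) a Gaussian window at integration scale s.
    \mu(x; t, s) = g(\cdot\,; s) * \big( \nabla L(x; t)\, \nabla L(x; t)^{T} \big)

    % Automatic scale selection: pick the scale that maximizes a
    % \gamma-normalized derivative response, e.g. the normalized Laplacian.
    \hat{t}(x) = \arg\max_{t} \big| t^{\gamma} \, \nabla^{2} L(x; t) \big|
    ```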

    Download full text (pdf)
    fulltext
  • 19. Almansa, Andrés
    et al.
    Lindeberg, Tony
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB.
    Enhancement of Fingerprint Images by Shape-Adapted Scale-Space Operators (1996). In: Gaussian Scale-Space Theory. Part I: Proceedings of PhD School on Scale-Space Theory (Copenhagen, Denmark) May 1996 / [ed] J. Sporring, M. Nielsen, L. Florack, and P. Johansen, Springer Science+Business Media B.V., 1996, pp. 21-30. Book chapter, part of anthology (Refereed)
    Abstract [en]

    This work presents a novel technique for preprocessing fingerprint images. The method is based on the measurements of second moment descriptors and shape adaptation of scale-space operators with automatic scale selection (Lindeberg 1994). This procedure, which has been successfully used in the context of shape-from-texture and shape from disparity gradients, has several advantages when applied to fingerprint image enhancement, as observed by (Weickert 1995). For example, it is capable of joining interrupted ridges, and enforces continuity of their directional fields.

    In this work, the aforementioned general ideas are applied and extended in the following ways: Two methods for estimating local ridge width are explored and tuned to the problem of fingerprint enhancement. A ridgeness measure is defined, which reflects how well the local image structure agrees with a qualitative ridge model. This information is used for guiding a scale-selection mechanism, and for spreading the results of shape adaptation into noisy areas.

    The combined approach makes it possible to resolve fine scale structures in clear areas while reducing the risk of enhancing noise in blurred or fragmented areas. To a large extent, the scheme has the desirable property of joining interrupted lines without destroying essential singularities such as branching points. Thus, the result is a reliable and adaptively detailed estimate of the ridge orientation field and ridge width, as well as a smoothed grey-level version of the input image.

    A detailed experimental evaluation is presented, including a comparison with other techniques. We propose that the techniques presented provide mechanisms of interest to developers of automatic fingerprint identification systems.

    Download full text (pdf)
    fulltext
  • 20.
    Ambrus, Rares
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS. KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL.
    Unsupervised construction of 4D semantic maps in a long-term autonomy scenario (2017). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Robots are operating for longer times and collecting much more data than just a few years ago. In this setting we are interested in exploring ways of modeling the environment, segmenting out areas of interest and keeping track of the segmentations over time, with the purpose of building 4D models (i.e. space and time) of the relevant parts of the environment.

    Our approach relies on repeatedly observing the environment and creating local maps at specific locations. The first question we address is how to choose where to build these local maps. Traditionally, an operator defines a set of waypoints on a pre-built map of the environment which the robot visits autonomously. Instead, we propose a method to automatically extract semantically meaningful regions from a point cloud representation of the environment. The resulting segmentation is purely geometric, and in the context of mobile robots operating in human environments, the semantic label associated with each segment (i.e. kitchen, office) can be of interest for a variety of applications. We therefore also look at how to obtain per-pixel semantic labels given the geometric segmentation, by fusing probabilistic distributions over scene and object types in a Conditional Random Field.

    For most robotic systems, the elements of interest in the environment are the ones which exhibit some dynamic properties (such as people, chairs, cups, etc.), and the ability to detect and segment such elements provides a very useful initial segmentation of the scene. We propose a method to iteratively build a static map from observations of the same scene acquired at different points in time. Dynamic elements are obtained by computing the difference between the static map and new observations. We address the problem of clustering together dynamic elements which correspond to the same physical object, observed at different points in time and in significantly different circumstances. To address some of the inherent limitations in the sensors used, we autonomously plan, navigate around and obtain additional views of the segmented dynamic elements. We look at methods of fusing the additional data and we show that both a combined point cloud model and a fused mesh representation can be used to more robustly recognize the dynamic object in future observations. In the case of the mesh representation, we also show how a Convolutional Neural Network can be trained for recognition by using mesh renderings.

    Finally, we present a number of methods to analyse the data acquired by the mobile robot autonomously and over extended time periods. First, we look at how the dynamic segmentations can be used to derive a probabilistic prior which can be used in the mapping process to further improve and reinforce the segmentation accuracy. We also investigate how to leverage spatial-temporal constraints in order to cluster dynamic elements observed at different points in time and under different circumstances. We show that by making a few simple assumptions we can increase the clustering accuracy even when the object appearance varies significantly between observations. The result of the clustering is a spatial-temporal footprint of the dynamic object, defining an area where the object is likely to be observed spatially as well as a set of time stamps corresponding to when the object was previously observed. Using this data, predictive models can be created and used to infer future times when the object is more likely to be observed. In an object search scenario, this model can be used to decrease the search time when looking for specific objects.

    Download full text (pdf)
    Rares_Ambrus_PhD_Thesis
  • 21.
    Ambrus, Rares
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Bore, Nils
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Folkesson, John
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS. KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL.
    Jensfelt, Patric
    KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS. KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL.
    Autonomous meshing, texturing and recognition of object models with a mobile robot (2017). Conference paper (Refereed)
    Abstract [en]

    We present a system for creating object models from RGB-D views acquired autonomously by a mobile robot. We create high-quality textured meshes of the objects by approximating the underlying geometry with a Poisson surface. Our system employs two optimization steps, first registering the views spatially based on image features, and second aligning the RGB images to maximize photometric consistency with respect to the reconstructed mesh. We show that the resulting models can be used robustly for recognition by training a Convolutional Neural Network (CNN) on images rendered from the reconstructed meshes. We perform experiments on data collected autonomously by a mobile robot both in controlled and uncontrolled scenarios. We compare quantitatively and qualitatively to previous work to validate our approach.

    Download full text (pdf)
    fulltext
  • 22.
    Ambrus, Rares
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Claici, Sebastian
    Wendt, Axel
    Automatic Room Segmentation From Unstructured 3-D Data of Indoor Environments (2017). In: IEEE Robotics and Automation Letters, E-ISSN 2377-3766, Vol. 2, no. 2, pp. 749-756. Journal article (Refereed)
    Abstract [en]

    We present an automatic approach for the task of reconstructing a 2-D floor plan from unstructured point clouds of building interiors. Our approach emphasizes accurate and robust detection of building structural elements and, unlike previous approaches, does not require prior knowledge of scanning device poses. The reconstruction task is formulated as a multiclass labeling problem that we approach using energy minimization. We use intuitive priors to define the costs for the energy minimization problem and rely on accurate wall and opening detection algorithms to ensure robustness. We provide detailed experimental evaluation results, both qualitative and quantitative, against state-of-the-art methods and labeled ground-truth data.
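
    The multiclass labeling formulation is typically a pairwise energy minimized over per-element labels; schematically (the concrete cost terms in the paper come from its wall and opening detectors):

    ```latex
    % x_i: label (room id / outside) of element i; D_i: data cost encoding the
    % intuitive priors; V_{ij}: smoothness cost between neighboring elements.
    E(\mathbf{x}) = \sum_{i} D_i(x_i)
                  + \lambda \sum_{(i,j) \in \mathcal{N}} V_{ij}(x_i, x_j)
    ```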

  • 23.
    Ambrus, Rares
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Ekekrantz, Johan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Folkesson, John
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Jensfelt, Patric
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Unsupervised learning of spatial-temporal models of objects in a long-term autonomy scenario (2015). In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2015, pp. 5678-5685. Conference paper (Refereed)
    Abstract [en]

    We present a novel method for clustering segmented dynamic parts of indoor RGB-D scenes across repeated observations by performing an analysis of their spatial-temporal distributions. We segment areas of interest in the scene using scene differencing for change detection. We extend the Meta-Room method and evaluate the performance on a complex dataset acquired autonomously by a mobile robot over a period of 30 days. We use an initial clustering method to group the segmented parts based on appearance and shape, and we further combine the clusters we obtain by analyzing their spatial-temporal behaviors. We show that using the spatial-temporal information further increases the matching accuracy.

  • 24.
    Ambrus, Rares
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Folkesson, John
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Jensfelt, Patric
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Centrum för Autonoma System, CAS.
    Unsupervised object segmentation through change detection in a long term autonomy scenario (2016). In: IEEE-RAS International Conference on Humanoid Robots, IEEE, 2016, pp. 1181-1187. Conference paper (Refereed)
    Abstract [en]

    In this work we address the problem of dynamic object segmentation in office environments. We make no prior assumptions on what is dynamic and static, and our reasoning is based on change detection between sparse and non-uniform observations of the scene. We model the static part of the environment, and we focus on improving the accuracy and quality of the segmented dynamic objects over long periods of time. We address the issue of adapting the static structure over time and incorporating new elements, for which we train and use a classifier whose output gives an indication of the dynamic nature of the segmented elements. We show that the proposed algorithms improve the accuracy and the rate of detection of dynamic objects by comparing with a labelled dataset.

  • 25.
    Antonova, Rika
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS.
    Transfer-Aware Kernels, Priors and Latent Spaces from Simulation to Real Robots (2020). Doctoral thesis, compilation (Other academic)
    Abstract [sv]

    Consider challenging sim-to-real scenarios that lack high-fidelity simulators and permit only 10-20 hardware trials. This work shows that even imprecise simulation can be useful in such cases, if it is used to construct transferable representations.

    The thesis first introduces an informed kernel that embeds the space of simulated trajectories into a low-dimensional space of latent paths. It uses a sequential variational autoencoder (sVAE) to handle large-scale training on simulated data. Its modular design allows rapid adaptation to the new domain when the kernel is used for Bayesian optimization (BO) on real hardware. The thesis and the included publications show that this method works for several different areas of robotics: locomotion and object manipulation. Furthermore, a variant of BO is introduced that guarantees recovery from negative transfer if corrupted kernels are used. An application to task-oriented grasping confirms the method's performance on hardware.

    As for parametric learning, simulators can serve as priors or regularizers. This work describes how simulation can be used to regularize a VAE decoder so as to tie the latent VAE space to the posterior distribution of the simulation parameters. With this, training on a small number of real trajectories can quickly shift the posterior to reflect reality. The included publication demonstrates that this approach can also help reinforcement learning (RL) quickly bridge the gap between simulation and reality for a manipulation task on hardware.

    A long-term vision is to create latent spaces without having to assume a specific simulation scenario. A first step is to learn general relations that hold for sequences of states in a set of related domains. This work introduces a unified mathematical formulation for learning independent analytic relations. The relations are learned from source domains and then used to structure the latent space during learning in the target domain. This formulation provides a more general, flexible and principled way of constructing the latent space. It formalizes the idea of learning independent relations without imposing restrictive assumptions or requiring domain-specific information. This work presents mathematical properties, concrete algorithms and an experimental evaluation of successful training and transfer of latent relations.
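
    A compact sketch of the informed-kernel idea from the first contribution: Bayesian optimization's kernel is evaluated in a low-dimensional latent space produced by an encoder trained on simulated trajectories. The encoder below is a stand-in stub; names and shapes are assumptions.

    ```python
    import numpy as np

    def encode(policy_params):
        """Stand-in for a trained sVAE encoder: maps controller parameters to
        latent coordinates of the trajectories they induce in simulation."""
        W = np.random.default_rng(0).normal(size=(2, policy_params.shape[-1]))
        return policy_params @ W.T  # pretend 2-D latent trajectory embedding

    def informed_kernel(xa, xb, lengthscale=1.0):
        """Squared-exponential kernel in the learned latent space, so that
        BO's notion of similarity reflects behaviour, not parameter distance."""
        za, zb = encode(xa), encode(xb)
        d2 = ((za[:, None, :] - zb[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale ** 2)

    X = np.random.default_rng(1).normal(size=(5, 10))  # 5 candidate controllers
    print(informed_kernel(X, X).shape)  # (5, 5) Gram matrix for the GP in BO
    ```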

    Download full text (pdf)
    phd_thesis_rika_antonova
  • 26.
    Arndt, Karol
    et al.
    Aalto Univ, Espoo, Finland..
    Hazara, Murtaza
    Aalto Univ, Espoo, Finland.;Katholieke Univ Leuven, Dept Mech Engn, Leuven, Belgium.;Flanders Make, Robot Core Lab, Lommel, Belgium..
    Ghadirzadeh, Ali
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL. Aalto Univ, Espoo, Finland.
    Kyrki, Ville
    Aalto Univ, Espoo, Finland..
    Meta Reinforcement Learning for Sim-to-real Domain Adaptation (2020). In: 2020 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2020, pp. 2725-2731. Conference paper (Refereed)
    Abstract [en]

    Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing domain adaptation in simulation and analyzing the structure of the latent space during adaptation. We then deploy this policy on a KUKA LBR 4+ robot and evaluate its performance on a task of hitting a hockey puck to a target. Our method shows more consistent and stable domain adaptation than the baseline, resulting in better overall performance.

  • 27.
    Arnekvist, Isac
    KTH, Skolan för datavetenskap och kommunikation (CSC).
    Reinforcement learning for robotic manipulation (2017). Independent thesis, second cycle (Master's), 20 credits / 30 HE credits. Student thesis (Degree project)
    Abstract [sv]

    Reinforcement learning has recently been used successfully to teach non-simulated robots tasks using a normalized advantage function (NAF) algorithm, without the use of human demonstrations. However, the restrictions this places on the function surfaces may prove problematic for generalization to other tasks. For pose estimation in similar settings, convolutional neural networks have been used with images from a camera in a fixed position. In some applications, however, the camera cannot be guaranteed to stay in a fixed position, and studies have shown that the quality of policies degrades severely when the camera is moved.

    This thesis investigates the use of NAF for learning a pushing task with clearly multimodal properties. The results are compared with those of a deterministic policy with minimal restrictions on the Q-function surface. Furthermore, the use of convolutional neural networks for pose estimation is investigated, in particular with respect to randomly placed cameras with unknown placement. By defining the coordinate frame of objects relative to a visible reference object, relative pose estimation should be possible even when the camera moves and the displacement is unknown. In this thesis, NAF is successfully applied to simpler problems where data collection is distributed over several robots and learning takes place on a central server. Applied to the pushing task, however, NAF fails, both when trained on real robots and in simulation. Deep deterministic policy gradient (DDPG) is instead applied to the problem and successfully learns to solve it in simulation; the learned policy is then successfully transferred to real robots. Pose estimation using a fixed camera is also implemented successfully. By defining a coordinate frame from an object with known position in the image, in this case the robot arm, the positions of other objects can be expressed in this coordinate frame using neural networks. The precision, however, proves too low for application on robots. The results nevertheless show that this method, with further extensions and modifications, could solve the problem.

    Download full text (pdf)
    fulltext
  • 28.
    Arnekvist, Isac
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Kragic, Danica
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Stork, Johannes A.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL. Center for Applied Autonomous Sensor Systems, Örebro University, Sweden.
    Vpe: Variational policy embedding for transfer reinforcement learning (2019). In: 2019 International Conference on Robotics and Automation (ICRA), Institute of Electrical and Electronics Engineers (IEEE), 2019, pp. 36-42. Conference paper (Refereed)
    Abstract [en]

    Reinforcement Learning methods are capable of solving complex problems, but resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary, and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but on the other hand suffers from a reality gap when applied in real-world settings. This raises the need for efficient adaptation of policies acting in new environments. We consider the problem of transferring knowledge within a family of similar Markov decision processes. We assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that can adapt given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space and master policy found by our method enable policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation, and for simulation-to-real transfer on a pushing task.
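
    A toy schematic of the adaptation step the abstract describes: with a learned mapping from a low-dimensional latent variable to behaviour, a new environment is handled by searching only over the latent space. Everything below (dynamics, policy form, search) is a stand-in for illustration, not the paper's method.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def master_policy(state, z):
        """Stand-in for the learned generative mapping: latent z selects the
        behaviour of a shared 'master' policy."""
        return np.tanh(state * z[0] + z[1])

    def rollout_return(env_gain, z, steps=20):
        """Toy episodic return in an environment parameterized by env_gain."""
        s, total = 1.0, 0.0
        for _ in range(steps):
            a = master_policy(s, z)
            s = env_gain * s - a  # unknown dynamics the policy must adapt to
            total -= s ** 2       # reward: keep the state near zero
        return total

    # New task: pick the best latent by searching the 2-D latent space only,
    # instead of retraining the full policy.
    candidates = rng.normal(size=(256, 2))
    best = max(candidates, key=lambda z: rollout_return(env_gain=1.3, z=z))
    print("adapted latent:", best)
    ```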

    Download full text (pdf)
    fulltext
  • 29.
    Arriola-Rios, Veronica E.
    et al.
    Univ Nacl Autonoma Mexico, UNAM, Fac Sci, Dept Math, Mexico City, DF, Mexico..
    Guler, Puren
    Örebro Univ, Ctr Appl Autonomous Sensor Syst, Autonomous Mobile Manipulat Lab, Örebro, Sweden..
    Ficuciello, Fanny
    Univ Naples Federico II, PRISMA Lab, Dept Elect Engn & Informat Technol, Naples, Italy..
    Kragic, Danica
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS.
    Siciliano, Bruno
    Univ Naples Federico II, PRISMA Lab, Dept Elect Engn & Informat Technol, Naples, Italy..
    Wyatt, Jeremy L.
    Univ Birmingham, Sch Comp Sci, Birmingham, W Midlands, England..
    Modeling of Deformable Objects for Robotic Manipulation: A Tutorial and Review (2020). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 7, article id 82. Review article (Refereed)
    Abstract [en]

    Manipulation of deformable objects has given rise to an important set of open problems in the field of robotics. Application areas include robotic surgery, household robotics, manufacturing, logistics, and agriculture, to name a few. Related research problems span modeling and estimation of an object's shape, estimation of an object's material properties, such as elasticity and plasticity, object tracking and state estimation during manipulation, and manipulation planning and control. In this survey article, we start by providing a tutorial on foundational aspects of models of shape and shape dynamics. We then use this as the basis for a review of existing work on learning and estimation of these models and on motion planning and control to achieve desired deformations. We also discuss potential future lines of work.

  • 30. Aslund, Magnus
    et al.
    Fredenberg, Erik
    KTH, Skolan för teknikvetenskap (SCI), Fysik, Medicinsk bildfysik.
    Telman, M.
    Danielsson, Mats
    KTH, Skolan för teknikvetenskap (SCI), Fysik, Medicinsk bildfysik.
    Detectors for the future of X-ray imaging (2010). In: Radiation Protection Dosimetry, ISSN 0144-8420, E-ISSN 1742-3406, Vol. 139, no. 1-3, pp. 327-333. Journal article (Refereed)
    Abstract [en]

    In recent decades, developments in detectors for X-ray imaging have improved dose efficiency. This has been accomplished with, for example, structured scintillators such as columnar CsI, or with direct detectors where the X-rays are converted to electric charge carriers in a semiconductor. Scattered radiation remains a major noise source, and fairly inefficient anti-scatter grids are still the gold standard. Hence, any future development should include improved scatter rejection. In recent years, photon-counting detectors have generated significant interest among several companies as well as academic research groups. This method eliminates electronic noise, which is an advantage in low-dose applications. Moreover, energy-sensitive photon-counting detectors allow for further improvements by optimising the signal-to-quantum-noise ratio, anatomical background subtraction or quantitative analysis of object constituents. This paper reviews state-of-the-art photon-counting detectors, scatter control and their application in diagnostic X-ray medical imaging. In particular, spectral imaging with photon-counting detectors, pitfalls such as charge sharing and high rates, and various proposals for mitigation are discussed.

    Download full text (pdf)
    fulltext
  • 31.
    Astaraki, Mehdi
    et al.
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Medicinsk avbildning. Karolinska Inst, Dept Oncol Pathol, Stockholm, Sweden..
    Yang, Guang
    Royal Brompton Hosp, Cardiovasc Res Ctr, London, England.;Imperial Coll London, Natl Heart & Lung Inst, London, England..
    Zakko, Yousuf
    Karolinska Univ Hosp, Dept Radiol Imaging & Funct, Solna, Sweden..
    Toma-Dasu, Iuliana
    Karolinska Inst, Dept Oncol Pathol, Stockholm, Sweden.;Stockholm Univ, Dept Phys, Stockholm, Sweden..
    Smedby, Örjan
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Medicinsk avbildning.
    Wang, Chunliang
    KTH, Skolan för kemi, bioteknologi och hälsa (CBH), Medicinteknik och hälsosystem, Medicinsk avbildning.
    A Comparative Study of Radiomics and Deep-Learning Based Methods for Pulmonary Nodule Malignancy Prediction in Low Dose CT Images (2021). In: Frontiers in Oncology, E-ISSN 2234-943X, Vol. 11, article id 737368. Journal article (Refereed)
    Abstract [en]

    Objectives: Both radiomics and deep learning methods have shown great promise in predicting lesion malignancy in various image-based oncology studies. However, it is still unclear which method to choose for a specific clinical problem given access to the same amount of training data. In this study, we compare the performance of a series of carefully selected conventional radiomics methods, end-to-end deep learning models, and deep-feature based radiomics pipelines for pulmonary nodule malignancy prediction on an open database consisting of 1297 manually delineated lung nodules. Methods: Conventional radiomics analysis was conducted by extracting standard handcrafted features from target nodule images. Several end-to-end deep classifier networks, including VGG, ResNet, DenseNet, and EfficientNet, were employed to identify lung nodule malignancy as well. In addition to the baseline implementations, we also investigated the importance of feature selection and class balancing, as well as separating the features learned in the nodule target region and the background/context region. By pooling the radiomics and deep features together in a hybrid feature set, we investigated the compatibility of these two sets with respect to malignancy prediction. Results: The best baseline conventional radiomics model, deep learning model, and deep-feature based radiomics model achieved AUROC values (mean +/- standard deviation) of 0.792 +/- 0.025, 0.801 +/- 0.018, and 0.817 +/- 0.032, respectively, through 5-fold cross-validation analyses. However, after trying out several optimization techniques, such as feature selection and data balancing, as well as adding context features, the corresponding best radiomics, end-to-end deep learning, and deep-feature based models achieved AUROC values of 0.921 +/- 0.010, 0.824 +/- 0.021, and 0.936 +/- 0.011, respectively. We achieved the best prediction accuracy from the hybrid feature set (AUROC: 0.938 +/- 0.010). Conclusion: The end-to-end deep-learning model outperforms conventional radiomics out of the box without much fine-tuning. On the other hand, fine-tuning the models leads to significant improvements in prediction performance, where the conventional and deep-feature based radiomics models achieved comparable results. The hybrid radiomics method seems to be the most promising model for lung nodule malignancy prediction in this comparative study.
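
    A minimal sketch of the hybrid-feature protocol described above: pool handcrafted radiomics features with deep features and score AUROC under 5-fold cross-validation. The data is synthetic and the classifier choice is an assumption.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 200
    radiomics = rng.normal(size=(n, 30))  # handcrafted features per nodule
    deep = rng.normal(size=(n, 64))       # CNN-derived features per nodule
    y = rng.integers(0, 2, size=n)        # benign (0) vs. malignant (1)

    hybrid = np.hstack([radiomics, deep])  # pooled hybrid feature set
    auc = cross_val_score(LogisticRegression(max_iter=1000), hybrid, y,
                          cv=5, scoring="roc_auc")
    print(f"AUROC: {auc.mean():.3f} +/- {auc.std():.3f}")
    ```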

  • 32.
    Aviles, Marcos
    et al.
    GMV, Spain.
    Siozios, Kostas
    School of ECE, National Technical University of Athens, Greece.
    Diamantopoulos, Dionysios
    School of ECE, National Technical University of Athens, Greece.
    Nalpantidis, Lazaros
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Kostavelis, Ioannis
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Boukas, Evangelos
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Soudris, Dimitrios
    School of ECE, National Technical University of Athens, Greece.
    Gasteratos, Antonios
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    A co-design methodology for implementing computer vision algorithms for rover navigation onto reconfigurable hardware (2011). In: Proceedings of the FPL2011 Workshop on Computer Vision on Low-Power Reconfigurable Architectures, 2011, pp. 9-10. Conference paper (Other academic)
    Abstract [en]

    Vision-based robotics applications have been widely studied in recent years. However, the solutions proposed so far have mostly addressed the software level. The SPARTAN project focuses on a tight and optimal implementation of computer vision algorithms targeting rover navigation. For evaluation purposes, these algorithms will be implemented with a co-design methodology onto a Virtex-6 FPGA device.

  • 33.
    Axelsson, Nils
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Tal, musik och hörsel, TMH.
    Skantze, Gabriel
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Tal, musik och hörsel, TMH.
    Modelling Adaptive Presentations in Human-Robot Interaction using Behaviour Trees (2019). In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue: Proceedings of the Conference / [ed] Satoshi Nakamura, Stroudsburg, PA: Association for Computational Linguistics (ACL), 2019, pp. 345-352. Conference paper (Refereed)
    Abstract [en]

    In dialogue, speakers continuously adapt their speech to accommodate the listener, based on the feedback they receive. In this paper, we explore the modelling of such behaviours in the context of a robot presenting a painting. A Behaviour Tree is used to organise the behaviour at different levels and allows the robot to adapt its behaviour in real time; the tree organises engagement, joint attention, turn-taking, feedback and incremental speech processing. An initial implementation of the model is presented, and the system is evaluated in a user study in which the adaptive robot presenter is compared to a non-adaptive version. The adaptive version was found to be more engaging by the users, although no effects were found on the retention of the presented material.
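
    As a flavour of how a behaviour tree can organise such adaptive behaviour, here is a minimal sketch: a fallback node retries with a simpler rephrasing when the (simulated) listener feedback is negative. The node types are standard behaviour-tree constructs; the presenter actions and the feedback signal are invented for illustration and do not reproduce the paper's system.

    SUCCESS, FAILURE = "SUCCESS", "FAILURE"

    class Sequence:
        # Ticks children in order; fails as soon as one child fails.
        def __init__(self, children): self.children = children
        def tick(self, blackboard):
            for child in self.children:
                if child.tick(blackboard) == FAILURE:
                    return FAILURE
            return SUCCESS

    class Fallback:
        # Ticks children in order; succeeds as soon as one child succeeds.
        def __init__(self, children): self.children = children
        def tick(self, blackboard):
            for child in self.children:
                if child.tick(blackboard) == SUCCESS:
                    return SUCCESS
            return FAILURE

    class Action:
        def __init__(self, fn): self.fn = fn
        def tick(self, blackboard): return self.fn(blackboard)

    def present_detail(bb):
        # Succeeds only if the (simulated) listener feedback is positive.
        print("Presenting detailed segment...")
        return SUCCESS if bb["feedback"] == "positive" else FAILURE

    def rephrase_simply(bb):
        print("Listener seems confused; rephrasing more simply.")
        return SUCCESS

    tree = Sequence([Fallback([Action(present_detail), Action(rephrase_simply)])])
    tree.tick({"feedback": "negative"})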

  • 34.
    Axelsson, Nils
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Tal, musik och hörsel, TMH.
    Skantze, Gabriel
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Tal, musik och hörsel, TMH.
    Using knowledge graphs and behaviour trees for feedback-aware presentation agents (2020). In: Proceedings of Intelligent Virtual Agents 2020, Association for Computing Machinery (ACM), 2020. Conference paper (Refereed)
    Abstract [en]

    In this paper, we address the problem of how an interactive agent (such as a robot) can present information to an audience and adapt the presentation according to the feedback it receives. We extend a previous behaviour-tree-based model to generate the presentation from a knowledge graph (Wikidata), which allows the agent to handle feedback incrementally and adapt accordingly. Our main contribution is using this knowledge graph not just for generating the system’s dialogue, but also as the structure through which short-term user modelling happens. In an experiment using simulated users and third-party observers, we show that referring expressions generated by the system are rated more highly when they adapt to the type of feedback given by the user, and when they are based on previously grounded information as opposed to new information.
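
    A toy sketch of the idea of using the knowledge graph for short-term user modelling: triples are marked as grounded once presented and acknowledged, and referring expressions then prefer grounded facts. The triples and the grounding rule are illustrative placeholders, not the Wikidata-based system of the paper.

    triples = [
        ("The Night Watch", "was_painted_by", "Rembrandt"),
        ("Rembrandt", "was_born_in", "Leiden"),
    ]
    grounded = set()

    def present(triple, feedback_positive):
        s, p, o = triple
        print(f"{s} {p.replace('_', ' ')} {o}.")
        if feedback_positive:              # acknowledged by the user -> grounded
            grounded.add(triple)

    def refer(entity):
        # Referring expressions reuse previously grounded facts about the entity.
        for s, p, o in grounded:
            if s == entity:
                return f"{entity} (which, as mentioned, {p.replace('_', ' ')} {o})"
        return entity

    present(triples[0], feedback_positive=True)
    print(refer("The Night Watch"), "is on display in Amsterdam.")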

  • 35.
    Azizpour, Hossein
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Laptev, I.
    Object detection using strongly-supervised deformable part models (2012). In: Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I / [ed] Andrew Fitzgibbon, Svetlana Lazebnik, Pietro Perona, Yoichi Sato, Cordelia Schmid, Springer, 2012, no. PART 1, pp. 836-849. Conference paper (Refereed)
    Abstract [en]

    Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of a non-convex cost function. This paper investigates the limitations of such an initialization and extends earlier methods using additional supervision. We explore strong supervision in terms of annotated object parts and use it to (i) improve model initialization, (ii) optimize model structure, and (iii) handle partial occlusions. Our method is able to deal with sub-optimal and incomplete annotations of object parts and is shown to benefit from semi-supervised learning setups where part-level annotation is provided for a fraction of positive examples only. Experimental results are reported for the detection of six animal classes in the PASCAL VOC 2007 and 2010 datasets. We demonstrate significant improvements in detection performance compared to the LSVM [1] and Poselet [3] object detectors.

  • 36.
    Azizpour, Hossein
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Razavian, Ali Sharif
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Sullivan, Josephine
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Maki, Atsuto
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlsson, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    From Generic to Specific Deep Representations for Visual Recognition (2015). In: Proceedings of CVPR 2015, IEEE conference proceedings, 2015. Conference paper (Refereed)
    Abstract [en]

    Evidence is mounting that ConvNets are the best representation learning method for recognition. In the common scenario, a ConvNet is trained on a large labeled dataset and the feed-forward unit activations at a certain layer of the network are used as a generic representation of an input image. Recent studies have shown this form of representation to be astoundingly effective for a wide range of recognition tasks. This paper thoroughly investigates the transferability of such representations with respect to several factors, including the parameters used for training the network, such as its architecture, and the parameters of feature extraction. We further show that different visual recognition tasks can be categorically ordered based on their distance from the source task. We then show interesting results indicating a clear correlation between the performance of tasks and their distance from the source task, conditioned on the proposed factors. Furthermore, by optimizing these factors, we achieve state-of-the-art performance on 16 visual recognition tasks.
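
    The representation-transfer scenario described above can be sketched in a few lines: freeze a pretrained ConvNet, read out the activations at the penultimate layer as a generic representation, and train a linear model on top. The backbone choice and the random stand-in data are illustrative, and the torchvision API shown here postdates the paper.

    import torch
    import torchvision.models as models

    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()     # expose the penultimate activations
    backbone.eval()

    with torch.no_grad():
        images = torch.randn(8, 3, 224, 224)   # stand-in for a target dataset
        features = backbone(images)            # (8, 512) generic representation

    # Linear classifier trained on the frozen representation.
    linear = torch.nn.Linear(features.shape[1], 10)
    logits = linear(features)
    print(logits.shape)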

    Download full text (pdf)
    fulltext
  • 37.
    Azizpour, Hossein
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Sharif Razavian, Ali
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Sullivan, Josephine
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Maki, Atsuto
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Carlssom, Stefan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Factors of Transferability for a Generic ConvNet Representation (2016). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no. 9, pp. 1790-1802, article id 7328311. Journal article (Refereed)
    Abstract [en]

    Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward unit activations of the trained network, at a certain layer, are used as a generic representation of an input image for a task with a relatively smaller training set (target). Recent studies have shown this form of representation transfer to be suitable for a wide range of target visual recognition tasks. This paper introduces and investigates several factors affecting the transferability of such representations. These include parameters of the source ConvNet training, such as its architecture and the distribution of the training data, as well as parameters of feature extraction, such as the layer of the trained ConvNet and dimensionality reduction. Then, by optimizing these factors, we show that significant improvements can be achieved on various (17) visual recognition tasks. We further show that these visual recognition tasks can be categorically ordered based on their similarity to the source task, such that a correlation is observed between the performance of tasks and their similarity to the source task with respect to the proposed factors.

  • 38.
    Baisero, Andrea
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Pokorny, Florian T.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Ek, Carl Henrik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    The Path Kernel (2013). In: ICPRAM 2013 - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods, 2013, pp. 50-57. Conference paper (Refereed)
    Abstract [en]

    Kernel methods have been used very successfully to classify data in various application domains. Traditionally, kernels have been constructed mainly for vectorial data defined on a specific vector space. Much less work has addressed the development of kernel functions for non-vectorial data. In this paper, we present a new kernel for encoding sequential data. We present results comparing the proposed kernel to the state of the art, showing a significant improvement in classification as well as much improved robustness and interpretability.
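
    Since the abstract does not specify the kernel's construction, the sketch below only shows the general pattern for kernels on sequential data: a dynamic-programming recursion (here a global-alignment-style kernel) that aggregates a ground kernel over symbol pairs. It is an illustration of the idea, not the Path Kernel itself.

    import numpy as np

    def ground_kernel(a, b, gamma=1.0):
        # Kernel between two individual symbols (here, scalars).
        return np.exp(-gamma * (a - b) ** 2)

    def sequence_kernel(x, y):
        # Global-alignment-style DP recursion over all monotone alignments.
        n, m = len(x), len(y)
        K = np.zeros((n + 1, m + 1))
        K[0, 0] = 1.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                K[i, j] = ground_kernel(x[i - 1], y[j - 1]) * (
                    K[i - 1, j] + K[i, j - 1] + K[i - 1, j - 1])
        return K[n, m]

    print(sequence_kernel([0.1, 0.5, 0.9], [0.1, 0.6, 1.0]))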

  • 39.
    Baldassarre, Federico
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Structured Representations for Explainable Deep Learning (2023). Doctoral thesis, compilation (Other academic)
    Abstract [sv]

    Deep Learning has revolutionized scientific research and is used to make decisions in increasingly complex scenarios. With growing power comes a growing demand for transparency and interpretability. The field of Explainable AI aims to provide explanations for the predictions of AI systems, but the performance of existing solutions for AI explainability is far from satisfactory. For example, in computer vision, the most prominent post-hoc explanation methods produce pixel-wise heatmaps intended to visualize how important individual pixels of an image or video are. We argue that such methods are hard to interpret because of the domain in which the explanations are formed: we may recognize shapes in a heatmap, but they are only pixels. Indeed, the input domain is closer to the raw data of digital cameras than to the structures humans use to communicate, e.g. objects or concepts. In this thesis, we propose to move beyond dense feature attributions by using structured internal representations as a more interpretable explanation domain. Conceptually, our approach splits a Deep Learning model in two: a perception stage that takes dense representations as input, and a reasoning stage that learns to perform the task. At the interface between the two are structured representations that correspond to well-defined objects, entities and concepts. These representations serve as the interpretable domain for explaining the model's predictions, allowing us to move towards more meaningful and informative explanations. The proposed approach introduces several challenges, such as how to create structured representations, how to use them for downstream tasks, and how to evaluate the resulting explanations. The research included in this thesis addresses these questions, validates the approach, and makes concrete contributions to the field. For the perception stage, we investigate how to obtain structured representations from dense representations, either by designing them manually using domain knowledge or by learning them from data without supervision. For the reasoning stage, we investigate how to use structured representations for downstream tasks, from biology to computer vision, and how to evaluate the learned representations. For the explanation stage, we investigate how to explain the predictions of models that operate in a structured domain, and how to evaluate the resulting explanations. Overall, we hope this work inspires further research in Explainable AI and helps bridge the gap between high-performing Deep Learning models and the need for transparency and interpretability in real-world applications.

    Download full text (pdf)
    kappa
  • 40.
    Baldassarre, Federico
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Azizpour, Hossein
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Explainability Techniques for Graph Convolutional Networks (2019). Conference paper (Refereed)
    Abstract [en]

    Graph Networks are used to make decisions in potentially complex scenarios, but it is usually not obvious how or why they made them. In this work, we study the explainability of Graph Network decisions using two main classes of techniques, gradient-based and decomposition-based, on a toy dataset and a chemistry task. Our study lays the groundwork for future development as well as for application to real-world problems.
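
    As an example of the gradient-based class of techniques studied, the sketch below scores node importance in a toy two-layer graph convolutional network by the gradient of the prediction with respect to the node features. The random graph, weights, and sum readout are placeholders, not the paper's models or datasets.

    import torch

    def gcn_layer(A_hat, X, W):
        # One graph convolution: normalized adjacency, linear map, ReLU.
        return torch.relu(A_hat @ X @ W)

    N, F = 5, 8
    A = torch.eye(N) + torch.rand(N, N).round()        # random graph + self-loops
    D_inv_sqrt = torch.diag(A.sum(1).pow(-0.5))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt                # normalized adjacency

    X = torch.rand(N, F, requires_grad=True)           # node features
    W1, W2 = torch.rand(F, 16), torch.rand(16, 1)

    out = gcn_layer(A_hat, X, W1) @ W2                 # per-node scores
    prediction = out.sum()                             # graph-level readout
    prediction.backward()

    node_saliency = X.grad.abs().sum(dim=1)            # one importance score per node
    print(node_saliency)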

    Download full text (pdf)
    fulltext
  • 41.
    Baldassarre, Federico
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Azizpour, Hossein
    Towards Self-Supervised Learning of Global and Object-Centric Representations (2022). Conference paper (Refereed)
    Abstract [en]

    Self-supervision allows learning meaningful representations of natural images, which usually contain one central object. How well does it transfer to multi-entity scenes? We discuss key aspects of learning structured object-centric representations with self-supervision and validate our insights through several experiments on the CLEVR dataset. Regarding the architecture, we confirm the importance of competition for attention-based object discovery, where each image patch is exclusively attended by one object. For training, we show that contrastive losses equipped with matching can be applied directly in a latent space, avoiding pixel-based reconstruction. However, such an optimization objective is sensitive to false negatives (recurring objects) and false positives (matching errors). Careful consideration is thus required around data augmentation and negative sample selection.
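
    A minimal sketch of a contrastive loss "equipped with matching" in latent space, as discussed above: object slots from two augmented views are paired with the Hungarian algorithm, and the matched pairs serve as positives in a cross-entropy (InfoNCE-style) objective. Shapes, the temperature, and the random slots are illustrative; matching errors here correspond to the false positives the abstract warns about.

    import torch
    from scipy.optimize import linear_sum_assignment

    K, D, temperature = 4, 16, 0.1
    slots_a = torch.nn.functional.normalize(torch.randn(K, D), dim=1)  # view A
    slots_b = torch.nn.functional.normalize(torch.randn(K, D), dim=1)  # view B

    sim = slots_a @ slots_b.T                        # cosine similarities
    row, col = linear_sum_assignment(-sim.numpy())   # match to maximize similarity

    logits = sim / temperature
    targets = torch.as_tensor(col)                   # matched slot index per row
    loss = torch.nn.functional.cross_entropy(logits, targets)
    print(loss.item())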

  • 42.
    Baldassarre, Federico
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Debard, Quentin
    Pontiveros, Gonzalo Fiz
    Wijaya, Tri Kurniawan
    Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors (2022). Conference paper (Refereed)
    Abstract [en]

    The proliferation of DeepFake technology is a rising challenge in today’s society, owing to more powerful and accessible generation methods. To counter this, the research community has developed detectors of ever-increasing accuracy. However, the ability to explain the decisions of such models to users lags behind performance and is considered an accessory in large-scale benchmarks, despite being a crucial requirement for the correct deployment of automated tools for moderation and censorship. We attribute the issue to the reliance on qualitative comparisons and the lack of established metrics. We describe a simple set of metrics to evaluate the visual quality and informativeness of explanations of video DeepFake classifiers from a human-centric perspective. With these metrics, we compare common approaches to improve explanation quality and discuss their effect on both classification and explanation performance on the recent DFDC and DFD datasets.

  • 43.
    Baldassarre, Federico
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    El-Nouby, Alaaeldin
    Jégou, Hervé
    Variable Rate Allocation for Vector-Quantized Autoencoders (2023). In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Institute of Electrical and Electronics Engineers (IEEE), 2023. Conference paper (Refereed)
    Abstract [en]

    Vector-quantized autoencoders have recently gained interest in image compression, generation and self-supervised learning. However, as a neural compression method, they lack the possibility to allocate a variable number of bits to each image location, e.g. according to the semantic content or local saliency. In this paper, we address this limitation in a simple yet effective way. We adopt a product quantizer (PQ) that produces a set of discrete codes for each image patch rather than a single index. This PQ-autoencoder is trained end-to-end with a structured dropout that selectively masks a variable number of codes at each location. These mechanisms force the decoder to reconstruct the original image based on partial information and allow us to control the local rate. The resulting model can compress images on a wide range of operating points of the rate-distortion curve and can be paired with any external method for saliency estimation to control the compression rate at a local level. We demonstrate the effectiveness of our approach on the popular Kodak and ImageNet datasets by measuring both distortion and perceptual quality metrics.
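
    The two mechanisms named in the abstract can be sketched as follows: a product quantizer assigns each patch embedding a set of discrete codes (one per sub-vector), and a structured dropout keeps only the first r codes at each location, which is what allows the local rate to vary. Dimensions, codebooks, and embeddings are illustrative placeholders, not the trained model.

    import torch

    n_patches, dim, n_sub, codebook_size = 64, 32, 4, 256
    sub_dim = dim // n_sub
    codebooks = torch.randn(n_sub, codebook_size, sub_dim)

    z = torch.randn(n_patches, dim).view(n_patches, n_sub, sub_dim)

    # Product quantization: nearest codeword for each sub-vector independently.
    dists = torch.cdist(z.transpose(0, 1), codebooks)   # (n_sub, n_patches, K)
    codes = dists.argmin(dim=-1)                        # discrete codes per location

    # Structured dropout: keep only the first r codes at each location,
    # with r varying per patch.
    r = torch.randint(1, n_sub + 1, (n_patches,))
    keep = torch.arange(n_sub).unsqueeze(0) < r.unsqueeze(1)   # (n_patches, n_sub)
    quantized = torch.stack([codebooks[s][codes[s]] for s in range(n_sub)], dim=1)
    quantized = quantized * keep.unsqueeze(-1)                 # mask dropped codes
    print(codes.shape, quantized.shape)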

  • 44.
    Barbosa, Fernando S.
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Lacerda, Bruno
    Univ Oxford, Oxford Robot Inst, Oxford, England.
    Duckworth, Paul
    Univ Oxford, Oxford Robot Inst, Oxford, England.
    Tumova, Jana
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, ACCESS Linnaeus Centre. KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Robotik, perception och lärande, RPL.
    Hawes, Nick
    Univ Oxford, Oxford Robot Inst, Oxford, England.
    Risk-Aware Motion Planning in Partially Known Environments (2021). In: 2021 60th IEEE Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers (IEEE), 2021, pp. 5220-5226. Conference paper (Refereed)
    Abstract [en]

    Recent trends envisage robots being deployed in areas deemed dangerous to humans, such as buildings with gas and radiation leaks. In such situations, the model of the underlying hazardous process might be unknown to the agent a priori, giving rise to the problem of planning for safe behaviour in partially known environments. We employ Gaussian process regression to create a probabilistic model of the hazardous process from local noisy samples. The result of this regression is then used by a risk metric, such as the Conditional Value-at-Risk, to reason about the safety at a certain state. The outcome is a risk function that can be employed in optimal motion planning problems. We demonstrate the use of the proposed function in two approaches. The first is a sampling-based motion planning algorithm with an event-based trigger for online replanning. The second is an adaptation of the incremental Gaussian Process motion planner (iGPMP2), allowing it to quickly react and adapt to the environment. Both algorithms are evaluated in representative simulation scenarios, where they demonstrate the ability to avoid high-risk areas.
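
    The risk function described above can be sketched with off-the-shelf tools: fit a Gaussian process to noisy local samples of the hazard, then evaluate the Conditional Value-at-Risk of the Gaussian predictive distribution at a query state. The toy data, kernel, and risk level alpha are illustrative, not the paper's experimental setup.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    X_train = np.array([[0.0], [1.0], [2.0], [3.0]])   # sampled locations
    y_train = np.array([0.1, 0.4, 1.2, 0.3])           # noisy hazard readings

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
    gp.fit(X_train, y_train)

    def cvar(x, alpha=0.9):
        # CVaR of a Gaussian: mu + sigma * phi(Phi^-1(alpha)) / (1 - alpha).
        mu, sigma = gp.predict(np.atleast_2d(x), return_std=True)
        return mu + sigma * norm.pdf(norm.ppf(alpha)) / (1.0 - alpha)

    print(cvar([1.5]))   # risk score usable as a cost in motion planning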

  • 45. Barekatain, Mohammadamin
    et al.
    Martí Rabadán, Miquel
    KTH, Skolan för datavetenskap och kommunikation (CSC). Polytechnic University of Catalonia, Barcelona.
    Shih, Hsueh-Fu
    Murray, Samuel
    KTH, Skolan för datavetenskap och kommunikation (CSC), Robotik, perception och lärande, RPL.
    Nakayama, Kotaro
    Matsuo, Yutaka
    Prendinger, Helmut
    Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection (2017). In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Institute of Electrical and Electronics Engineers (IEEE), 2017, pp. 2153-2160. Conference paper (Refereed)
    Abstract [en]

    Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt camera movement, as well as multi-labeled actors. As a result, our dataset is more challenging than existing ones, and will help push the field forward to enable real-world applications.

  • 46. Baroffio, L.
    et al.
    Cesana, M.
    Redondi, A.
    Tagliasacchi, M.
    Ascenso, J.
    Monteiro, P.
    Eriksson, Emil
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Dan, G.
    Fodor, Viktoria
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    GreenEyes: Networked energy-aware visual analysis (2015). In: 2015 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2015, IEEE conference proceedings, 2015. Conference paper (Refereed)
    Abstract [en]

    The GreenEyes project aims at developing a comprehensive set of new methodologies, practical algorithms, and protocols to empower wireless sensor networks with vision capabilities. The key tenet of this research is that most visual analysis tasks can be carried out based on a succinct representation of the image, which entails both global and local features, while disregarding the underlying pixel-level representation. Specifically, GreenEyes will pursue the following goals: i) energy-constrained extraction of visual features; ii) rate-efficiency modelling and coding of visual features; iii) networking streams of visual features. This will have a significant impact on several scenarios including, e.g., smart cities and environmental monitoring.

  • 47. Baudoin, Y.
    et al.
    Doroftei, D.
    De Cubber, G.
    Berrabah, S. A.
    Pinzon, C.
    Warlet, F.
    Gancet, J.
    Motard, E.
    Ilzkovitz, M.
    Nalpantidis, Lazaros
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    Gasteratos, Antonios
    Production and Management Engineering Dept., Democritus University of Thrace, Greece.
    View-finder: Robotics assistance to fire-fighting services and crisis management (2009). In: Safety, Security & Rescue Robotics (SSRR), 2009 IEEE International Workshop on, 2009, pp. 1-6. Conference paper (Refereed)
    Abstract [en]

    In the event of an emergency due to a fire or other crisis, a necessary but time-consuming prerequisite, which could delay the real rescue operation, is to establish whether the ground or area can be entered safely by human emergency workers. The objective of the VIEW-FINDER project is to develop robots whose primary task is to gather data. The robots are equipped with sensors that detect the presence of chemicals and, in parallel, image data is collected and forwarded to an advanced Control Station (COC). The robots will be equipped with a wide array of chemical sensors, on-board cameras, laser and other sensors to enhance scene understanding and reconstruction. At the Base Station (BS) the data is processed and combined with geographical information originating from a web of sources, thus providing the personnel leading the operation with in-situ processed data that can improve decision making. This paper focuses on the Crisis Management Information System that has been developed for improving a Disaster Management Action Plan and for linking the Control Station with an off-site Crisis Management Centre, and on the software tools implemented on the mobile robot gathering data in the outdoor area of the crisis.

  • 48.
    Bauer, Stefan
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Intelligenta system, Reglerteknik.
    Redmond, Stephen J.
    University College Dublin.
    et al.,
    Real Robot Challenge: A Robotics Competition in the Cloud (2022). In: Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, ML Research Press, 2022, pp. 190-204. Conference paper (Refereed)
    Abstract [en]

    Dexterous manipulation remains an open problem in robotics. To coordinate the efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at the MPI-IS and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks, ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

  • 49.
    Bechlioulis, Charalampos P.
    et al.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece.
    Heshmati-alamdari, Shahab
    KTH, Skolan för elektroteknik och datavetenskap (EECS).
    Karras, George C.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece.
    Kyriakopoulos, Kostas J.
    Natl Tech Univ Athens, Sch Mech Engn, Control Syst Lab, Zografos 15780, Greece.
    Robust Image-Based Visual Servoing With Prescribed Performance Under Field of View Constraints (2019). In: IEEE Transactions on Robotics, ISSN 1552-3098, E-ISSN 1941-0468, Vol. 35, no. 4, pp. 1063-1070. Journal article (Refereed)
    Abstract [en]

    In this paper, we propose a visual servoing scheme that imposes predefined performance specifications on the image feature coordinate errors and satisfies the visibility constraints that inherently arise owing to the camera's limited field of view, despite the inevitable calibration and depth measurement errors. Its efficiency is demonstrated via comparative experimental and simulation studies.

  • 50.
    Bekiroglu, Yasemin
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Detry, Renaud
    Kragic, Danica
    KTH, Skolan för datavetenskap och kommunikation (CSC), Datorseende och robotik, CVAP.
    Grasp Stability from Vision and Touch (2012). Conference paper (Refereed)