VOCUS: A visual attention system for object detection and goal-directed search
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
2006 (English) In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, 1-228 p. Article in journal (Refereed) Published
Abstract [en]

Visual attention is a mechanism in human perception which selects relevant regions from a scene and provides these regions for higher-level processing such as object recognition. This enables humans to act effectively in their environment despite the complexity of perceivable sensor data. Computational vision systems face the same problem as humans: there is a large amount of information to be processed, and to achieve this efficiently, possibly even in real time for robotic applications, the order in which a scene is investigated must be determined in an intelligent way. A promising approach is to use computational attention systems that simulate human visual attention. This monograph introduces the biologically motivated computational attention system VOCUS (Visual Object detection with a Computational attention System) that detects regions of interest in images. It operates in two modes: an exploration mode in which no task is provided, and a search mode with a specified target. In exploration mode, regions of interest are defined by strong contrasts (e.g., color or intensity contrasts) and by the uniqueness of a feature. For example, a black sheep is salient in a flock of white sheep. In search mode, the system uses previously learned information about a target object to bias the saliency computations with respect to the target. In various experiments, it is shown that the target is on average found with fewer than three fixations, that usually fewer than five training images suffice to learn the target information, and that the system is mostly robust with regard to viewpoint changes and illumination variations. Furthermore, we demonstrate how VOCUS profits from additional sensor data: we apply the system to depth and reflectance data from a 3D laser scanner and show the advantages that the laser modes provide. By fusing the data of both modes, we demonstrate how the system is able to consider distinct object properties and how the flexibility of the system increases by considering different data. Finally, the regions of interest provided by VOCUS serve as input to a classifier that recognizes the object in the detected region. We show how and in which cases the classification is sped up and how the detection quality is improved by the attentional front-end. This approach is especially useful if many object classes have to be considered, a frequently occurring situation in robotics. VOCUS provides a powerful approach to improve existing vision systems by concentrating computational resources on regions that are more likely to contain relevant information. The more the complexity and power of vision systems increase in the future, the more they will profit from an attentional front-end like VOCUS.

Place, publisher, year, edition, pages
2006. 1-228 p.
Keyword [en]
Agricultural products, Classification (of information), Computational methods, Computer networks, Concentration (process), Detectors, Earnings, Image enhancement, Industrial economics, Information systems, Intelligent systems, Laser applications, Lasers, Natural resources exploration, Pulsed laser deposition, Real time systems, Robotics, Robots, Sensors, Targets, Three dimensional, Visual communication, Wool, (algorithmic) complexity, (PL) properties, 3D laser scanners, Amount of information, Computational attention system, Computational resources, Computational vision, Heidelberg (CO), human perceptions, Human visual attention, intensity contrasts, Object classes, object detection, Powerful approach, Promising approach, Reflectance data, Regions of interest (ROI), Relevant information, Robotic applications, Sensor data, Springer (CO), target information, target objects, Training images, Vision Systems (CO), Visual attention (VA), Visual attention systems, visual objects, Object recognition
National Category
Bioinformatics (Computational Biology)
URN: urn:nbn:se:kth:diva-155352
ISI: 000238354400001
ScopusID: 2-s2.0-37249090280
ISBN: 3540327592
ISBN: 9783540327592
OAI: diva2:762998

QC 20141113

Available from: 2014-11-13 Created: 2014-11-05 Last updated: 2014-11-13 Bibliographically approved
