kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Learning the statistics of people in images and video
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.ORCID iD: 0000-0002-5750-9655
2003 (English)In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 54, no 2-Jan, p. 181-207Article in journal (Refereed) Published
Abstract [en]

This paper address the problems of modeling the appearance of humans and distinguishing human appearance from the appearance of general scenes. We seek a model of appearance and motion that is generic in that it accounts for the ways in which people's appearance varies and, at the same time, is specific enough to be useful for tracking people in natural scenes. Given a 3D model of the person projected into an image we model the likelihood of observing various image cues conditioned on the predicted locations and orientations of the limbs. These cues are taken to be steered filter responses corresponding to edges, ridges, and motion-compensated temporal differences. Motivated by work on the statistics of natural scenes, the statistics of these filter responses for human limbs are learned from training images containing hand-labeled limb regions. Similarly, the statistics of the filter responses in general scenes are learned to define a background distribution. The likelihood of observing a scene given a predicted pose of a person is computed, for each limb, using the likelihood ratio between the learned foreground (person) and background distributions. Adopting a Bayesian formulation allows cues to be combined in a principled way. Furthermore, the use of learned distributions obviates the need for hand-tuned image noise models and thresholds. The paper provides a detailed analysis of the statistics of how people appear in scenes and provides a connection between work on natural image statistics and the Bayesian tracking of people.

Place, publisher, year, edition, pages
2003. Vol. 54, no 2-Jan, p. 181-207
Keywords [en]
natural images, tracking, model, recognition, sequences, objects, motion, flow
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-22496DOI: 10.1023/A:1023765619733ISI: 000182851000009OAI: oai:DiVA.org:kth-22496DiVA, id: diva2:341194
Note
QC 20100525Available from: 2010-08-10 Created: 2010-08-10 Last updated: 2022-06-25Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Sidenbladh, Hedvig

Search in DiVA

By author/editor
Sidenbladh, Hedvig
By organisation
Computer Vision and Active Perception, CVAP
In the same journal
International Journal of Computer Vision
Computer and Information Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 104 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf