Multi-view body part recognition with random forests
Kazemi, Vahid. KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. ORCID iD: 0000-0003-4181-2753
Burenius, Magnus. KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
Azizpour, Hossein. KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. ORCID iD: 0000-0001-5211-6388
Sullivan, Josephine. KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
2013 (English). In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, Bristol, England: British Machine Vision Association, 2013. Conference paper, Published paper (Refereed)
Abstract [en]

This paper addresses the problem of human pose estimation given images taken from multiple dynamic but calibrated cameras. We consider solving this task using a part-based model and focus on the part appearance component of such a model. We use a random forest classifier to capture the variation in appearance of body parts in 2D images. The outputs of these 2D part detectors are then aggregated across views to produce consistent 3D hypotheses for parts. We solve correspondences across views for mirror-symmetric parts by introducing a latent variable. We evaluate our part detectors qualitatively and quantitatively on a dataset gathered from a professional football game.
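
The aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes pinhole cameras given as 3x4 projection matrices, per-view 2D score maps (for example, classifier posteriors) stored under the hypothetical keys `a` and `b`, and a per-view binary swap variable that is simply maximised out independently in each view. All function and key names are made up for illustration.

```python
import math

def project(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 camera matrix P to a pixel (u, v)."""
    xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * x for p, x in zip(row, xh)) for row in P)
    return u / w, v / w

def lookup(score_map, uv, floor=1e-6):
    """Nearest-pixel lookup in a 2D score map (list of rows); returns `floor` outside the image."""
    col, row = int(round(uv[0])), int(round(uv[1]))
    if 0 <= row < len(score_map) and 0 <= col < len(score_map[0]):
        return max(score_map[row][col], floor)
    return floor

def pair_score(X_left, X_right, views):
    """Log-score of a 3D (left, right) part pair, summed over views.

    Each view is a dict with a camera matrix 'P' and two score maps 'a' and 'b'
    for a pair of mirror-symmetric detections (assumed layout). A binary latent
    variable per view says whether 'a' is the left part (no swap) or the right
    part (swap); here the better of the two assignments is taken per view.
    """
    total = 0.0
    for view in views:
        ul = project(view["P"], X_left)
        ur = project(view["P"], X_right)
        no_swap = math.log(lookup(view["a"], ul)) + math.log(lookup(view["b"], ur))
        swap = math.log(lookup(view["b"], ul)) + math.log(lookup(view["a"], ur))
        total += max(no_swap, swap)
    return total
```

Evaluating `pair_score` over a set of candidate 3D positions and keeping the highest-scoring pairs would give 3D part hypotheses that are consistent across views, under the assumptions stated above.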

Place, publisher, year, edition, pages
Bristol, England: British Machine Vision Association, 2013.
Keyword [en]
Data processing, Decision trees, Motion estimation, Body part recognition, Calibrated cameras, Football game, Human pose estimations, Latent variable, Part-based models, Random forest classifier, Random forests
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:kth:diva-134190
DOI: 10.5244/C.27.48
ISI: 000346352700045
Scopus ID: 2-s2.0-84898413079
OAI: oai:DiVA.org:kth-134190
DiVA: diva2:665190
Conference
2013 24th British Machine Vision Conference, BMVC 2013; Bristol; United Kingdom; 9 September 2013 through 13 September 2013
Funder
EU, FP7, Seventh Framework Programme
Note

QC 20131217

Available from: 2013-11-19. Created: 2013-11-19. Last updated: 2015-10-06. Bibliographically approved.
In thesis
1. Human 3D Pose Estimation in the Wild: using Geometrical Models and Pictorial Structures
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2013. viii, 178 p.
Series
Trita-CSC-A, ISSN 1653-5723 ; 2013:15
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:kth:diva-138136
ISBN: 978-91-7501-980-2
Public defence
2014-01-21, F3, KTH, Lindstedtsvägen 26, Stockholm, 13:00
Note

QC 20131218

Available from: 2013-12-18. Created: 2013-12-18. Last updated: 2013-12-18. Bibliographically approved.
2. Correspondence Estimation in Human Face and Posture Images
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Many computer vision tasks, such as object detection, pose estimation, and alignment, are directly related to the estimation of correspondences over instances of an object class. Other tasks, such as image classification and verification, if not completely solved, can largely benefit from correspondence estimation. This thesis presents practical approaches for tackling the correspondence estimation problem, with an emphasis on deformable objects.

The methods presented in this thesis vary greatly in their details, but they all use a combination of generative and discriminative modeling to estimate the correspondences from input images efficiently. While the methods described in this work are generic and can be applied to any object, two classes of objects of high importance, namely the human body and faces, are the subjects of our experiments.

When dealing with the human body, we are mostly interested in estimating a sparse set of landmarks; specifically, we are interested in locating the body joints. We use pictorial structures to model the articulation of the body parts generatively and learn efficient discriminative models to localize the parts in the image. This is a common approach explored by many previous works. We further extend this hybrid approach by introducing higher-order terms to deal with the double-counting problem and provide an algorithm for solving the resulting non-convex problem efficiently. In another work we explore multi-view pose estimation, where we have multiple calibrated cameras and are interested in determining the pose of a person in 3D by aggregating 2D information. This is done efficiently by discretizing the 3D search space and using the 3D pictorial structures model to perform the inference.

In contrast to the human body, faces have a much more rigid structure, and it is relatively easy to detect the major parts of the face such as the eyes, nose, and mouth, but performing dense correspondence estimation on faces under various poses and lighting conditions is still challenging. In a first work we deal with this variation by partitioning the face into multiple parts and learning separate regressors for each part. In another work we take a fully discriminative approach and learn a global regressor from image to landmarks, but to deal with the insufficiency of training data we augment it with a large number of synthetic images. While we have shown great performance on the standard face datasets for correspondence estimation, in many scenarios the RGB signal gets distorted as a result of poor lighting conditions and becomes almost unusable. This problem is addressed in another work, where we explore the use of the depth signal for dense correspondence estimation. Here again a hybrid generative/discriminative approach is used to perform accurate correspondence estimation in real time.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2014. vii, 32 p.
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2014:14
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-150115
ISBN: 978-91-7595-261-1
Public defence
2014-10-10, Kollegiesalen, Brinellvägen 8, KTH, Stockholm, 10:00 (English)
Note

QC 20140919

Available from: 2014-09-19. Created: 2014-08-29. Last updated: 2014-09-19. Bibliographically approved.

Open Access in DiVA

fulltext (8872 kB), 250 downloads
File information
File name: FULLTEXT01.pdf
File size: 8872 kB
Checksum (SHA-512): 5d3cab56807983be02f34fabf3d05052172b1b009492cf111ffd632ae7901fe2fe6724e2a8272c75741887783dcf74e7ff74c735145163ed58e530207dddf55d
Type: fulltext
Mimetype: application/pdf
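
A downloaded copy of the full text can be checked against the SHA-512 checksum listed in this record. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha512_hex` and `matches_record` are illustrative, and the local path in the usage comment assumes the file was saved under its listed name.

```python
import hashlib

# SHA-512 checksum as listed in the file information of this record.
RECORD_SHA512 = (
    "5d3cab56807983be02f34fabf3d05052172b1b009492cf111ffd632ae7901fe2"
    "fe6724e2a8272c75741887783dcf74e7ff74c735145163ed58e530207dddf55d"
)

def sha512_hex(stream, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha512()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches_record(stream, expected=RECORD_SHA512):
    """True if the stream's SHA-512 digest equals the expected checksum."""
    return sha512_hex(stream) == expected.lower()

# Usage (assumes the PDF was saved locally as FULLTEXT01.pdf):
# with open("FULLTEXT01.pdf", "rb") as f:
#     print(matches_record(f))
```

Reading in chunks keeps memory use small regardless of file size; a mismatch would suggest a corrupted or different version of the file.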

Authority records BETA

Kazemi, Vahid; Azizpour, Hossein

Search in DiVA

By author/editor
Kazemi, Vahid; Burenius, Magnus; Azizpour, Hossein; Sullivan, Josephine
By organisation
Computer Vision and Active Perception, CVAP
Computer Vision and Robotics (Autonomous Systems)

Total: 250 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1284 hits