Human 3D Pose Estimation in the Wild: using Geometrical Models and Pictorial Structures
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2013. viii, 178 p.
Series
Trita-CSC-A, ISSN 1653-5723 ; 2013:15
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:kth:diva-138136; ISBN: 978-91-7501-980-2 (print); OAI: oai:DiVA.org:kth-138136; DiVA: diva2:680485
Public defence
2014-01-21, F3, KTH, Lindstedtsvägen 26, Stockholm, 13:00
Opponent
Supervisors
Note

QC 20131218

Available from: 2013-12-18. Created: 2013-12-18. Last updated: 2013-12-18. Bibliographically approved
List of papers
1. Motion Capture from Dynamic Orthographic Cameras
2011 (English). In: 4DMOD - 1st IEEE Workshop on Dynamic Shape Capture and Analysis, 2011. Conference paper, Published paper (Refereed)
Abstract [en]

We present an extension to the scaled orthographic camera model. It deals with dynamic cameras looking at faraway objects. The camera is allowed to change focal length and translate and rotate in 3D. The model we derive says that this motion can be treated as scaling, translation and rotation in a 2D image plane. It is valid if the camera and its target move around in two separate regions that are small compared to the distance between them. We show two applications of this model to motion capture applications at large distances, i.e. outside a studio, using the affine factorization algorithm. The model is used to motivate theoretically why the factorization can be carried out in a single batch step, when having both dynamic cameras and a dynamic object. Furthermore, the model is used to motivate how the position of the object can be reconstructed by measuring the virtual 2D motion of the cameras. For testing we use videos from a real football game and reconstruct the 3D motion of a footballer as he scores a goal.
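
The affine factorization step named in the abstract can be illustrated with a minimal sketch. This is the generic rank-3 factorization via SVD, not the paper's own implementation; the function names, the synthetic data generator, and the use of a static rigid point set are assumptions made for illustration only.

```python
import numpy as np


def affine_factorization(W):
    """Factor a 2F x P measurement matrix into motion (2F x 3) and shape (3 x P).

    W stacks the x and y image coordinates of P tracked points over F frames.
    The result is defined only up to an invertible 3x3 affine ambiguity.
    """
    W = np.asarray(W, dtype=float)
    # Subtracting each row's mean removes the per-frame 2D translation.
    t = W.mean(axis=1, keepdims=True)
    W_centered = W - t
    # Under an affine camera the centered matrix has rank at most 3.
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    sqrt_s = np.sqrt(s[:3])
    motion = U[:, :3] * sqrt_s          # 2F x 3
    shape = sqrt_s[:, None] * Vt[:3]    # 3 x P
    return motion, shape, t


def synthetic_measurements(num_frames=4, num_points=12, seed=0):
    """Hypothetical test data: rigid points seen by scaled orthographic cameras."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(3, num_points))
    rows = []
    for _ in range(num_frames):
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthonormal basis
        scale = rng.uniform(0.5, 2.0)                 # stands in for a zoom change
        offset = rng.normal(size=(2, 1))              # 2D image translation
        rows.append(scale * Q[:2] @ X + offset)
    return np.vstack(rows)


W = synthetic_measurements()
motion, shape, t = affine_factorization(W)
# Largest reprojection error of the rank-3 model on noise-free data.
residual = np.abs(motion @ shape + t - W).max()
```

On noise-free synthetic data the rank-3 model reproduces the measurements up to floating-point error. The recovered motion and shape are affine only; upgrading them to a metric reconstruction needs additional constraints that this sketch omits.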

National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-51110 (URN); 10.1109/ICCVW.2011.6130445 (DOI); 000300056700231 (); 2-s2.0-84856626223 (Scopus ID)
Conference
4DMOD - 1st IEEE Workshop on Dynamic Shape Capture and Analysis. Barcelona, Spain. 2011-11-13.
Note

QC 20111213

Available from: 2011-12-09. Created: 2011-12-09. Last updated: 2013-12-18. Bibliographically approved
2. 3D pictorial structures for multiple view articulated pose estimation
2013 (English). In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2013, 3618-3625 p. Conference paper, Published paper (Refereed)
Abstract [en]

We consider the problem of automatically estimating the 3D pose of humans from images, taken from multiple calibrated views. We show that it is possible and tractable to extend the pictorial structures framework, popular for 2D pose estimation, to 3D. We discuss how to use this framework to impose view, skeleton, joint angle and intersection constraints in 3D. The 3D pictorial structures are evaluated on multiple view data from a professional football game. The evaluation is focused on computational tractability, but we also demonstrate how a simple 2D part detector can be plugged into the framework.
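
As a rough illustration of why inference in a pictorial structure is tractable, the sketch below runs exact min-sum dynamic programming over a chain of parts with a small set of candidate 3D locations. It is a simplified stand-in, not the paper's framework: the chain topology, the candidate list, the limb-length cost, and all names are assumptions, and the view, joint angle and intersection constraints mentioned in the abstract are not modelled.

```python
import math


def chain_min_sum(unary, pairwise):
    """Exact min-sum inference on a chain of parts.

    unary[i][a]      : cost of placing part i at candidate location a
    pairwise(i, a, b): cost of part i at a together with part i + 1 at b
    Returns (total cost, list of chosen candidate indices).
    """
    cost = list(unary[0])
    back = []
    for i in range(1, len(unary)):
        new_cost, arg = [], []
        for b, u in enumerate(unary[i]):
            # Best predecessor location for part i - 1 given part i at b.
            best_a = min(range(len(cost)),
                         key=lambda a: cost[a] + pairwise(i - 1, a, b))
            new_cost.append(cost[best_a] + pairwise(i - 1, best_a, b) + u)
            arg.append(best_a)
        cost = new_cost
        back.append(arg)
    # Backtrack from the cheapest location of the last part.
    last = min(range(len(cost)), key=cost.__getitem__)
    total = cost[last]
    path = [last]
    for arg in reversed(back):
        path.append(arg[path[-1]])
    path.reverse()
    return total, path


# Hypothetical candidate 3D locations shared by all parts.
candidates = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.5), (0.0, 0.0, 1.0)]
limb_length = 0.5


def limb_cost(i, a, b):
    """Penalise deviation from an assumed fixed limb length."""
    return abs(math.dist(candidates[a], candidates[b]) - limb_length)


unary = [
    [0.1, 0.9, 0.9],  # part 0 prefers candidate 0
    [0.8, 0.2, 0.8],  # part 1 prefers candidate 1
    [0.9, 0.9, 0.1],  # part 2 prefers candidate 2
]
total, path = chain_min_sum(unary, limb_cost)
```

The cost of this procedure grows with the square of the number of candidate locations per part, which is why discretising 3D space is the main tractability concern when moving from 2D to 3D.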

Place, publisher, year, edition, pages
IEEE Computer Society, 2013
Series
IEEE Conference on Computer Vision and Pattern Recognition. Proceedings, ISSN 1063-6919
Keyword
human pose estimation, motion capture, multiple view 3D reconstruction, part-based models, pictorial structures
National Category
Engineering and Technology Computer Systems
Identifiers
urn:nbn:se:kth:diva-129706 (URN); 10.1109/CVPR.2013.464 (DOI); 000331094303088 (); 2-s2.0-84887329445 (Scopus ID)
Conference
26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2013; Portland, OR; United States; 23 June 2013 through 28 June 2013
Note

QC 20131007

Available from: 2013-10-03. Created: 2013-10-03. Last updated: 2014-03-24. Bibliographically approved
3. Human 3D Motion Computation from a varying Number of Cameras
2011 (English). In: Image Analysis, Springer Berlin / Heidelberg, 2011, 24-35 p. Conference paper, Published paper (Refereed)
Abstract [en]

This paper focuses on how the accuracy of marker-less human motion capture is affected by the number of camera views used. Specifically, we compare the 3D reconstructions calculated from single and multiple cameras. We perform our experiments on data consisting of video from multiple cameras synchronized with ground truth 3D motion, obtained from a motion capture session with a professional footballer. The error is compared for the 3D reconstructions, of diverse motions, estimated using the manually located image joint positions from one, two or three cameras. We also present a new bundle adjustment procedure using regression splines to impose weak prior assumptions about human motion, temporal smoothness and joint angle limits, on the 3D reconstruction. The results show that even under close to ideal circumstances the monocular 3D reconstructions contain visual artifacts not present in the multiple view case, indicating accurate and efficient marker-less human motion capture requires multiple cameras.
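
To make the difference between monocular and multi-view reconstruction concrete, the sketch below triangulates a single joint from two and from three calibrated views using the standard linear (DLT) method. This is a swapped-in textbook technique, not the spline-based bundle adjustment the paper proposes; the camera matrices, the joint position, and all names are hypothetical.

```python
import numpy as np


def triangulate(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    projections: list of 3x4 camera matrices
    points_2d  : list of (x, y) image positions, one per camera
    """
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        P = np.asarray(P, dtype=float)
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.vstack(rows))
    X = Vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3D point with a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Hypothetical cameras: identical orientation, different positions.
cameras = [
    np.hstack([np.eye(3), np.array([[0.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])]),
    np.hstack([np.eye(3), np.array([[0.0], [-1.0], [0.0]])]),
]
joint = np.array([0.2, -0.1, 5.0])  # hypothetical joint position
observations = [project(P, joint) for P in cameras]

estimate_two = triangulate(cameras[:2], observations[:2])
estimate_three = triangulate(cameras, observations)
```

With a single view the stacked system has only two equations for three unknowns, so depth is unconstrained without a prior. This is one way to see why a monocular reconstruction has to lean on assumptions about human motion.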

Place, publisher, year, edition, pages
Springer Berlin / Heidelberg, 2011
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 6688
Keyword
Motion Capture, 3D Reconstruction, Monocular, Bundle Adjustment, Regression Splines
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-40462 (URN); 10.1007/978-3-642-21227-7_3 (DOI); 000308543900003 (); 2-s2.0-79957479294 (Scopus ID)
Conference
17th Scandinavian Conference on Image Analysis, SCIA 2011; Ystad; 23 May 2011 through 27 May 2011.
Funder
ICT - The Next Generation
Note

QC 20110930

Available from: 2011-09-15. Created: 2011-09-15. Last updated: 2013-12-18. Bibliographically approved
4. Multi-view body part recognition with random forests
2013 (English). In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, Bristol, England: British Machine Vision Association, 2013. Conference paper, Published paper (Refereed)
Abstract [en]

This paper addresses the problem of human pose estimation, given images taken from multiple dynamic but calibrated cameras. We consider solving this task using a part-based model and focus on the part appearance component of such a model. We use a random forest classifier to capture the variation in appearance of body parts in 2D images. The results of these 2D part detectors are then aggregated across views to produce consistent 3D hypotheses for parts. We solve correspondences across views for mirror symmetric parts by introducing a latent variable. We evaluate our part detectors qualitatively and quantitatively on a dataset gathered from a professional football game.

Place, publisher, year, edition, pages
Bristol, England: British Machine Vision Association, 2013
Keyword
Data processing, Decision trees, Motion estimation, Body part recognition, Calibrated cameras, Football game, Human pose estimations, Latent variable, Part-based models, Random forest classifier, Random forests
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-134190 (URN); 10.5244/C.27.48 (DOI); 000346352700045 (); 2-s2.0-84898413079 (Scopus ID)
Conference
2013 24th British Machine Vision Conference, BMVC 2013; Bristol; United Kingdom; 9 September 2013 through 13 September 2013
Funder
EU, FP7, Seventh Framework Programme
Note

QC 20131217

Available from: 2013-11-19. Created: 2013-11-19. Last updated: 2015-10-06. Bibliographically approved

Open Access in DiVA

BureniusThesis2013 (29943 kB), 378 downloads
File information
File name: FULLTEXT01.pdf; File size: 29943 kB; Checksum: SHA-512
544ceb69bc9bae453e79b68385c0764b41f047dea4f23446f1c7bbe0abf149e073392b65b41e0443c3d3f722a4fb901ff05ceaa44b6b09fda582247741efe124
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Burenius, Magnus
By organisation
Computer Vision and Active Perception, CVAP
Computer Vision and Robotics (Autonomous Systems)
