Comparison of object representations for detected human in football scenes
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems.
2024 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Alternative title
Jämförelse av objektrepresentationer för detekterade människor i fotbollsscener (Swedish)
Abstract [en]

Bounding boxes are widely used in object detection models to represent detected objects. In football scenes filmed from a side view, problems arise when human objects occlude one another. The aim of this thesis project is to use key points of the human body for object representation instead of bounding boxes. The approach is adapted from You Only Look Once (YOLO), a real-time object detection model: the regression of bounding boxes was turned into a regression of key points. Three key point models and one bounding box model were trained on a synthetic dataset for comparison. For the key point models, Non-Maximum Suppression (NMS) during post-processing was implemented with key point distances instead of Intersection over Union (IoU). The performance of the models was evaluated with Precision, Recall, F1 score and mean Average Precision (mAP). The results indicated that the bounding box model outperforms the key point models, while the pelvis and feet points model performed best among the key point models.
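
The distance-based NMS mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the record does not state the distance measure or the threshold, so the mean Euclidean distance over key points, the function name `keypoint_nms`, the parameter `dist_thresh`, and the example coordinates are all assumptions made for illustration.

```python
import numpy as np


def keypoint_nms(keypoints, scores, dist_thresh):
    """Greedy NMS that suppresses detections by key point distance.

    keypoints:   (N, K, 2) array, K (x, y) key points per detection.
    scores:      (N,) array of confidence scores.
    dist_thresh: assumed threshold in pixels; a detection is suppressed
                 when its mean key point distance to an already kept
                 detection is below this value.
    Returns the indices of the kept detections, highest score first.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Mean Euclidean distance over the K key points (assumed measure).
        dists = np.linalg.norm(
            keypoints[rest] - keypoints[best], axis=-1
        ).mean(axis=-1)
        # Keep only candidates that are far enough from the kept detection.
        order = rest[dists >= dist_thresh]
    return keep


# Hypothetical detections with three key points each (pelvis, two feet).
base = np.array([[100.0, 200.0], [95.0, 260.0], [105.0, 260.0]])
detections = np.stack([
    base,                 # detection 0
    base + [3.0, 4.0],    # detection 1: near-duplicate, 5 px away
    base + [60.0, 0.0],   # detection 2: a different player, 60 px away
])
confidences = np.array([0.9, 0.8, 0.7])

kept = keypoint_nms(detections, confidences, dist_thresh=20.0)
print(kept)  # [0, 2]
```

In this sketch a small distance plays the role that a high IoU plays in box-based NMS: the near-duplicate detection is suppressed, while the detection 60 px away survives even though its bounding box might overlap heavily in an occluded scene.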

Abstract [sv] (translated from Swedish)

Bounding boxes have been used extensively in object detection models to represent detected objects. In football scenes where cameras film from the side view, problems arise when the people overlap. The aim of this degree project is to use key points from the human body to represent objects instead of bounding boxes. Inspired by YOLO, a real-time object detection model, the regression of bounding boxes was converted into a regression problem with key points in this work. Three key point models and one bounding box model were trained on a synthetic dataset for comparison. In post-processing, NMS was implemented with key point distances instead of IoU for the key point model. The performance of the models was evaluated with Precision, Recall, F1 score and mAP. The results showed that the bounding box model performed better than the key point models, while the pelvis and foot points model was identified as the best among the key point models.

Place, publisher, year, edition, pages
2024. p. 53
Series
TRITA-CBH-GRU ; 2024:335
Keywords [en]
Object detection, YOLO, Human key points, Computer vision
Keywords [sv]
Objektdetektering, YOLO, Människans nyckelpunkter, Datorseende
National Category
Computer graphics and computer vision; Sport and Fitness Sciences
Identifiers
URN: urn:nbn:se:kth:diva-354875
OAI: oai:DiVA.org:kth-354875
DiVA, id: diva2:1906078
External cooperation
Spiideo
Educational program
Master of Science - Sports Technology
Supervisors
Examiners
Available from: 2024-10-16. Created: 2024-10-16. Last updated: 2025-02-11. Bibliographically approved.

Open Access in DiVA

fulltext (30380 kB), 374 downloads
File information
File name: FULLTEXT01.pdf
File size: 30380 kB
Checksum (SHA-512):
022600ac3d2852e353d1d86f9c53720bbb38682f5cf4bc274d225d664b36e006d846c90cb5739bb4be51202226e42e7ec4dcc10ba2a6e1b2c0f753c69183766f
Type: fulltext
Mimetype: application/pdf

Total: 374 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 303 hits