Object Based Image Retrieval Using Feature Maps of a YOLOv5 Network
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematical Statistics.
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematical Statistics.
2022 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Objektbaserad bildhämtning med hjälp av feature maps från ett YOLOv5-nätverk (Swedish)
Abstract [en]

As Machine Learning (ML) methods have gained traction in recent years, some problems regarding the construction of such methods have arisen. One such problem is the collection and labeling of data sets. Many applications of Computer Vision (CV) in particular require a set of images, each labeled as either belonging to some class or not. Creating such data sets can be very time-consuming. This project sets out to tackle this problem by constructing an end-to-end system for searching for objects in images (i.e. an Object Based Image Retrieval (OBIR) method) using an object detection framework (You Only Look Once (YOLO) [16]). The goal of the project was to create a method that, given an image q of an object of interest, searches for the same or similar objects in a set of other images S. The core idea is to pass the image q through an object detection model (in this case YOLOv5 [16]), create a "fingerprint" (which can be seen as a sort of identity for an object) from a set of feature maps extracted from the YOLOv5 [16] model, and look for similar regions in the feature maps extracted from other images. An investigation into which values to select for a few different parameters was conducted, including a comparison of performance for a couple of different similarity metrics. The table below presents the parameter combination that resulted in the highest F_Top_300 score (a measure indicating the share of relevant images among the top 300 recommended images) in the parameter selection phase.

Layer: 23
Pooling method: max
Similarity metric: Euclidean (euc)
Fingerprint (FP) kernel size: 4
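The fingerprint search described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the thesis implementation: plain nested lists stand in for YOLOv5 feature maps, and the function names (`max_pool_to`, `best_match`, `rank_images`) are hypothetical. It also assumes that the fingerprint is obtained by max pooling the query's feature maps down to a small square, and that this square is slid over each library image's feature maps and compared by Euclidean distance; the abstract does not spell out these details.

```python
import math

# Illustrative sketch only; all names are hypothetical, not taken from the thesis.
# A feature map is a nested list of shape C x H x W.

def max_pool_to(fmap, size):
    """Adaptive max pooling of a C x H x W feature map down to C x size x size."""
    pooled = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        rows = []
        for i in range(size):
            r0, r1 = i * h // size, -(-(i + 1) * h // size)  # floor, ceil
            row = []
            for j in range(size):
                c0, c1 = j * w // size, -(-(j + 1) * w // size)
                row.append(max(channel[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
            rows.append(row)
        pooled.append(rows)
    return pooled

def best_match(fingerprint, fmap):
    """Slide the fingerprint over fmap; return (smallest distance, (row, col))."""
    size = len(fingerprint[0])
    h, w = len(fmap[0]), len(fmap[0][0])
    best = (math.inf, None)
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            squared = sum(
                (fmap[ch][r + i][c + j] - fingerprint[ch][i][j]) ** 2
                for ch in range(len(fingerprint))
                for i in range(size)
                for j in range(size)
            )
            dist = math.sqrt(squared)  # Euclidean distance over all channels
            if dist < best[0]:
                best = (dist, (r, c))
    return best

def rank_images(fingerprint, library):
    """Order image names by how closely their best window matches the fingerprint."""
    scores = {name: best_match(fingerprint, fmap)[0]
              for name, fmap in library.items()}
    return sorted(scores, key=scores.get)

# Toy data: one channel, a 4 x 4 query map pooled to a 2 x 2 fingerprint.
query = [[[0, 1, 0, 0], [1, 5, 0, 0], [0, 0, 2, 3], [0, 0, 3, 9]]]
fingerprint = max_pool_to(query, 2)
library = {
    "b": [[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]],
    "a": [[[0, 0, 0, 0], [0, 0, 5, 0], [0, 0, 0, 9]]],
}
print(rank_images(fingerprint, library))
```

In a real pipeline the nested lists would be replaced by tensors taken from a chosen layer of the network, and the loops by vectorised operations; the loop form is used here only to make each step explicit.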

Evaluating the method yielded the F_Top_300 scores shown in the table below.

Mouse: 0.820
Duck: 0.640
Coin: 0.770
Jet ski: 0.443
Handgun: 0.807
Average: 0.696
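As a rough illustration of the score, the sketch below treats F_Top_300 as the fraction of the top 300 ranked images that are relevant. This reading is an assumption based only on the abstract's one-line description; the thesis may define the measure differently, and the function name `fraction_relevant_in_top_k` is hypothetical. The last lines recompute the reported average from the five per-class scores listed above.

```python
# Illustrative sketch only; the exact definition of F_Top_300 is assumed here.

def fraction_relevant_in_top_k(ranked_ids, relevant_ids, k=300):
    """Share of the first k ranked images that belong to the relevant set."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    relevant = set(relevant_ids)
    return sum(1 for image_id in top if image_id in relevant) / len(top)

# Per-class scores as reported in the record.
reported = {
    "Mouse": 0.820,
    "Duck": 0.640,
    "Coin": 0.770,
    "Jet ski": 0.443,
    "Handgun": 0.807,
}
average = sum(reported.values()) / len(reported)
print(round(average, 3))
```

Under this reading, a score of 0.820 would correspond to 246 relevant images among the 300 highest-ranked ones.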

Abstract [sv]

While ML methods have become more popular in recent years, a number of problems concerning the construction of such methods have arisen. One such problem is the collection and annotation of data. More specifically, many computer vision methods need a set of images annotated as either belonging or not belonging to a particular class. Creating such data sets can be very time-consuming. The method constructed for this project aims to tackle this problem by building an end-to-end system for searching for objects in images (that is, an OBIR method) using an object detection algorithm (YOLO). The goal of the project was to create a method that, given an image q of an object, searches for the same or similar objects in a library of images S. The main concept behind the idea is to run the image q through the object detection model (in this case YOLOv5 [16]), create a "fingerprint" (which can be seen as a kind of identity for an object) from a collection of feature maps extracted from the YOLOv5 model [16], and look for similar parts of the collections of feature maps in other images. An investigation into which values to use for a number of different parameters was carried out, including a comparison of the performance obtained with different similarity measures. The table below shows the parameter combination that gave the highest F_Top_300 (a measure indicating the proportion of relevant images among the 300 highest-ranked images).

Layer: 23
Pooling method: max
Similarity metric: Euclidean (euc)
Fingerprint (FP) kernel size: 4

Evaluating the method with the parameter choices in the table above resulted in the F_Top_300 scores in the table below.

Mouse: 0.820
Duck: 0.640
Coin: 0.770
Jet ski: 0.443
Handgun: 0.807
Average: 0.696

Place, publisher, year, edition, pages
2022. p. 77
Series
TRITA-SCI-GRU ; 2022:313
Keywords [en]
Content based image retrieval, CBIR, Object based image retrieval, OBIR, image retrieval, YOLO, YOLOv5, object detection, PyTorch, deep learning, convolutional neural network, CNN
National Category
Other Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-322561
OAI: oai:DiVA.org:kth-322561
DiVA, id: diva2:1720668
External cooperation
ACNR Cyber Technology AB
Subject / course
Mathematics
Educational program
Master of Science - Applied and Computational Mathematics
Supervisors
Examiners
Available from: 2023-02-02
Created: 2022-12-20
Last updated: 2023-02-02
Bibliographically approved

Open Access in DiVA

fulltext (3920 kB), 699 downloads
File information
File name: FULLTEXT01.pdf
File size: 3920 kB
Checksum (SHA-512): 234eea868296e43ac5e31c31b53c93a9b738f117b4fe061324a5567d76072c8a524bfcc0706bb6846dc1b3f7d3681daf55a8cbbec5e1f8e55a6d21cac566ebf5
Type: fulltext
Mimetype: application/pdf
