Evaluating the Impact of Tensor Cores on YOLOv8's Performance as a Pedestrian Detection System
KTH, School of Electrical Engineering and Computer Science (EECS).
KTH, School of Electrical Engineering and Computer Science (EECS).
2024 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Utvärdering av hur Tensor-kärnor påverkar YOLOv8:s prestanda för fotgängardetektion (Swedish)
Abstract [en]

The rapid advancement of autonomous vehicles underscores the need for improved pedestrian detection systems to enhance road safety. Traditional GPUs accelerate the detection process but can be prohibitively expensive and too slow for widespread deployment. This study investigates the performance of GPUs equipped with Tensor Cores, which can accelerate the matrix multiplication operations in pedestrian detection systems but may also introduce precision loss. The impact of Tensor Cores on a YOLOv8 model was evaluated on the Caltech pedestrian dataset in terms of speed, precision, recall, and mean Average Precision (mAP). Tests were conducted on both the high-end NVIDIA A100 and the more affordable NVIDIA RTX 3070, with and without Tensor Cores activated. Inference speed improved by 11.8% on the RTX 3070 with Tensor Cores activated, whereas the improvement on the A100 was less pronounced. Enabling Tensor Cores did not lead to significant precision loss on either GPU: both showed a decrease in precision of less than one percent, which is negligible. However, recall decreased by 8.9% on the A100 and by 3% on the RTX 3070. These findings underscore the need for careful optimisation when implementing Tensor Cores to balance speed enhancements with detection accuracy. Furthermore, this research raises ethical considerations regarding the acceptable trade-off between precision loss and speed improvement. It highlights both the advantages and the challenges of integrating advanced GPU technologies into autonomous vehicles, and the need to balance enhanced performance with reliable operation.
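As a rough illustration of the metrics named in the abstract, the sketch below computes precision and recall from detection counts and expresses a difference between two runs as a relative change. The function names and all counts are hypothetical and are not taken from the thesis. The abstract does not say whether its percentages are relative changes or percentage-point differences, so the relative form used here is an assumption.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted pedestrians that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual pedestrians that were detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def relative_change(baseline: float, candidate: float) -> float:
    """Signed change of candidate relative to baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Hypothetical detection counts for illustration only (NOT thesis data).
without_tc = {"tp": 900, "fp": 100, "fn": 100}
with_tc = {"tp": 850, "fp": 95, "fn": 150}

p_base = precision(without_tc["tp"], without_tc["fp"])
p_tc = precision(with_tc["tp"], with_tc["fp"])
r_base = recall(without_tc["tp"], without_tc["fn"])
r_tc = recall(with_tc["tp"], with_tc["fn"])

print(f"precision change: {relative_change(p_base, p_tc):+.2f}%")
print(f"recall change:    {relative_change(r_base, r_tc):+.2f}%")
```

With counts like these, precision barely moves while recall drops noticeably, which is the pattern the abstract describes: fewer pedestrians are found, but the ones reported are still mostly correct.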

Abstract [sv, translated to English]

The rapid development of autonomous vehicles calls for improved pedestrian detection systems to increase road safety. Traditional graphics processors can accelerate the detection process but are often too expensive and too slow to be used at scale. This study examines the performance of graphics processors with Tensor Cores, which can speed up matrix multiplication for pedestrian detection but may also reduce precision. The effect of Tensor Cores on a YOLOv8 model was evaluated using the Caltech pedestrian dataset, with a focus on speed, precision, recall, and mAP. Tests were carried out on both the high-performance NVIDIA A100 GPU and the more affordable NVIDIA RTX 3070 GPU, with and without Tensor Cores activated. The results showed that Tensor Cores improved detection speed on the RTX 3070 GPU by 11.8%, while the improvement on the A100 was less pronounced. In particular, using Tensor Cores did not lead to any significant precision loss on either GPU model: both showed a decrease in precision of less than one percent, which is negligible. However, a decrease of 8.9% in recall was observed on the A100, while the RTX 3070 saw a decrease of 3%. These results underline the need for careful optimisation when implementing Tensor Cores in order to balance speed improvements against detection accuracy. This research also raises ethical considerations about the acceptable trade-off between precision loss and speed improvement. It sheds light on the advantages and challenges of integrating advanced GPU technologies into autonomous vehicles and stresses the need to balance improved performance with reliable operation.

Place, publisher, year, edition, pages
2024, p. 35
Series
TRITA-EECS-EX ; 2024:348
Keywords [en]
YOLOv8, Tensor Cores, GPU acceleration, pedestrian detection
Keywords [sv]
YOLOv8, Tensor-kärnor, GPU-acceleration, fotgängardetektion
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-351101
OAI: oai:DiVA.org:kth-351101
DiVA, id: diva2:1886190
Supervisors
Examiners
Available from: 2024-08-22
Created: 2024-07-30
Last updated: 2024-08-22
Bibliographically approved

Open Access in DiVA

fulltext (3387 kB), 497 downloads
File information
File name: FULLTEXT01.pdf
File size: 3387 kB
Checksum (SHA-512): ba38433f4eeff5c3e2c9a44e114cfacea01a52fe269360da9648ecb03bd861dd7987085bef0437612a77902c019b63090348d7e4e108e096a14f8e6f22f4da80
Type: fulltext
Mimetype: application/pdf

Total: 497 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
Total: 530 hits