Optimizations of acoustic models for speech recognition applications on embedded systems
KTH, School of Information and Communication Technology (ICT).
2017 (English). Independent thesis, advanced level (professional degree), 20 credits / 30 HE credits. Student thesis (degree project)
Abstract [en]

The primary focus of speech recognition systems is large-vocabulary continuous speech. Nowadays, most speech recognition platforms rely on high-performance cloud computing solutions. However, when the system is required to operate offline, an embedded solution is preferable. An acoustic model for speech recognition is implemented based on an existing recurrent neural network (RNN) model provided by the EESEN framework.

The performance of automatic speech recognition has improved tremendously due to the application of RNNs to acoustic modelling and to their intrinsic ability to retain and use information from past frames in order to correctly interpret the current input frame.

The Nvidia Tegra K1 embedded GPU is used as the target platform to accelerate the computationally heavy step, the forward pass of the recurrent neural network. The basic implementation provided by EESEN has been optimized for this specific hardware and profiled in order to evaluate performance in terms of both timing and power. The design methodology used for the Nvidia GPU has also been applied to implement the same algorithm in OpenCL for FPGA. The SDAccel tool from Xilinx enables high-level hardware synthesis starting from OpenCL code.

Experiments show that a significant speed-up over the basic implementation used in the EESEN framework can be achieved on both GPU and FPGA hardware. With regard to power, the GPU implementations do not guarantee considerable improvements, whereas the FPGA hardware does.

Place, publisher, year, edition, pages
2017. p. 72
Series
TRITA-ICT-EX ; 2017:33
National subject category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-219909
OAI: oai:DiVA.org:kth-219909
DiVA, id: diva2:1165908
External cooperation
Politecnico di Torino
Subject / course
Information and Communication Technology
Educational program
Master of Science in Engineering - Information Technology
Supervisors
Examiners
Available from: 2017-12-14 Created: 2017-12-14 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

No full text in DiVA
