Optimizations of acoustic models for speech recognition applications on embedded systems
KTH, School of Information and Communication Technology (ICT).
2017 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The primary focus of speech recognition systems is large-vocabulary continuous speech. Nowadays most speech recognition platforms rely on high-performance cloud computing solutions; however, when the system is required to operate off-line, an embedded solution is preferable. An acoustic model for speech recognition is implemented based on an existing recurrent neural network model provided by the EESEN framework.

The performance of automatic speech recognition has improved tremendously thanks to the application of RNNs to acoustic modelling, owing to their intrinsic ability to retain information from past frames and use it to correctly interpret the current input frame.
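The thesis works with an EESEN recurrent model; purely as a toy illustration of how a recurrent cell retains information from past frames, the sketch below implements the forward pass of a plain (non-LSTM) RNN in pure Python. The function names, weight shapes, and tanh cell are hypothetical illustrations, not taken from EESEN.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla-RNN step: the new hidden state mixes the current
    input frame with the previous hidden state, so information from
    earlier frames is carried forward in time."""
    h_new = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(s))
    return h_new

def forward(frames, W_x, W_h, b):
    """Forward pass over a sequence of acoustic feature frames;
    returns the hidden state after each frame."""
    h = [0.0] * len(b)  # initial state: no past information yet
    states = []
    for x_t in frames:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states
```

Because each `rnn_step` is dominated by the two matrix–vector products, the sequential forward pass is the natural candidate for offloading to an accelerator, which is what the thesis does on GPU and FPGA.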

The Nvidia Tegra K1 embedded GPU is used as the target platform to accelerate the heaviest computation step, the forward pass of the recurrent neural network. The baseline implementation provided by EESEN has been optimized for this specific hardware and profiled to evaluate performance in terms of both timing and power. The design methodology used for the Nvidia GPU has then been applied to implement the same algorithm in OpenCL for an FPGA; the SDAccel tool from Xilinx enables high-level hardware synthesis starting from OpenCL code.

Experiments show that a significant speed-up over the baseline EESEN implementation can be achieved on both GPU and FPGA hardware. With regard to power, the GPU implementation does not deliver the considerable improvements that the FPGA hardware does.

Place, publisher, year, edition, pages
2017, p. 72
Series
TRITA-ICT-EX ; 2017:33
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-219909
OAI: oai:DiVA.org:kth-219909
DiVA id: diva2:1165908
External cooperation
Politecnico di Torino
Subject / course
Information and Communication Technology
Educational program
Master of Science in Engineering - Information and Communication Technology
Available from: 2017-12-14. Created: 2017-12-14. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
