kth.sePublications
System disruptions
We are currently experiencing disruptions on the search portals due to high traffic. We are working to resolve the issue, you may temporarily encounter an error message.
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Beam-search SIEVE for low-memory speech recognition
Aalto University, Finland.
Athena R.C., Greece.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.ORCID iD: 0000-0002-5211-112X
University of Copenhagen, Denmark.
2024 (English)In: Interspeech 2024, International Speech Communication Association , 2024, p. 272-276Conference paper, Published paper (Refereed)
Abstract [en]

A capacity to recognize speech offline eliminates privacy concerns and the need for an internet connection. Despite efforts to reduce the memory demands of speech recognition systems, these demands remain formidable and thus popular tools such as Kaldi run best via cloud computing. The key bottleneck arises form the fact that a bedrock of such tools, the Viterbi algorithm, requires memory that grows linearly with utterance length even when contained via beam search. A recent recasting of the Viterbi algorithm, SIEVE, eliminates the path length factor from space complexity, but with a significant practical runtime overhead. In this paper, we develop a variant of SIEVE that lessens this runtime overhead via beam search, retains the decoding quality of standard beam search, and waives its linearly growing memory bottleneck. This space-complexity reduction is orthogonal to decoding quality and complementary to memory savings in model representation and training.

Place, publisher, year, edition, pages
International Speech Communication Association , 2024. p. 272-276
Keywords [en]
memory efficient algorithms, speech recognition
National Category
Computer Sciences Computer Engineering
Identifiers
URN: urn:nbn:se:kth:diva-358872DOI: 10.21437/Interspeech.2024-2457Scopus ID: 2-s2.0-85214840193OAI: oai:DiVA.org:kth-358872DiVA, id: diva2:1930525
Conference
25th Interspeech Conferece 2024, Kos Island, Greece, Sep 1-5, 2024
Note

QC 20250127

Available from: 2025-01-23 Created: 2025-01-23 Last updated: 2025-01-27Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records

Gionis, Aristides

Search in DiVA

By author/editor
Gionis, Aristides
By organisation
Theoretical Computer Science, TCS
Computer SciencesComputer Engineering

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 15 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf