Probabilistic Modelling of Hearing: Speech Recognition and Optimal Audiometry
KTH, School of Electrical Engineering (EES), Sound and Image Processing.
2009 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Hearing loss afflicts as many as 10% of our population. Fortunately, technologies designed to alleviate the effects of hearing loss are improving rapidly, including cochlear implants and the increasing computing power of digital hearing aids. This thesis focuses on theoretically sound methods for improving hearing aid technology. The main contributions are documented in three research articles, which treat two separate topics: modelling of human speech recognition (Papers A and B) and optimization of diagnostic methods for hearing loss (Paper C).

Papers A and B present a hidden Markov model-based framework for simulating speech recognition in noisy conditions using auditory models and signal detection theory. In Paper A, a model of normal and impaired hearing is employed, in which a subject's pure-tone hearing thresholds are used to adapt the model to the individual. In Paper B, the framework is modified to simulate hearing with a cochlear implant (CI). Two models of hearing with CI are presented: a simple, functional model and a biologically inspired model. The models are adapted to the individual CI user by simulating a spectral discrimination test. The framework can estimate speech recognition ability for a given hearing impairment or cochlear implant user. This estimate could potentially be used to optimize hearing aid settings.

Paper C presents a novel method for sequentially choosing the sound level and frequency for pure-tone audiometry. A Gaussian mixture model (GMM) is used to represent the probability distribution of hearing thresholds at 8 frequencies. The GMM is fitted to over 100,000 hearing thresholds from a clinical database. After each response, the GMM is updated using Bayesian inference. The sound level and frequency are chosen so as to maximize a predefined objective function, such as the entropy of the probability distribution. It is found through simulation that an average of 48 tone presentations are needed to achieve the same accuracy as the standard method, which requires an average of 135 presentations.
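
The sequential procedure summarized for Paper C can be illustrated with a much-reduced sketch. It is not the thesis's implementation: it assumes a single frequency instead of 8, a prior on a discrete grid built from a two-component Gaussian mixture with invented parameters, a logistic psychometric function with hypothetical slope and lapse rate, and a rule that picks the level minimizing the expected posterior entropy, which is one common reading of an entropy-based objective. All function and variable names are made up for the example.

```python
import math
import random

# Hypothetical grid (dB HL) used both for candidate thresholds and test levels.
LEVELS = list(range(-10, 101, 5))

def gaussian(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def mixture_prior(grid):
    """Discretized two-component Gaussian mixture; all parameters are invented."""
    p = [0.7 * gaussian(t, 10, 8) + 0.3 * gaussian(t, 55, 15) for t in grid]
    total = sum(p)
    return [v / total for v in p]

def p_heard(level, threshold, slope=0.5, lapse=0.02):
    """Assumed logistic psychometric function: P('heard' | level, threshold)."""
    core = 1.0 / (1.0 + math.exp(-slope * (level - threshold)))
    return lapse + (1 - 2 * lapse) * core

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(v * math.log(v) for v in p if v > 0)

def update(belief, level, heard):
    """Bayes' rule on the grid: posterior is likelihood times prior, renormalized."""
    post = []
    for t, b in zip(LEVELS, belief):
        like = p_heard(level, t)
        post.append(b * (like if heard else 1 - like))
    total = sum(post)
    return [v / total for v in post]

def expected_entropy(belief, level):
    """Posterior entropy averaged over the two possible responses."""
    p_yes = sum(b * p_heard(level, t) for t, b in zip(LEVELS, belief))
    h_yes = entropy(update(belief, level, True))
    h_no = entropy(update(belief, level, False))
    return p_yes * h_yes + (1 - p_yes) * h_no

def next_level(belief):
    """Choose the presentation level expected to leave the least uncertainty."""
    return min(LEVELS, key=lambda lv: expected_entropy(belief, lv))

def run_test(true_threshold, n_trials=30, seed=0):
    """Simulate a listener and return the posterior-mean estimate and belief."""
    rng = random.Random(seed)
    belief = mixture_prior(LEVELS)
    for _ in range(n_trials):
        level = next_level(belief)
        heard = rng.random() < p_heard(level, true_threshold)
        belief = update(belief, level, heard)
    estimate = sum(t * b for t, b in zip(LEVELS, belief))
    return estimate, belief
```

Calling `run_test(40)` simulates a listener whose true threshold is 40 dB HL and returns the posterior mean together with the final belief over the grid. A grid is used here only to keep the update transparent; the abstract describes updating the GMM itself.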

Place, publisher, year, edition, pages
Stockholm: KTH, 2009. ix, 35 p.
Series
Trita-EE, ISSN 1653-5146 ; 2009:023
Keyword [en]
auditory models, probabilistic modelling, speech modelling, human speech recognition, hearing aids, cochlear implants, psychoacoustics, diagnostic methods, optimal experiments, audiometry
Identifiers
URN: urn:nbn:se:kth:diva-10386
ISBN: 978-91-7415-310-1 (print)
OAI: oai:DiVA.org:kth-10386
DiVA: diva2:216427
Presentation
2009-05-20, E2, Lindstedtsvägen 3, 11428 Stockholm, 13:00 (English)
Opponent
Supervisors
Available from: 2009-05-14 Created: 2009-05-08 Last updated: 2010-10-29 Bibliographically approved
List of papers
1. An Information Theoretic Approach to Predict Speech Intelligibility for Listeners with Normal and Impaired Hearing
2007 (English) In: INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, BAIXAS, FRANCE: ISCA-INST SPEECH COMMUNICATION ASSOC, 2007, 1345-1348 p. Conference paper, Published paper (Refereed)
Abstract [en]

A computational method to predict speech intelligibility in noisy environments has been developed. By modeling speech and noise as stochastic signals, the information transmission through a given auditory model can be estimated. Rate-distortion theory is then applied to predict speech recognition performance. Results are compared with subjective tests on normal and hearing impaired listeners. It is found that the method underestimates the supra-threshold deficits of hearing impairment, which is believed to be due to an overly simple auditory model and a small dictionary size.

Place, publisher, year, edition, pages
BAIXAS, FRANCE: ISCA-INST SPEECH COMMUNICATION ASSOC, 2007
Keyword
speech intelligibility, speech perception, auditory models
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-25766 (URN)
000269998600337 ()
2-s2.0-56149113289 (Scopus ID)
978-160560316-2 (ISBN)
Conference
Interspeech Conference 2007, Antwerp, BELGIUM, AUG 27-31, 2007
Available from: 2010-10-29 Created: 2010-10-29 Last updated: 2011-09-13 Bibliographically approved
2. Prediction of Speech Recognition in Cochlear Implant Users by Adapting Auditory Models to Psychophysical Data
2009 (English) In: Eurasip Journal on Advances in Signal Processing, ISSN 1687-6172, Vol. 2009, 175243- p. Article in journal (Refereed) Published
Abstract [en]

Users of cochlear implants (CIs) vary widely in their ability to recognize speech in noisy conditions. There are many factors that may influence their performance. We have investigated to what degree it can be explained by the users' ability to discriminate spectral shapes. A speech recognition task has been simulated using both a simple and a complex model of CI hearing. The models were individualized by adapting their parameters to fit the results of a spectral discrimination test. The predicted speech recognition performance was compared to experimental results, and the two were significantly correlated. The presented framework may be used to simulate the effects of changing the CI encoding strategy.

Keyword
multichannel electrical-stimulation, hearing-impaired listeners, noise, nerve, intelligibility, resolution, patterns
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-18836 (URN)
10.1155/2009/175243 (DOI)
000270476800001 ()
2-s2.0-70349213558 (Scopus ID)
Note
QC 20100525
Available from: 2010-08-05 Created: 2010-08-05 Last updated: 2011-01-14 Bibliographically approved
3. Bayesian Optimal Pure Tone Audiometry with Prior Knowledge
(English) Manuscript (preprint) (Other academic)
Identifiers
urn:nbn:se:kth:diva-25767 (URN)
Note
QC 20101029
Available from: 2010-10-29 Created: 2010-10-29 Last updated: 2010-10-29 Bibliographically approved

Open Access in DiVA

fulltext (1242 kB) 502 downloads
File information
File name FULLTEXT02.pdf
File size 1242 kB
Checksum SHA-512
c7b5ad1302c29819a6334d16d33c3870bd35d931554eb1fa8117c8c3354a06640f7c018595b2663ea61f0436e697e19f4ae5df1d0e376423e501b770d5136e85
Type fulltext
Mimetype application/pdf

Search in DiVA

By author/editor
Stadler, Svante
By organisation
Sound and Image Processing
