Computer methods for perceptual, acoustic and laryngoscopic voice analysis
Granqvist, Svante
KTH, Superseded Departments, Speech, Music and Hearing. ORCID iD: 0000-0003-4129-9793
2000 (English) Licentiate thesis, comprehensive summary (Other scientific)
Place, publisher, year, edition, pages
Stockholm: KTH, 2000. 9 p.
Series
Trita-TMH, 2000:12
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-1169
ISBN: 91-7283-013-1 (print)
OAI: oai:DiVA.org:kth-1169
DiVA: diva2:6957
Note
QC 20100609
Available from: 2001-07-17 Created: 2001-07-17 Last updated: 2010-06-09 Bibliographically approved
List of papers
1. The visual sort and rate method for perceptual evaluation in listening tests
2003 (English) In: Logopedics, Phoniatrics, Vocology, ISSN 1401-5439, Vol. 28, no 3, 109-116 p. Article in journal (Refereed) Published
Abstract [en]

This paper introduces the Visual Sort and Rate (VSR) method which can be utilized for perceptual rating of sound stimuli. The method facilitates comparing similar stimuli, thus making the rank ordering of the stimuli easier. To examine the potential benefits of the method, it was compared with two other methods for perceptual rating of audio stimuli. The first method was a straightforward computer-based implementation of a visual analogue scale (VAS) allowing multiple playbacks and re-play of previously heard stimuli (C-VAS). The second method utilized a VAS where the responses were given on paper (P-VAS). The three methods were compared by using two sets of stimuli. The first set was a synthetically generated series of stimuli mimicking the vowel /a/ with different spectral tilts. In this test, a single parameter was rated. The second set of stimuli was a naturally spoken voice. For this set of stimuli three parameters were rated. Results show that the VSR method gave better reliability of the subjects' ratings in the single-parameter tests: Pearson and Spearman correlation coefficients were significantly higher for the VSR method than for the other methods. For the multi-parameter, intra-subject test, significantly higher Pearson correlation coefficients were found for the VSR method than for the VAS on paper.
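
The reliability comparison described above rests on Pearson and Spearman correlation coefficients. The following is a minimal sketch of how such coefficients can be computed for two rating passes over the same stimuli. It assumes that reliability is measured by correlating a subject's first and second ratings of each stimulus; the abstract does not spell out the exact procedure, and the function names are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical ratings (0-100 scale) of five stimuli in two passes.
first_pass = [10, 35, 50, 72, 90]
second_pass = [12, 30, 55, 70, 95]
print(pearson(first_pass, second_pass), spearman(first_pass, second_pass))
```

Spearman's coefficient depends only on the rank order of the ratings, which is why it is a natural companion to a method designed to make rank ordering easier.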

National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-13260 (URN)
10.1080/14015430310015255 (DOI)
Note
QC 20100609
Available from: 2010-06-09 Created: 2010-06-09 Last updated: 2010-06-09 Bibliographically approved
2. The correlogram: A visual display of periodicity
2003 (English) In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 114, no 5, 2934-2945 p. Article in journal (Refereed) Published
Abstract [en]

Fundamental frequency (F0) extraction is often used in voice quality analysis. In pathological voices with a high degree of instability in F0, it is common for F0 extraction algorithms to fail. In such cases, the faulty F0 values might spoil the possibilities for further data analysis. This paper presents the correlogram, a new method of displaying periodicity. The correlogram is based on the waveform-matching techniques often used in F0 extraction programs, but with no mechanism to select an actual F0 value. Instead, several candidates for F0 are shown as dark bands. The result is presented as a 3D plot with time on the x axis, correlation delay inverted to frequency on the y axis, and correlation on the z axis. The z axis is represented in a gray scale as in a spectrogram. Delays corresponding to integer multiples of the period time will receive high correlation, thus resulting in candidates at F0, F0/2, F0/3, etc. While the correlogram adds little to F0 analysis of normal voices, it is useful for analysis of pathological voices since it illustrates the full complexity of the periodicity in the voice signal. Also, in combination with manual tracing, the correlogram can be used for semimanual F0 extraction. If so, F0 extraction can be performed on many voices that cause problems for conventional F0 extractors. To demonstrate the properties of the method it is applied to synthetic and natural voices, among them six pathological voices, which are characterized by roughness, vocal fry, gratings/scrape, hypofunctional breathiness and voice breaks, or combinations of these.
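
The idea of a correlogram can be illustrated with a short sketch: for each analysis frame, a window of the signal is correlated with delayed copies of itself, and each delay is converted to a frequency. This is an illustration under stated assumptions, not the paper's implementation. The normalized cross-correlation used here is one common waveform-matching measure, and the function names, window length and hop size are hypothetical.

```python
import math


def frame_correlations(signal, start, window, lags):
    """Normalized correlation between signal[start:start+window]
    and the same-length window delayed by each lag (in samples)."""
    a = signal[start:start + window]
    energy_a = sum(v * v for v in a)
    out = []
    for lag in lags:
        b = signal[start + lag:start + lag + window]
        energy_b = sum(v * v for v in b)
        dot = sum(p * q for p, q in zip(a, b))
        denom = math.sqrt(energy_a * energy_b)
        out.append(dot / denom if denom > 0 else 0.0)
    return out


def correlogram(signal, fs, f_min, f_max, window, hop):
    """Return (times, freqs, rows): rows[i][j] is the correlation at
    time times[i] for the delay whose inverse is the frequency freqs[j]."""
    lags = list(range(int(fs / f_max), int(fs / f_min) + 1))
    freqs = [fs / lag for lag in lags]  # delay inverted to frequency
    max_lag = lags[-1]
    times, rows = [], []
    start = 0
    # Stop when the most delayed window would run past the signal end.
    while start + max_lag + window <= len(signal):
        rows.append(frame_correlations(signal, start, window, lags))
        times.append(start / fs)
        start += hop
    return times, freqs, rows
```

For a strictly periodic test signal, the rows show values close to 1 at the delays for F0, F0/2, F0/3 and so on, which correspond to the candidate bands described in the abstract. Plotting rows as a gray-scale image with times on the x axis and freqs on the y axis would give the display itself. No F0 value is selected at any point.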

Keyword
VOICE QUALITY, ACOUSTIC CHARACTERISTICS, PATHOLOGICAL VOICE, ROUGH VOICE
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-13261 (URN)
10.1121/1.1590972 (DOI)
000186489100038
Note
QC 20100609
Available from: 2010-06-09 Created: 2010-06-09 Last updated: 2010-06-09 Bibliographically approved
3. A method of applying Fourier analysis to high-speed laryngoscopy
2001 (English) In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 110, no 6, 3193-3197 p. Article in journal (Refereed) Published
Abstract [en]

A new method for analysis of digital high-speed recordings of vocal-fold vibrations is presented. The method is based on the extraction of light-intensity time sequences from consecutive images, which in turn are Fourier transformed. The spectra thus acquired can be displayed in four different modes, each having its own benefits. When applied to the larynx, the method visualizes oscillations in the entire laryngeal area, not merely the glottal region. The method was applied to two laryngoscopic high-speed image sequences. Among these examples, covibrations in the ventricular folds and in the mucosa covering the arytenoid cartilages were found. In some cases the covibrations occurred at other frequencies than those of the glottis.
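
The core step described above, extracting a light-intensity time sequence per image point and Fourier transforming it, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes frames are given as nested lists of gray-scale values, uses a plain discrete Fourier transform, and reduces each spectrum to its dominant frequency. That reduction is a hypothetical choice for the sketch and is not claimed to be one of the four display modes mentioned in the abstract.

```python
import cmath
import math


def pixel_spectrum(intensity):
    """Magnitude spectrum (bins 0..n/2) of one pixel's intensity
    sequence, after removing its mean brightness."""
    n = len(intensity)
    mean = sum(intensity) / n
    centered = [v - mean for v in intensity]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                  for t, v in enumerate(centered))
        spectrum.append(abs(acc))
    return spectrum


def dominant_frequency_map(frames, frame_rate):
    """For every pixel, the frequency (Hz) of the strongest spectral
    component across the frame sequence; 0.0 where nothing oscillates."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            series = [frames[t][y][x] for t in range(n)]
            mags = pixel_spectrum(series)
            k = max(range(len(mags)), key=mags.__getitem__)
            row.append(k * frame_rate / n if mags[k] > 1e-9 else 0.0)
        result.append(row)
    return result
```

Because every pixel is analysed independently, oscillations anywhere in the image appear in the result, including regions that vibrate at a different frequency from the glottis.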

Keyword
VOCAL FOLD VIBRATIONS, MATHEMATICAL-MODEL, KYMOGRAPHY, CORDS
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-13262 (URN)
000172731600037
Note
QC 20100609
Available from: 2010-06-09 Created: 2010-06-09 Last updated: 2010-06-09 Bibliographically approved

Granqvist, Svante
