The correlogram: A visual display of periodicity
Granqvist, Svante
KTH, Superseded Departments, Speech, Music and Hearing. ORCID iD: 0000-0003-4129-9793
2003 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 114, no 5, p. 2934-2945. Article in journal (Refereed), Published
Abstract [en]

Fundamental frequency (F0) extraction is often used in voice quality analysis. In pathological voices with a high degree of instability in F0, it is common for F0 extraction algorithms to fail. In such cases, the faulty F0 values might spoil the possibilities for further data analysis. This paper presents the correlogram, a new method of displaying periodicity. The correlogram is based on the waveform-matching techniques often used in F0 extraction programs, but with no mechanism to select an actual F0 value. Instead, several candidates for F0 are shown as dark bands. The result is presented as a 3D plot with time on the x axis, correlation delay inverted to frequency on the y axis, and correlation on the z axis. The z axis is represented in a gray scale as in a spectrogram. Delays corresponding to integer multiples of the period time will receive high correlation, thus resulting in candidates at F0, F0/2, F0/3, etc. While the correlogram adds little to F0 analysis of normal voices, it is useful for analysis of pathological voices since it illustrates the full complexity of the periodicity in the voice signal. Also, in combination with manual tracing, the correlogram can be used for semimanual F0 extraction. If so, F0 extraction can be performed on many voices that cause problems for conventional F0 extractors. To demonstrate the properties of the method, it is applied to synthetic and natural voices, among them six pathological voices, which are characterized by roughness, vocal fry, gratings/scrape, hypofunctional breathiness and voice breaks, or combinations of these.
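
The display described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the correlation measure, window length or frame spacing, so the normalized cross-correlation, the function names and all parameter values below are assumptions chosen for illustration. The sketch computes one correlation function per time frame, maps each delay to a frequency (sample rate divided by delay), and leaves the gray-scale rendering to the reader.

```python
import math

def normalized_correlation(x, start, window, lag):
    """Correlate x[start:start+window] with the same-length window delayed by `lag`.

    Assumption: normalized cross-correlation stands in for the paper's
    waveform-matching measure, which the abstract does not specify.
    """
    a = x[start:start + window]
    b = x[start + lag:start + lag + window]
    dot = sum(p * q for p, q in zip(a, b))
    energy = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return dot / energy if energy > 0 else 0.0

def correlogram(x, sample_rate, window, hop, f_min, f_max):
    """Return (times, freqs, rows); rows[i][j] is the correlation at times[i], freqs[j]."""
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = int(sample_rate / f_min)
    lags = list(range(min_lag, max_lag + 1))
    freqs = [sample_rate / lag for lag in lags]  # delay inverted to frequency
    times, rows = [], []
    start = 0
    while start + window + max_lag <= len(x):
        rows.append([normalized_correlation(x, start, window, lag) for lag in lags])
        times.append(start / sample_rate)
        start += hop
    return times, freqs, rows

# Synthetic voice stand-in: a 100 Hz sine sampled at 8 kHz (period = 80 samples).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(4000)]
times, freqs, rows = correlogram(signal, sample_rate,
                                 window=240, hop=400, f_min=40, f_max=400)

# Index of a given lag within each row (lags start at sample_rate / f_max = 20).
index_f0 = 80 - 20        # lag of one period, i.e. F0 = 100 Hz
index_f0_half = 160 - 20  # lag of two periods, i.e. F0/2 = 50 Hz
```

In this synthetic case the first frame shows high correlation at both `freqs[index_f0]` (100 Hz) and `freqs[index_f0_half]` (50 Hz), matching the abstract's remark that delays at integer multiples of the period produce candidates at F0, F0/2, and so on. Rendering `rows` with darker shades for higher correlation would give the dark bands described above.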

Place, publisher, year, edition, pages
2003. Vol. 114, no 5, 2934-2945 p.
Keyword [en]
VOICE QUALITY, ACOUSTIC CHARACTERISTICS, PATHOLOGICAL VOICE, ROUGH VOICE
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-13261
DOI: 10.1121/1.1590972
ISI: 000186489100038
OAI: oai:DiVA.org:kth-13261
DiVA: diva2:322974
Note
QC 20100609
Available from: 2010-06-09. Created: 2010-06-09. Last updated: 2017-12-12. Bibliographically approved
In thesis
1. Computer methods for perceptual, acoustic and laryngoscopic voice analysis
2000 (English). Licentiate thesis, comprehensive summary (Other scientific)
Place, publisher, year, edition, pages
Stockholm: KTH, 2000. 9 p.
Series
Trita-TMH, 2000:12
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-1169
ISBN: 91-7283-013-1
Note
QC 20100609
Available from: 2001-07-17. Created: 2001-07-17. Last updated: 2010-06-09. Bibliographically approved
2. Computer methods for voice analysis
2003 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

This thesis consists of five articles and a summary. The thesis deals with methods for measuring properties of the voice. The methods are all computer-based, but utilise different approaches for measuring different aspects of the voice.

Paper I introduces the Visual Sort and Rate (VSR) method for perceptual rating of voice quality. The method is based on the Visual Analogue Scale (VAS), but simultaneously shows all stimuli as icons along the VAS on the computer screen. As the listener places similar-sounding stimuli close to each other during the rating process, comparing stimuli becomes easier.

Paper II introduces the correlogram. Fundamental frequency F0 sometimes cannot be strictly defined, particularly for perturbed voice signals. The method displays multiple consecutive correlation functions in a grey scale image. Thus, the correlogram avoids selecting a single F0 value. Rather it presents an unbiased image of periodicity, allowing the investigator to select among several candidates, if appropriate.

Paper III introduces a method for detection of phonation to be utilised in voice accumulators. The method uses two microphones attached near the subject’s ears. Phase and amplitude relations of the microphone signals are used to form a phonation detector. The output of the method can be used to measure phonation time, speaking time and fundamental frequency of the subject, as well as sound pressure level of both the subject’s voicing and the ambient sounds.

Paper IV introduces a method for Fourier analysis of high-speed laryngoscopic imaging. The data from the consecutive images are re-arranged to form time-series that reflect the time-variation of light intensity in each pixel. Each of these time series is then analysed by means of Fourier transformation, such that a spectrum for each pixel is obtained. Several ways of displaying these spectra are demonstrated.
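
The re-arrangement described for Paper IV can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the frame layout (a list of 2D intensity grids), the function names, the removal of each pixel's mean intensity before transformation, and the plain discrete Fourier transform are all hypothetical choices made to keep the example self-contained.

```python
import cmath
import math

def pixel_time_series(frames, row, col):
    """Collect the light intensity of one pixel across consecutive frames."""
    return [frame[row][col] for frame in frames]

def dft_magnitudes(series):
    """Magnitude spectrum (bins 0..n/2) of a real-valued series via a direct DFT.

    Assumption: the mean is subtracted first so that average brightness
    does not dominate the spectrum.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(centered[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t in range(n))
        magnitudes.append(abs(acc))
    return magnitudes

def per_pixel_spectra(frames):
    """Return spectra[row][col], one magnitude spectrum per pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[dft_magnitudes(pixel_time_series(frames, r, c))
             for c in range(width)]
            for r in range(height)]

# Synthetic 2x2 "recording" of 32 frames: pixel (0, 0) flickers with
# 4 cycles over the recording, the other pixels stay constant.
num_frames = 32
frames = [[[128 + 100 * math.cos(2 * math.pi * 4 * t / num_frames), 50],
           [50, 50]]
          for t in range(num_frames)]

spectra = per_pixel_spectra(frames)
flicker = spectra[0][0]
peak_bin = max(range(len(flicker)), key=flicker.__getitem__)
```

Here `peak_bin` identifies the frequency bin where the flickering pixel has most energy, while the constant pixels have essentially empty spectra. With a known frame rate, bin `k` corresponds to `k * frame_rate / num_frames` Hz; one possible display is an image in which each pixel is shaded by its spectral magnitude at a chosen bin.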

Paper V examines a test set-up for simultaneous recording of airflow, intra-oral pressure, electro-glottography, audio and high-speed imaging. Data are analysed with particular focus on synchronisation between glottal area and inverse filtered airflow. Several methodological aspects are also examined, such as the difficulties in synchronising high-speed imaging data with the other signals.

Place, publisher, year, edition, pages
Stockholm: KTH, 2003. 22 p.
Series
Trita-TMH, 2003:2
Keyword
voice analysis, perceptual analysis, fundamental frequency, correlogram, aperiodicity, Fourier analysis, high-speed imaging, laryngoscopy, vocal fold vibration, voice accumulation.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-3485
ISBN: 91-7283-461-7
Public defence
2003-03-28, 00:00
Note
QC 20100609
Available from: 2003-03-21. Created: 2003-03-21. Last updated: 2010-06-09. Bibliographically approved

Authority records BETA

Granqvist, Svante
