Phonetic potential in the extant apes and extinct hominins
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-6739-0838
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Several novel claims with bearing on the evolution of speech production are made. It is shown through a series of theoretical, empirical, and computational works that the vocal anatomy of non-human apes, such as gibbons, orangutans, and chimpanzees, allows for the production of variable vowel-like contrasts. These phenomena in extant nonhuman primates are likely consistent with the animals retracting the tongue, potentially homologous with aspects of speech production. However, biomechanical relationships inherent to the primate vocal production apparatus render fluid speech unrealistic. The articulatory configurations necessary to achieve these contrasts likely recruit lingual gestures disparate from those of humans, reflecting disparate anatomy. Novel evidence is also presented, illustrating elementary vocal production learning capacities in chimpanzees. These capacities are thus unlikely to have emerged de novo in our lineage. Building on these two sources of evidence, it is argued that the evolution of speech is not straightforwardly reducible to “neural evolution”. Rather, additional evolutionary pressures must have acted upon hominin ancestors to ultimately trigger the evolution of spoken language. Toward this end, paleoanthropological evidence of articulator evolution in the hominin lineage is explored. The introduction of increasingly complex food processing and tool use, typically argued to have led to widespread anatomical changes in the face and guts of human ancestors, appears simultaneously with changes to the hominin would-be articulatory complex. Potential articulatory benefits of these changes in ancestral hominins are explored. An efficient articulatory apparatus, and the neural substrates by which to efficiently control it, likely evolved simultaneously with the human genus itself.

Abstract [sv]

The thesis presents several arguments bearing on the evolution of speech. The anatomy of non-human primates such as gibbons, orangutans, and chimpanzees allows for the production of a number of vowel-like vocalizations. These phenomena are shown to be consistent with the animals retracting the tongue, a possible homolog of speech production. However, the tongue gestures recruited to achieve these sound qualities likely differ from those studied in human speech, and reflect anatomical limitations of the non-human primate vocal tract. For primates, the inherent biomechanics of the vocal tract appear to prevent fluent, efficient speech sequences. New evidence is also presented, demonstrating a basic capacity for learning speech-like sounds in chimpanzees. This capacity is therefore unlikely to have evolved only in the human genus. The evolution of speech thus cannot be reduced to “neural evolution” alone. Additional and unique evolutionary pressures must have acted on human ancestors to ultimately enable the evolution of spoken language. Paleoanthropological evidence of the evolution of the speech apparatus in extinct humans is explored. Evidence of increasingly complex tool manufacture appears together with widespread anatomical changes in the face of human ancestors. The thesis examines the phonetic consequences of these changes. An efficient speech apparatus, and the neural substrates for controlling it, likely evolved together in the emerging modern human.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2024, p. 71
Series
TRITA-EECS-AVL ; 55
Keywords [en]
Evolution of speech, speech acoustics, source/filter theory, primatology, evolutionary anthropology
Keywords [sv]
Talevolution, talakustik, källa/filter-teori, primatologi, evolutionär antropologi
National Category
General Language Studies and Linguistics
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-351250
ISBN: 978-91-8040-967-4 (print)
OAI: oai:DiVA.org:kth-351250
DiVA, id: diva2:1886743
Public defence
2024-09-26, Fantum, Lindstedtsvägen 24, Stockholm, 15:00 (English)
Opponent
Supervisors
Note

QC 20240805

Available from: 2024-08-05 Created: 2024-08-04 Last updated: 2024-08-14 Bibliographically approved
List of papers
1. Correcting the record: Phonetic potential of primate vocal tracts and the legacy of Philip Lieberman (1934−2022)
2024 (English) In: American Journal of Primatology, ISSN 0275-2565, E-ISSN 1098-2345, Vol. 86, no 8. Article, review/survey (Refereed) Published
Abstract [en]

The phonetic potential of nonhuman primate vocal tracts has been the subject of considerable contention in recent literature. Here, the work of Philip Lieberman (1934−2022) is considered at length, and two research papers (both purported challenges to Lieberman's theoretical work) and a review of Lieberman's scientific legacy are critically examined. I argue that various aspects of Lieberman's research have been consistently misinterpreted in the literature. A paper by Fitch et al. overestimates the would‐be “speech‐ready” capacities of a rhesus macaque, and the data presented nonetheless supports Lieberman's principal position: that nonhuman primates cannot articulate the full extent of human speech sounds. The suggestion that no vocal anatomical evolution was necessary for the evolution of human speech (as spoken by all normally developing humans) is not supported by phonetic or anatomical data. The second challenge, by Boë et al., attributes vowel‐like qualities of baboon calls to articulatory capacities based on audio data; I argue that such “protovocalic” properties likely result from disparate articulatory maneuvers compared to human speakers. A review of Lieberman's scientific legacy by Boë et al. ascribes a view of speech evolution (which the authors term “laryngeal descent theory”) to Lieberman, which contradicts his writings. The present article documents a pattern of incorrect interpretations of Lieberman's theoretical work in recent literature. Finally, the apparent trend of vowel‐like formant dispersions in great ape vocalization literature is discussed with regard to Lieberman's theoretical work. The review concludes that the “Lieberman account” of primate vocal tract phonetic capacities remains supported by research: the ready articulation of fully human speech reflects species‐unique anatomy.

Place, publisher, year, edition, pages
Wiley-Blackwell, 2024
Keywords
Phonetics
National Category
Languages and Literature
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351239 (URN)
10.1002/ajp.23637 (DOI)
001220622800001 ()
38741274 (PubMedID)
2-s2.0-85192844535 (Scopus ID)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2025-12-05 Bibliographically approved
2. PREQUEL: SUPERVISED PHONETIC APPROACHES TO ANALYSES OF GREAT APE QUASI-VOWELS
2023 (English) In: ICPhS 2023, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

There is renewed interest in potential vowel production by nonhuman primates, but no agreed-upon methodologies for its estimation from real-life vocalizations. Here, we present a set of supervised approaches for estimating primate vowel-like articulation, with reference to orangutan long call pulses (N=36). We summarize our approach as a cohesive framework, the Primate Quasi-Vowel (PREQUEL) protocol. We estimated (1) f0 from correlograms and (2) vocal tract resonances (formants) from spectrograms, (3) the results of which were then compared against synthesized vowels for those frequency values, and (4) presented to uninformed listeners (N=16), who largely agreed on the categorization of vowel-like qualities for vocalizations (Cronbach’s alpha=.701). We also provide descriptions of methods that are seemingly inadequate for formant estimation in great ape calls. We argue that a combination of phonetic methods is required to develop a science of nonhuman primate articulation.
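
The listener agreement statistic reported above, Cronbach's alpha, can be illustrated with a minimal sketch. This is not the authors' analysis code: it assumes each listener's response has been turned into a numeric score per vocalization, and the function name `cronbach_alpha` and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a stimuli-by-listeners table of numeric scores.

    ratings: list of rows, one row per vocalization, one score per listener.
    (Hypothetical layout; the paper does not describe its data format.)
    """
    n_listeners = len(ratings[0])
    if n_listeners < 2 or len(ratings) < 2:
        raise ValueError("need at least two listeners and two stimuli")
    # Sample variance of each listener's scores across stimuli.
    listener_vars = [
        variance([row[j] for row in ratings]) for j in range(n_listeners)
    ]
    # Sample variance of the summed score per stimulus.
    total_var = variance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("summed scores do not vary; alpha is undefined")
    return n_listeners / (n_listeners - 1) * (1 - sum(listener_vars) / total_var)


# Toy data (invented for illustration): 4 vocalizations scored by 3 listeners.
toy_scores = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 5, 4],
    [2, 2, 3],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha approaches 1 when listeners' scores rise and fall together across vocalizations, and drops toward 0 as their scores become unrelated.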

National Category
General Language Studies and Linguistics
Identifiers
urn:nbn:se:kth:diva-351247 (URN)
Conference
ICPhS 2023, August 7-11, Prague, Czech Republic
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved
3. Gibbon vowel-like quality is tied to superhuman articulator landmarks
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Understanding the production of vowel-like sounds by nonhuman primates requires bridging the gap between animal calls and human speech. In particular, it has been argued that call behaviors whose acoustic properties seemingly overlap with human vowels reflect disparate production mechanisms. However, there are as yet no agreed-upon methods for the study of primate articulation. We explore the potential of video content analysis for illuminating the production of primate vowel‒like qualities and provide evidence bearing on primate articulation using an audiovisual corpus (N=29 videos) of gibbon (Hylobatidae) call behavior, sourced from online video sites. Videos were coded for apparent jaw height, lip rounding, visible tongue movements, visible inflation of species’ throat sacs (for Symphalangus syndactylus), and the audible vowel‒like or diphthong‒like quality of the call. Results illustrate that the vowel-like quality of gibbon calls is tightly connected to degrees of jaw height and lip rounding. For “diphthong‒like” vocalizations, articulator trajectories effectively constitute movements between the same two extremes: high jaw with rounded lips, and low jaw with flared oral cavity. Animals’ tongues were often visible but never observed moving to systematically affect the vowel-like quality of the call (though several instances of possible tongue retraction were noted). The study adds to available methods for the study of articulation by nonhuman primates.

Keywords
Phonetics, Primatology
National Category
Zoology
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351241 (URN)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved
4. Reverse engineering great ape vocal tract configurations with implications for evolving speech biomechanics
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Great ape call production may inform research on the evolution of speech but remains poorly understood. The vowel-like qualities of long-distance vocalizations of nonhuman great apes are seemingly both acoustically and perceptually comparable to the human close back vowel [u]. However, nonhuman great ape vocal tract morphology, including the species-typical tongue and lack of an expanded pharynx, precludes comparable articulation. Here, we explore possible vocal tract configurations underlying chimpanzee (Pan troglodytes) pant hoots, gorilla (Gorilla gorilla) hoots, and orangutan (Pongo abelii) long calls. We present the results of computer simulations of acoustic tube vocal tract models based on MRI great ape articulator data and behavior. Predicted first and second formant simulation data were compared against data collected from adult male chimpanzees, silverback male gorillas, and flanged male orangutans in the wild. We explored the explanatory value of four sets of models corresponding to (i) uniform tubes, (ii) narrowed lip passage, (iii) narrowed and extremely protruded lip passage, and (iv) a “retracted model”, with dorsal oral tract constriction achieved via tongue retraction, combined with a narrowed lip passage. Our results show that great ape hoot data are most consistent with an articulatory model assuming dorsal oral tract stricture through tongue retraction (the only model to achieve a fit for the second formant). Our work indicates articulatory configurations employed in great ape call production may exist in continuity with speech production, without being identical to those observed in modern humans.
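
As a rough illustration of the simplest model family named above, (i) uniform tubes, the sketch below uses the standard textbook quarter-wavelength approximation for a tube closed at one end (glottis) and open at the other (lips): F_n = (2n − 1)·c / (4L). It is not the manuscript's simulation code. The function name, the assumed speed of sound, and the tube lengths are illustrative assumptions rather than values from the manuscript, and models (ii) to (iv) require multi-section tube models that are not covered here.

```python
def uniform_tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at the glottis, open at the lips.

    length_m: tube length in metres (hypothetical input, not measured data).
    speed_of_sound: metres per second; 350 m/s is an assumed value for warm,
    humid air.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    # Quarter-wavelength resonator: odd multiples of c / (4 * L).
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_formants + 1)
    ]


# Illustrative lengths only: a longer tube lowers every resonance proportionally.
for length in (0.175, 0.25):
    print(length, [round(f) for f in uniform_tube_formants(length)])
```

Because a uniform tube always spaces its resonances at a fixed 1:3:5 ratio, it cannot bring the second formant down close to the first, as in an [u]-like quality. This is one way to see why constricted models are tested alongside it.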

Keywords
Phonetics, Primatology
National Category
Zoology
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351244 (URN)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved
5. Evolution and function of hominid air sacs: A synthesis bearing on vowel production
(English) Manuscript (preprint) (Other academic)
Abstract [en]

This text synthesizes the available literature on the effects of great ape air sacs and highlights shortcomings of previous approaches in understanding the roles of these intriguing organs in vocal communication and evolution. Several points of yet unattended nuance are highlighted. The interpretation that A. afarensis possessed air sacs is based on a single hyoid likely belonging to a juvenile female, and australopiths likely exhibited a combination of sexually dimorphic traits unique among primates, hindering inferences at the level of species or genera. While much of the modern literature on air sacs asserts a meaningful relationship between their presence and evolving speech production capacities, no empirical works analyze non-human great ape vocalizations from a viewpoint of evaluating air sac function hypotheses. We conclude that there is little support for the hypothesis that air sacs seriously impede potential vowel production by extant great apes. 

Keywords
Phonetics, Primatology
National Category
Zoology
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351245 (URN)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved
6. No neural “missing link” for verbal control in chimpanzees
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Nonhuman great apes have been claimed to be unable to learn human words due to a lack of the necessary neural circuitry. We recovered original footage of two enculturated chimpanzees uttering the word “mama” and subjected recordings to phonetic analysis. Our analyses demonstrate that chimpanzees are capable of syllabic production, achieving consonant-to-vowel phonetic contrasts via the simultaneous recruitment and coupling of voice, jaw and lips. In an online experiment, human listeners naive to the recordings’ origins reliably perceived chimpanzee utterances as syllabic utterances, primarily as “ma-ma”, among foil syllables. Our findings demonstrate that in the absence of direct data-driven examination, great ape vocal production capacities have been underestimated. Chimpanzees possess the neural building blocks necessary for speech.

Keywords
Phonetics, Primatology, Vocal learning
National Category
General Language Studies and Linguistics
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351249 (URN)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved
7. Phonetic correlates of hominin evolution in the late Pliocene and Pleistocene epochs: Becoming pre-adapted for speech
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Despite decades of research, the field of language evolution lacks a cohesive integrative account, capable of explicating possible linguistic evolution throughout the development of modern humans. We review archaeological findings in search of a timeline during which features of the modern human articulatory morphology emerged. Rudimentary systems of speech may have driven selection for a vocal tract “optimal” for speech in early humans. However, a range of other factors have also enacted substantial morphological changes to the would-be speech articulators. The incorporation of processed and (ultimately) cooked food in the Homo lineage likely facilitated significant reduction of mandible and masticatory muscles, decreased the time spent masticating, and may have been maintainable in the lineage because food processing had already been outsourced to the hands and rudimentary stone tools (reducing selection pressure for robust jaws). The articulatory anatomy of early human ancestors is limited with regard to human speech sounds, but theoretically allows for a greater range of sounds than are observed in nature. We suggest that with decreased pressure to maintain anatomical elements required for mastication of foods that are mechanically challenging to eat, the would-be articulatory complex of human ancestors may have become pre-adapted for the development toward fully modern human speech. 

Keywords
Archaeology, Human evolution, Phonetics
National Category
Archaeology
Research subject
Speech and Music Communication
Identifiers
urn:nbn:se:kth:diva-351246 (URN)
Note

QC 20240805

Available from: 2024-08-04 Created: 2024-08-04 Last updated: 2024-08-05 Bibliographically approved

Open Access in DiVA

fulltext (54755 kB), 931 downloads
File information
File name FULLTEXT01.pdf
File size 54755 kB
Checksum SHA-512
851bc254a80e5938e9bd7c2ae33c20d90151b751a45af46ce32fad23184b0fe5d57390755591da76b0a436e18573f6e92a4b4b753e4580c66b99b73b88f98a09
Type fulltext
Mimetype application/pdf

Search in DiVA

By author/editor
Ekström, Axel
By organisation
Speech, Music and Hearing, TMH
General Language Studies and Linguistics

Total: 936 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 793 hits