Approximating phonotactic input in children's linguistic environments from orthographic transcripts
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0001-9327-9482
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. Karolinska Institutet (KI), Sweden. ORCID iD: 0000-0002-7829-5561
2017 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, International Speech Communication Association, 2017, Vol. 2017, p. 2213-2217. Conference paper, Published paper (Refereed)
Abstract [en]

Child-directed spoken data is the ideal source of support for claims about children's linguistic environments. However, phonological transcriptions of child-directed speech are scarce, compared to sources like adult-directed speech or text data. Acquiring reliable descriptions of children's phonological environments from more readily accessible sources would mean considerable savings of time and money. The first step towards this goal is to quantify the reliability of descriptions derived from such secondary sources. We investigate how phonological distributions vary across different modalities (spoken vs. written), and across the age of the intended audience (children vs. adults). Using a previously unseen collection of Swedish adult- and child-directed spoken and written data, we combine lexicon look-up and grapheme-to-phoneme conversion to approximate phonological characteristics. The analysis shows distributional differences across datasets both for single phonemes and for longer phoneme sequences. Some of these are predictably attributed to lexical and contextual characteristics of text vs. speech. The generated phonological transcriptions are remarkably reliable. The differences in phonological distributions between child-directed speech and secondary sources highlight a need for compensatory measures when relying on written data or on adult-directed spoken data, and/or for continued collection of actual child-directed speech in research on children's language environments.

Place, publisher, year, edition, pages
International Speech Communication Association, 2017. Vol. 2017, p. 2213-2217
Keywords [en]
Grapheme-To-Phoneme Conversion, Language Acquisition, Phonology
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-222093
DOI: 10.21437/Interspeech.2017-1634
Scopus ID: 2-s2.0-85039166025
OAI: oai:DiVA.org:kth-222093
DiVA, id: diva2:1178952
Conference
18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 August 2017 through 24 August 2017
Note

QC 20180131

Available from: 2018-01-31 Created: 2018-01-31 Last updated: 2018-01-31 Bibliographically approved


Authority records

Edlund, Jens; Götze, Jana
