Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Who am I speaking at?: perceiving the head orientation of speakers from acoustic cues alone
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.ORCID iD: 0000-0001-9327-9482
Stockholm University, Faculty of Humanities, Department of Linguistics.
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.ORCID iD: 0000-0002-0397-6442
2012 (English)In: Proc. of LREC Workshop on Multimodal Corpora 2012, Istanbul, Turkey, 2012Conference paper, Published paper (Refereed)
Abstract [en]

The ability of people, and of machines, to determine the position of a sound source in a room is well studied. The related ability to determine the orientation of a directed sound source, on the other hand, is not, but the few studies there are show people to be surprisingly skilled at it. This has bearing for studies of face-to-face interaction and of embodied spoken dialogue systems, as sound source orientation of a speaker is connected to the head pose of the speaker, which is meaningful in a number of ways. We describe in passing some preliminary findings that led us onto this line of investigation, and in detail a study in which we extend an experiment design intended to measure perception of gaze direction to test instead for perception of sound source orientation. The results corroborate those of previous studies, and further show that people are very good at performing this skill outside of studio conditions as well.

Place, publisher, year, edition, pages
Istanbul, Turkey, 2012.
National Category
Computer Science Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-106998OAI: oai:DiVA.org:kth-106998DiVA: diva2:574368
Conference
LREC Workshop on Multimodal Corpora 2012
Funder
ICT - The Next Generation
Note

tmh_import_12_12_05, tmh_id_3721. QC 20121217

Available from: 2012-12-05 Created: 2012-12-05 Last updated: 2013-04-19Bibliographically approved

Open Access in DiVA

No full text

Authority records BETA

Edlund, JensGustafson, Joakim

Search in DiVA

By author/editor
Edlund, JensHeldner, MattiasGustafson, Joakim
By organisation
Speech Communication and Technology
Computer ScienceLanguage Technology (Computational Linguistics)

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 60 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf