Bystedt, Mattias
Publications (2 of 2)
Bystedt, M. & Edlund, J. (2019). New applications of gaze tracking in speech science. In: CEUR Workshop Proceedings. Paper presented at 4th Conference on Digital Humanities in the Nordic Countries, DHN 2019, 5-8 March 2019, Copenhagen, Denmark (pp. 73-78). CEUR-WS
2019 (English). In: CEUR Workshop Proceedings, CEUR-WS, 2019, p. 73-78. Conference paper, Published paper (Refereed)
Abstract [en]

We present an overview of speech research applications of gaze tracking technology, where gaze behaviours are exploited as a tool for analysis rather than as a primary object of study. The methods presented are all in their infancy, but can greatly assist the analysis of digital audio and video as well as unlock the relationship between writing and other encodings on the one hand, and natural language, such as speech, on the other. We discuss three directions in this type of gaze tracking application: modelling of text that is read aloud, evaluation and annotation with naïve informants, and evaluation and annotation with expert annotators. In each of these areas, we use gaze tracking information to gauge the behaviour of people when working with speech and conversation, rather than when reading text aloud or partaking in conversations, in order to learn something about how the speech may be analysed from a human perspective.

Place, publisher, year, edition, pages
CEUR-WS, 2019
Keywords
Annotation, Gaze tracking, Label acquisition, Speech technology, Speech, Speech recognition, Technology transfer, Gaze behaviours, Human perspectives, Natural languages, New applications, Speech research, Eye tracking
National Category
Natural Language Processing
Identifiers
urn:nbn:se:kth:diva-280554 (URN); 2-s2.0-85066039643 (Scopus ID)
Conference
4th Conference on Digital Humanities in the Nordic Countries, DHN 2019, 5-8 March 2019, Copenhagen, Denmark
Note

QC 20200909

Available from: 2020-09-09. Created: 2020-09-09. Last updated: 2025-02-07. Bibliographically approved
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., David Aguas Lopes, J., Malisz, Z., . . . Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paper presented at The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7-12 May 2018 (pp. 3969-3974). Paris: European Language Resources Association
2018 (English). In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris: European Language Resources Association, 2018, p. 3969-3974. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present (1) a processing architecture used to collect multi-modal sensor data, both for corpora collection and real-time processing, (2) an open-source implementation thereof and (3) a use-case where we deploy the architecture in a multi-party deception game, featuring six human players and one robot. The architecture is agnostic to the choice of hardware (e.g. microphones, cameras, etc.) and programming languages, although our implementation is mostly written in Python. In our use-case, different methods of capturing verbal and non-verbal cues from the participants were used. These were processed in real-time and used to inform the robot about the participants’ deceptive behaviour. The framework is of particular interest for researchers who are interested in the collection of multi-party, richly recorded corpora and the design of conversational systems. Moreover for researchers who are interested in human-robot interaction the available modules offer the possibility to easily create both autonomous and wizard-of-Oz interactions.

Place, publisher, year, edition, pages
Paris: European Language Resources Association, 2018
National Category
Natural Sciences; Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-230237 (URN); 000725545004009 (); 2-s2.0-85058179983 (Scopus ID)
Conference
The Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7-12 May 2018
Note

Part of proceedings ISBN 979-10-95546-00-9

QC 20180618

Available from: 2018-06-13. Created: 2018-06-13. Last updated: 2022-09-22. Bibliographically approved