A Deep Learning approach to Analysing Multimodal User Feedback during Adaptive Robot-Human Presentations: A comparative study of state-of-the-art Deep Learning architectures against high performing Machine Learning approaches
KTH, School of Electrical Engineering and Computer Science (EECS).
2023 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
En djupinlärningsmetod för att analysera multimodal användarfeedback under adaptiva presentationer från robotar till människor : En jämförande studie av toppmoderna djupinlärningsarkitekturer mot högpresterande maskininlärningsmetoder (Swedish)
Abstract [en]

When two human beings engage in a conversation, feedback is generally present, since it helps modulate and guide the conversation for the parties involved. When a robotic agent engages in a conversation with a human, the robot cannot understand the human's feedback as another human would. In this thesis, we model human feedback as a Multivariate Time Series to be classified as positive, negative or neutral. We explore state-of-the-art Deep Learning architectures such as InceptionTime, a Convolutional Neural Network approach, and the Time Series Encoder, a Transformer approach. These models achieve state-of-the-art performance in accuracy, loss and F1-score, and improve on all metrics compared with the best-performing approaches in previous studies, such as the Random Forest Classifier. While InceptionTime and the Time Series Encoder reach an accuracy of 85.09% and 84.06% respectively, the Random Forest Classifier lags behind with an accuracy of 81.99%. Moreover, InceptionTime reaches an F1-score of 85.07%, the Time Series Encoder 83.27% and the Random Forest Classifier 77.61%. In addition, we study the data classified by both Deep Learning approaches to identify relevant, redundant and trivial human feedback signals over the whole dataset as well as for the positive, negative and neutral cases.
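As an illustration of the two headline metrics, the following is a minimal sketch of how accuracy and F1-score can be computed for a three-class positive/negative/neutral labelling. The abstract does not state how the F1-score is averaged across classes, so macro averaging is assumed here; the function names and the toy labels are hypothetical and are not taken from the thesis.

```python
# Hypothetical class names matching the three feedback categories in the abstract.
LABELS = ("positive", "negative", "neutral")


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels invented for illustration only, not data from the thesis.
y_true = ["positive", "negative", "neutral", "neutral", "positive", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "positive", "negative"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because macro averaging weights every class equally, a model that neglects a rare class can score noticeably lower on F1 than on accuracy. That is one possible reading of the gap between the Random Forest Classifier's 81.99% accuracy and 77.61% F1-score, although the abstract itself does not give the cause.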

Abstract [sv]

När två människor konverserar, är feedback (återmatning) en del av samtalet eftersom det hjälper till att styra och leda samtalet för de samtalande parterna. När en robot-agent samtalar med en människa, kan den inte förstå denna feedback på samma sätt som en människa skulle kunna. I den här avhandlingen modelleras människans feedback som en flervariabeltidsserie (Multivariate Time Series) som klassificeras som positiv, negativ eller neutral. Vi utforskar toppmoderna djupinlärningsarkitekturer som InceptionTime, en CNN-metod och Time Series Encoder, som är en Transformer-metod. Vi uppnår hög noggrannhet, F1 och lägre värden på förlustfunktionen jämfört med tidigare högst presterande metoder, som Random Forest-metoder. InceptionTime och Time Series Encoder uppnår en noggrannhet på 85,09% respektive 84,06%, men Random Forest-klassificeraren uppnår endast 81,99%. Dessutom uppnår InceptionTime ett F1 på 85,07%, Time Series Encoder 83,27%, och Random Forest-klassificeraren 77,61%. Utöver detta studerar vi data som har klassificerats av båda djupinlärningsmetoderna för att hitta relevanta, redundanta och enklare mänskliga feedback-signaler över hela datamängden, samt för positiva, negativa och neutrala datapunkter.

Place, publisher, year, edition, pages
2023, p. 77
Series
TRITA-EECS-EX ; 2023:393
Keywords [en]
Human Feedback, Deep Learning, Convolutional Neural Networks, Transformers
Keywords [sv]
Mänsklig återmatning, mänsklig feedback, djupinlärning, CNN, transformer
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-333922
OAI: oai:DiVA.org:kth-333922
DiVA, id: diva2:1787723
Supervisors
Examiners
Available from: 2023-08-19
Created: 2023-08-14
Last updated: 2023-08-19
Bibliographically approved

Open Access in DiVA

fulltext (3265 kB), 252 downloads
File information
File name: FULLTEXT01.pdf
File size: 3265 kB
Checksum (SHA-512): f4205c441b5695e342ec46110aa62b8d7b72841eb342dae459a286da54ac3160cae917b9049bba16ec107ee84c626e1a2d52ddcbc63f3fa6252387d907811e5d
Type: fulltext
Mimetype: application/pdf
