Designing a Question Answering System in the Domain of Swedish Technical Consulting Using Deep Learning
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Design av ett frågebesvarande system inom svensk konsultverksamhet med användning av djupinlärning (Swedish)
Abstract [en]

Question Answering systems are greatly sought after in many areas of industry. Unfortunately, as most research in Natural Language Processing is conducted on English, the applicability of such systems to other languages is limited. Moreover, these systems often struggle with long text sequences.

This thesis explores the possibility of applying existing models to the Swedish language, in a domain where the syntax and semantics differ greatly from typical Swedish texts. Additionally, the text length may vary arbitrarily. To solve these problems, transfer learning techniques and state-of-the-art Question Answering models are investigated. Furthermore, a novel, divide-and-conquer based technique for processing long texts is developed.

Results show that the transfer learning is partly unsuccessful, but that the system nevertheless performs reasonably well in the new domain. Furthermore, the new technique greatly improves the system's performance on longer text sequences.

Abstract [sv] (English translation)

Systems that answer questions about a given text are highly sought after in many fields of work. Because the majority of research in natural language processing deals with English text, most systems are not directly applicable to other languages. In addition, these systems often have difficulty handling long text sequences.

This report explores the possibility of applying existing models to the Swedish language, in a domain where the syntax and semantics of the language differ strongly from typical Swedish texts. Moreover, the length of the texts can vary arbitrarily. To solve these problems, several transfer learning techniques and state-of-the-art question answering models are investigated. A new method for processing long texts, based on a decomposition algorithm, is developed.

The results show that transfer learning partly fails given the domain and the models, but that the system still performs relatively well in the new domain. In addition, the system is shown to perform well on long texts with the help of the new method.

Place, publisher, year, edition, pages
2018.
Series
TRITA-EECS-EX ; 2018:322
Keywords [en]
Question Answering, Deep Learning, Machine Learning, Transfer Learning, Natural Language Processing, Technical Consulting, Word Embeddings, Divide and Conquer
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-231586
OAI: oai:DiVA.org:kth-231586
DiVA, id: diva2:1229461
External cooperation
Aiwizo AB
Educational program
Master of Science in Engineering - Engineering Physics
Supervisors
Examiners
Available from: 2018-08-14
Created: 2018-06-30
Last updated: 2018-08-14
Bibliographically approved

Open Access in DiVA

fulltext (2290 kB), 220 downloads
File information
File name: FULLTEXT01.pdf
File size: 2290 kB
Checksum (SHA-512): 727203c289652dda4af7177dd01f82da788281dab07bf7ca8aa0016ea1e93d2ad4aa20e5fb8242b83359eb73b697a42bea673503646ee94a96401127fea8c171
Type: fulltext
Mimetype: application/pdf
