Utvärdering av domänanpassning i maskinöversättningssystem för användning inom MyScania
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Health Informatics and Logistics.
2022 (Swedish)
Independent thesis Basic level (university diploma), 10 credits / 15 HE credits
Student thesis
Alternative title
Evaluation of domain customization in machine translation systems for use in MyScania (English)
Abstract [sv]

This report primarily aims to examine how well machine translation systems can perform relative to Scania’s requirements. The study focuses mainly on the systems’ capacity for domain adaptation and the effect this has on their machine translations. Evaluation is carried out partly with automatic evaluation methods, which in different ways measure correlation with existing text content from various services in the MyScania platform, and partly manually by translators experienced in Scania’s language usage.

The results showed that domain adaptation with in-house data generally increases the quality of machine translations. It was also noted that how well the machine translations perform varies considerably with factors such as language. Google AutoML nevertheless performed best in all the tested languages, in both the automatic and the manual evaluation. The study also revealed weaknesses in automatic evaluation metrics when used on their own, while showing that they can contribute meaningful insights when complemented by human judgment. The study confirms that human judgment should always be used where possible.

Abstract [en]

The primary purpose of this report is to examine how well machine translation systems can perform in relation to Scania’s requirements. The examination focuses on the systems’ capability for domain customization and the effect this has on the resulting machine translations. Evaluation is done partly with multiple automatic metrics, which in different ways measure correlation to existing translations within MyScania, and partly through manual evaluation by translators experienced with Scania’s language usage.

The results showed that domain customization using in-house data generally increases the quality of machine translations. How well the machine translations perform is affected by many factors, such as language; Google AutoML nevertheless performed best in all the tested languages. This was shown by both the automatic metrics and the manual evaluation. The investigation also revealed weaknesses in automatic metrics when used on their own, but showed that they can contribute meaningful insights when complemented by manual evaluation. The investigation confirms that manual evaluation should always be used when possible.

Place, publisher, year, edition, pages
2022. p. 51
Series
TRITA-CBH-GRU ; 2022:266
Keywords [en]
machine translation, neural machine translation, domain customization, automatic metrics, manual evaluation
Keywords [sv]
maskinöversättning, neural maskinöversättning, domänanpassning, automatisk utvärderingsmetrik, manuell utvärdering
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-320097
OAI: oai:DiVA.org:kth-320097
DiVA, id: diva2:1703610
Educational program
Bachelor of Science in Engineering - Computer Engineering
Supervisors
Examiners
Available from: 2022-10-14 Created: 2022-10-13 Last updated: 2022-10-14 Bibliographically approved

Open Access in DiVA

Report (1111 kB), 105 downloads
File information
File name: FULLTEXT01.pdf
File size: 1111 kB
Checksum SHA-512: c4dc539e394b861ba5fbc3a0c94ebc46dedcce4a55ff5a3473ffb06fc97c77260446da79cc1413dc2de281c0ea4e8d8a38680b2fd9ec3b6c4b5734236c4828f3
Type: fulltext
Mimetype: application/pdf
