Neural Machine Translation and Finnish Case-Inflections: Translation Problems and Pre-editing Possibilities
Rantanen, Mikael (2024-05-31)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
open
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2024061352418
Abstract
Machine translation has slowly become more prevalent since the 1950s. The most recent leap in its development is neural machine translation, which has quickly been deployed in professional translation settings where human translators post-edit machine-translated text. Neural machine translation is still not a perfect form of machine translation. One problem when translating between English and Finnish is the Finnish case system, as Finnish case-inflections can be difficult for machine translation engines.
This thesis evaluates how accurately the four most prominent neural machine translation engines translate case-inflections from English to Finnish, and proposes pre-editing, or pre-translation, as a possible solution to the problem. The four engines featured are Google Translate, Microsoft Translator, Amazon Translate, and DeepL Translator.
Samples were gathered from news media, a work of fiction, and a scientific book to account for differences in textual style. These samples were given as input to the neural machine translation engines, and each translation was given a numerical score on a seven-level evaluation scale. The samples whose scores were deemed unsatisfactory were selected, and the possible reasons for their inaccuracy were examined.
This further examination showed that the frequency with which a case-inflection occurs predicts, to some degree, how accurately a machine translator uses it. Unconventional syntactic structure and longer sentence length were also shown to predict accuracy problems in machine translation. The results suggested that pre-editing would not be more efficient than post-editing.