A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output

dc.contributor.author: Maarit Koponen
dc.contributor.author: Leena Salmi
dc.contributor.author: Markku Nikulin
dc.contributor.organization: fi=digitaalinen kielentutkimus, espanja, italia, kiina, ranska, saksa|en=Digital Language Studies, Chinese, French, German, Italian, Spanish|
dc.contributor.organization: fi=englannin kieli, klassilliset kielet ja monikielinen käännösviestintä|en=English, Classics and Multilingual Translation Studies|
dc.contributor.organization: fi=kotimaiset kielet ja niiden sukukielet|en=Finnish, Finno-Ugric and Scandinavian languages|
dc.contributor.organization-code: 1.2.246.10.2458963.20.22758552511
dc.contributor.organization-code: 1.2.246.10.2458963.20.36764574459
dc.contributor.organization-code: 1.2.246.10.2458963.20.59108485091
dc.converis.publication-id: 40147635
dc.converis.url: https://research.utu.fi/converis/portal/Publication/40147635
dc.date.accessioned: 2022-10-27T12:26:50Z
dc.date.available: 2022-10-27T12:26:50Z
dc.description.abstract: This paper presents a comparison of post-editing (PE) changes performed on English-to-Finnish neural (NMT), rule-based (RBMT) and statistical machine translation (SMT) output, combining a product-based and a process-based approach. A total of 33 translation students acted as participants in a PE experiment providing both post-edited texts and edit process data. Our product-based analysis of the post-edited texts shows statistically significant differences in the distribution of edit types between machine translation systems. Deletions were the most common edit type for the RBMT, insertions for the SMT, and word form changes as well as word substitutions for the NMT system. The results also show significant differences in the correctness and necessity of the edits, particularly in the form of a large number of unnecessary edits in the RBMT output. Problems related to certain verb forms and ambiguity were observed for NMT and SMT, while RBMT was more likely to handle them correctly. Process-based comparison of effort indicators shows a slight increase in keystrokes per word for NMT output, and a slight decrease in average pause length for NMT compared to RBMT and SMT in specific text blocks. A statistically significant difference was observed in the number of visits per sub-segment, which is lower for NMT than for RBMT and SMT. The results suggest that although different types of edits were needed for the output of the NMT, RBMT and SMT systems, the difference is not necessarily reflected in process-based effort indicators.
dc.format.pagerange: 61
dc.format.pagerange: 90
dc.identifier.jour-issn: 0922-6567
dc.identifier.olddbid: 175557
dc.identifier.oldhandle: 10024/158651
dc.identifier.uri: https://www.utupub.fi/handle/11111/30890
dc.identifier.urn: URN:NBN:fi-fe2021042823802
dc.language.iso: en
dc.okm.affiliatedauthor: Koponen, Maarit
dc.okm.affiliatedauthor: Salmi, Leena
dc.okm.affiliatedauthor: Nikulin, Markku
dc.okm.discipline: 6121 Languages [en_GB]
dc.okm.discipline: 6121 Kielitieteet [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Springer Netherlands
dc.publisher.country: Netherlands [en_GB]
dc.publisher.country: Alankomaat [fi_FI]
dc.publisher.country-code: NL
dc.relation.doi: 10.1007/s10590-019-09228-7
dc.relation.ispartofjournal: Machine Translation
dc.relation.issue: 1-2
dc.relation.volume: 33
dc.source.identifier: https://www.utupub.fi/handle/10024/158651
dc.title: A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output
dc.year.issued: 2019

Files

Name:
A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output.pdf
Size:
664.8 KB
Format:
Adobe Portable Document Format
Description:
Publisher's PDF