A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output

Maarit Koponen; Leena Salmi; Markku Nikulin

dc.contributor.author: Maarit Koponen
dc.contributor.author: Leena Salmi
dc.contributor.author: Markku Nikulin
dc.date.accessioned: 2022-10-27T12:26:50Z
dc.date.available: 2022-10-27T12:26:50Z
dc.identifier.uri: https://www.utupub.fi/handle/10024/158651
dc.description.abstract: This paper presents a comparison of post-editing (PE) changes performed on English-to-Finnish neural (NMT), rule-based (RBMT) and statistical machine translation (SMT) output, combining a product-based and a process-based approach. A total of 33 translation students acted as participants in a PE experiment providing both post-edited texts and edit process data. Our product-based analysis of the post-edited texts shows statistically significant differences in the distribution of edit types between machine translation systems. Deletions were the most common edit type for the RBMT, insertions for the SMT, and word form changes as well as word substitutions for the NMT system. The results also show significant differences in the correctness and necessity of the edits, particularly in the form of a large number of unnecessary edits in the RBMT output. Problems related to certain verb forms and ambiguity were observed for NMT and SMT, while RBMT was more likely to handle them correctly. Process-based comparison of effort indicators shows a slight increase of keystrokes per word for NMT output, and a slight decrease in average pause length for NMT compared to RBMT and SMT in specific text blocks. A statistically significant difference was observed in the number of visits per sub-segment, which is lower for NMT than for RBMT and SMT. The results suggest that although different types of edits were needed to outputs from NMT, RBMT and SMT systems, the difference is not necessarily reflected in process-based effort indicators.
dc.language.iso: en
dc.publisher: Springer Netherlands
dc.title: A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output
dc.identifier.urn: URN:NBN:fi-fe2021042823802
dc.relation.volume: 33
dc.contributor.organization: fi=englannin kieli|en=Department of English|
dc.contributor.organization: fi=suomen kieli ja suomalais-ugrilainen kielentutkimus|en=Department of Finnish and Finno-Ugric Languages|
dc.contributor.organization: fi=ranska|en=Department of French|
dc.contributor.organization-code: 2602101
dc.contributor.organization-code: 2602107
dc.contributor.organization-code: 2602110
dc.converis.publication-id: 40147635
dc.converis.url: https://research.utu.fi/converis/portal/Publication/40147635
dc.format.pagerange: 90
dc.format.pagerange: 61
dc.identifier.jour-issn: 0922-6567
dc.okm.affiliatedauthor: Koponen, Maarit
dc.okm.affiliatedauthor: Salmi, Leena
dc.okm.affiliatedauthor: Nikulin, Markku
dc.okm.discipline: 6121 Kielitieteet (fi_FI)
dc.okm.discipline: 6121 Languages (en_GB)
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: Journal article
dc.publisher.country: Netherlands (en_GB)
dc.publisher.country: Alankomaat (fi_FI)
dc.publisher.country-code: NL
dc.relation.doi: 10.1007/s10590-019-09228-7
dc.relation.ispartofjournal: Machine Translation
dc.relation.issue: 1-2
dc.year.issued: 2019

