Grammatical Error Correction Using Large Language Models: A Case Study on Universal Dependencies Treebanks
| dc.contributor.author | Jalali, Arvin | |
| dc.contributor.department | Department of Computing | |
| dc.contributor.faculty | Faculty of Technology | |
| dc.contributor.studysubject | Information and Communication Technology | |
| dc.date.accessioned | 2025-06-26T21:06:26Z | |
| dc.date.available | 2025-06-26T21:06:26Z | |
| dc.date.issued | 2025-06-23 | |
| dc.description.abstract | This thesis addresses Grammatical Error Correction (GEC) in two phases. The first phase investigates the use of Universal Dependencies (UD), a cross-linguistically consistent framework for syntactic annotation, to support error analysis in GEC, focusing on the Typo=Yes feature. Tokens marked with Typo=Yes were extracted from three UD treebanks (UD English EWT, UD English GUM, and UD Finnish TDT) and manually annotated according to the criteria of the ERRANT framework, which is designed to classify grammatical errors consistently. This enabled detailed cross-dataset and cross-linguistic error analysis. The second phase evaluates the ability of a Large Language Model (LLM) to classify grammatical errors using structured prompts based on the ERRANT framework. Both zero-shot and few-shot prompting techniques were applied, and the LLM's performance was compared against the manually annotated gold standards developed during the first phase. This work aims to bridge linguistic annotation frameworks and neural language models to advance GEC systems. | |
| dc.format.extent | 111 | |
| dc.identifier.olddbid | 199459 | |
| dc.identifier.oldhandle | 10024/182490 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/20102 | |
| dc.identifier.urn | URN:NBN:fi-fe2025062674959 | |
| dc.language.iso | eng | |
| dc.rights | This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited. | |
| dc.rights.accessrights | open | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/182490 | |
| dc.subject | Grammatical Error Correction, Universal Dependencies, ERRANT Framework, Large Language Models, Prompt Engineering | |
| dc.title | Grammatical Error Correction Using Large Language Models: A Case Study on Universal Dependencies Treebanks | |
| dc.type.ontasot | Master's thesis | |
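
The extraction step described in the abstract (collecting tokens marked with Typo=Yes from UD treebanks) can be illustrated with a minimal sketch. This is not the thesis's own code. It assumes the treebanks are read as standard CoNLL-U text with ten tab-separated columns, where FEATS is the sixth column and MISC the tenth, and that a corrected form, when present, is stored in MISC as `CorrectForm`. The function name `typo_tokens` and the sample sentence are hypothetical.

```python
def typo_tokens(conllu_text):
    """Yield (sent_id, form, correct_form) for tokens whose FEATS include Typo=Yes.

    Illustrative helper only; correct_form is None when MISC has no CorrectForm.
    """
    sent_id = None
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id"):
            # Comment line of the form "# sent_id = <identifier>"
            sent_id = line.split("=", 1)[1].strip()
            continue
        if not line or line.startswith("#"):
            continue  # blank sentence separator or other comment line
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        feats = cols[5].split("|") if cols[5] != "_" else []
        if "Typo=Yes" in feats:
            # MISC holds pipe-separated key=value pairs, or "_" when empty
            misc = dict(kv.split("=", 1) for kv in cols[9].split("|") if "=" in kv)
            yield sent_id, cols[1], misc.get("CorrectForm")


# Made-up sample sentence, not taken from any of the treebanks named above.
SAMPLE = "\n".join([
    "# sent_id = demo-1",
    "# text = I recieve mail .",
    "1\tI\tI\tPRON\tPRP\tCase=Nom|Number=Sing|Person=1|PronType=Prs\t2\tnsubj\t_\t_",
    "2\trecieve\treceive\tVERB\tVBP\tMood=Ind|Tense=Pres|Typo=Yes|VerbForm=Fin\t0\troot\t_\tCorrectForm=receive",
    "3\tmail\tmail\tNOUN\tNN\tNumber=Sing\t2\tobj\t_\t_",
    "4\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_",
])

for sent, form, fix in typo_tokens(SAMPLE):
    print(sent, form, fix)
```

Each extracted token could then be paired with its sentence and assigned an ERRANT error category by an annotator, which is the kind of manual annotation the abstract describes.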