Mind the Gap: Data Enrichment in Dependency Parsing of Elliptical Constructions
Daniel Zeman; Filip Ginter; Kira Droganova; Jenna Kanerva
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2021042827530
Abstract
In this paper, we focus on parsing rare and non-trivial constructions, in particular ellipsis. We report on several experiments in enriching the training data for this specific construction, evaluated on five languages: Czech, English, Finnish, Russian and Slovak. The data enrichment methods draw upon self-training and tri-training, combined with a stratified sampling method that mimics the structural complexity of the original treebank. In addition, using the same methods, we demonstrate small improvements over the winning system of the CoNLL-17 parsing shared task for four of the five languages; these improvements are not restricted to elliptical constructions.