Silver Syntax Pre-training for Cross-Domain Relation Extraction

dc.contributor.authorBassignana Elisa
dc.contributor.authorGinter Filip
dc.contributor.authorPyysalo Sampo
dc.contributor.authorvan der Goot Rob
dc.contributor.authorPlank Barbara
dc.contributor.organizationfi=data-analytiikka|en=Data-analytiikka|
dc.contributor.organization-code1.2.246.10.2458963.20.68940835793
dc.converis.publication-id181712942
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/181712942
dc.date.accessioned2025-08-28T01:29:31Z
dc.date.available2025-08-28T01:29:31Z
dc.description.abstractRelation Extraction (RE) remains a challenging task, especially when considering realistic out-of-domain evaluations. One of the main reasons for this is the limited training size of current RE datasets: obtaining high-quality (manually annotated) data is extremely expensive and cannot realistically be repeated for each new domain. An intermediate training step on data from related tasks has been shown to be beneficial across many NLP tasks. However, this setup still requires supplementary annotated data, which is often not available. In this paper, we investigate intermediate pre-training specifically for RE. We exploit the affinity between syntactic structure and semantic RE, and identify the syntactic relations which are closely related to RE by being on the shortest dependency path between two entities. We then take advantage of the high accuracy of current syntactic parsers in order to automatically obtain large amounts of low-cost pre-training data. By pre-training our RE model on the relevant syntactic relations, we are able to outperform the baseline in five out of six cross-domain setups, without any additional annotated data.
dc.format.pagerange6984-6993
dc.identifier.isbn978-1-959429-62-3
dc.identifier.olddbid207619
dc.identifier.oldhandle10024/190646
dc.identifier.urihttps://www.utupub.fi/handle/11111/54382
dc.identifier.urlhttps://aclanthology.org/2023.findings-acl.436
dc.identifier.urnURN:NBN:fi-fe2025082787730
dc.language.isoen
dc.okm.affiliatedauthorGinter, Filip
dc.okm.affiliatedauthorPyysalo, Sampo
dc.okm.discipline113 Computer and information sciences (en_GB)
dc.okm.discipline113 Tietojenkäsittely ja informaatiotieteet (fi_FI)
dc.okm.internationalcopublicationinternational co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA4 Conference Article
dc.publisher.countryUnited States (en_GB)
dc.publisher.countryYhdysvallat (USA) (fi_FI)
dc.publisher.country-codeUS
dc.relation.conferenceFindings of the Association for Computational Linguistics
dc.relation.doi10.18653/v1/2023.findings-acl.436
dc.source.identifierhttps://www.utupub.fi/handle/10024/190646
dc.titleSilver Syntax Pre-training for Cross-Domain Relation Extraction
dc.title.bookFindings of the Association for Computational Linguistics: ACL 2023
dc.year.issued2023

Files

Name: Silver Syntax.pdf
Size: 783.05 KB
Format: Adobe Portable Document Format