Dependency parsing of biomedical text with BERT

dc.contributor.authorKanerva Jenna
dc.contributor.authorGinter Filip
dc.contributor.authorPyysalo Sampo
dc.contributor.organizationfi=kieli- ja puheteknologia|en=Language and Speech Technology|
dc.contributor.organization-code1.2.246.10.2458963.20.47465613983
dc.converis.publication-id51380265
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/51380265
dc.date.accessioned2022-02-25T16:08:24Z
dc.date.available2022-02-25T16:08:24Z
dc.description.abstractBackground: Syntactic analysis, or parsing, is a key task in natural language processing and a required component for many text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the leading formalism for dependency parsing. While a number of recent tasks centering on UD have substantially advanced the state of the art in multilingual parsing, there has been only little study of parsing texts from specialized domains such as biomedicine. Methods: We explore the application of state-of-the-art neural dependency parsing methods to biomedical text using the recently introduced CRAFT-SA shared task dataset. The CRAFT-SA task broadly follows the UD representation and recent UD task conventions, allowing us to fine-tune the UD-compatible Turku Neural Parser and UDify neural parsers to the task. We further evaluate the effect of transfer learning using a broad selection of BERT models, including several models pre-trained specifically for biomedical text processing. Results: We find that recently introduced neural parsing technology is capable of generating highly accurate analyses of biomedical text, substantially improving on the best performance reported in the original CRAFT-SA shared task. We also find that initialization using a deep transfer learning model pre-trained on in-domain texts is key to maximizing the performance of the parsing methods. Keywords: Parsing, Deep learning, CRAFT
dc.identifier.eissn1471-2105
dc.identifier.jour-issn1471-2105
dc.identifier.olddbid170146
dc.identifier.oldhandle10024/153256
dc.identifier.urihttps://www.utupub.fi/handle/11111/29228
dc.identifier.urnURN:NBN:fi-fe2021042820764
dc.language.isoen
dc.okm.affiliatedauthorKanerva, Jenna
dc.okm.affiliatedauthorGinter, Filip
dc.okm.affiliatedauthorPyysalo, Sampo
dc.okm.discipline113 Computer and information sciencesen_GB
dc.okm.discipline113 Tietojenkäsittely ja informaatiotieteetfi_FI
dc.okm.internationalcopublicationnot an international co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA1 ScientificArticle
dc.publisherBiomed Central Ltd.
dc.publisher.countryUnited Kingdomen_GB
dc.publisher.countryBritanniafi_FI
dc.publisher.country-codeGB
dc.relation.articlenumber580
dc.relation.doi10.1186/s12859-020-03905-8
dc.relation.ispartofjournalBMC Bioinformatics
dc.relation.issueSuppl 23
dc.relation.volume21
dc.source.identifierhttps://www.utupub.fi/handle/10024/153256
dc.titleDependency parsing of biomedical text with BERT
dc.year.issued2020

Files

Name:
s12859-020-03905-8.pdf
Size:
941.78 KB
Format:
Adobe Portable Document Format
Description:
Publisher's PDF