
Exploring Cross-sentence Contexts for Named Entity Recognition with BERT

Pyysalo Sampo; Luoma Jouni

Publisher's PDF (336.9 KB)

doi:10.18653/v1/2020.coling-main.78
Permanent address of the publication:
https://urn.fi/URN:NBN:fi-fe2021042825330
Abstract

Named entity recognition (NER) is frequently addressed as a sequence classification task with each input consisting of one sentence of text. It is nevertheless clear that useful information for NER is often found also elsewhere in text. Recent self-attention models like BERT can both capture long-distance relationships in input and represent inputs consisting of several sentences. This creates opportunities for adding cross-sentence information in natural language processing tasks. This paper presents a systematic study exploring the use of cross-sentence information for NER using BERT models in five languages. We find that adding context as additional sentences to BERT input systematically increases NER performance. Having multiple sentences in each input sample also allows us to study the predictions made for the same sentence in different contexts. We propose a straightforward method, Contextual Majority Voting (CMV), to combine these different predictions and demonstrate that it further increases NER performance. Evaluation on established datasets, including the CoNLL'02 and CoNLL'03 NER benchmarks, demonstrates that our proposed approach can improve on the state-of-the-art NER results for English, Dutch, and Finnish, achieves the best reported BERT-based results for German, and is on par with other BERT-based approaches for Spanish. We release all methods implemented in this work under open licenses.


Turun yliopiston kirjasto (Turku University Library) | Turun yliopisto (University of Turku)
julkaisut@utu.fi | Privacy notice | Accessibility statement