Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
Aleksi Vesanto; Asko Nivala; Heli Rantala; Tapio Salakoski; Hannu Salmi; Filip Ginter
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2021042716757
Abstract
We present the results of text reuse detection based on the corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, a software tool created for comparing and aligning biological sequences. We show different types of text reuse in this corpus and also present a comparison with Passim, software for text reuse detection developed at Northeastern University in Boston.
