Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771–1910
Heli Rantala; Asko Nivala; Hannu Salmi; Aleksi Vesanto; Filip Ginter; Tapio Salakoski
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2021042716757
Abstract
We present the results of text reuse detection on a corpus of scanned and OCR-recognized Finnish newspapers and journals from 1771 to 1910. Our study draws on BLAST, software created for comparing and aligning biological sequences. We show different types of text reuse in this corpus and also present a comparison with Passim, software for text reuse detection developed at Northeastern University in Boston.
Collections
- Self-archived publications [19207]