Finnish Paraphrase Corpus
Abstract
In this paper, we introduce the first fully manually annotated paraphrase corpus for Finnish, containing 53,572 paraphrase pairs harvested from alternative subtitles and news headings. Of all paraphrase pairs in our corpus, 98% are manually classified as paraphrases at least in their given context, if not in all contexts. Additionally, we establish a manual candidate selection method and demonstrate its feasibility for high-quality paraphrase selection in terms of both cost and quality.
Series
Linköping Electronic Conference Proceedings