Template-free Data-to-Text Generation of Finnish Sports News
Permanent address
Online publication
DOI
Abstract
News articles such as sports game reports are often thought to closely follow the underlying game statistics, but in practice they contain a notable amount of background knowledge, interpretation, insight into the game, and quotes that are not present in the official statistics. This poses a challenge for automated data-to-text news generation with real-world news corpora as training data. We report on the development of a corpus of Finnish ice hockey news, edited to be suitable for training end-to-end news generation methods, and demonstrate the generation of text that journalists judged to be relatively close to a viable product. The new dataset and system source code are available for research purposes.
Series
NEALT Proceedings Series