Automatic text summarization

dc.contributor.author: Anttila, Pauliina
dc.contributor.department: fi=Tulevaisuuden teknologioiden laitos|en=Department of Future Technologies|
dc.contributor.faculty: fi=Luonnontieteiden ja tekniikan tiedekunta|en=Faculty of Science and Engineering|
dc.contributor.studysubject: fi=Tietojenkäsittelytiede|en=Computer Science|
dc.date.accessioned: 2018-07-02T11:24:33Z
dc.date.available: 2018-07-02T11:24:33Z
dc.date.issued: 2018-05-22
dc.description.abstract: Automatic text summarization has been a rapidly developing research area in natural language processing for the last 70 years. The development has progressed from simple heuristics to neural networks and deep learning. Both extractive and abstractive methods have remained of interest to this day. In this thesis, we examine different methods for automatic text summarization and evaluate their capability to summarize text written in Finnish. We build an extractive summarizer and evaluate how well it performs on Finnish news data. We also evaluate the quality of the news data to see whether it can be used in the future to develop a deep learning based summarizer. The obtained ROUGE scores indicate that the performance falls short of what is expected today from a generic summarizer. On the other hand, the qualitative evaluation reveals that the generated summaries are often more factual than the gold standard summaries in the data set.
dc.format.extent: 57
dc.identifier.olddbid: 162238
dc.identifier.oldhandle: 10024/145467
dc.identifier.uri: https://www.utupub.fi/handle/11111/14432
dc.identifier.urn: URN:NBN:fi-fe2018052924966
dc.language.iso: eng
dc.rights: fi=Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty.|en=This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.|
dc.rights.accessrights: open
dc.source.identifier: https://www.utupub.fi/handle/10024/145467
dc.subject: summarization, news data, keyphrase extraction, extractive, natural language processing, data mining
dc.title: Automatic text summarization
dc.type.ontasot: fi=Pro gradu -tutkielma|en=Master's thesis|

Files

Name: Anttila_Pauliina_progradu.pdf
Size: 332.03 KB
Format: Adobe Portable Document Format