Transfer learning for hate speech detection in social media

dc.contributor.author: Yuan Lanqin
dc.contributor.author: Wang Tianyu
dc.contributor.author: Ferraro Gabriela
dc.contributor.author: Suominen Hanna
dc.contributor.author: Rizoiu Marian-Andrei
dc.contributor.organization: fi=tietotekniikan laitos|en=Department of Computing|
dc.contributor.organization-code: 1.2.246.10.2458963.20.85312822902
dc.converis.publication-id: 181952431
dc.converis.url: https://research.utu.fi/converis/portal/Publication/181952431
dc.date.accessioned: 2025-08-27T23:32:31Z
dc.date.available: 2025-08-27T23:32:31Z
dc.description.abstract: Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content, such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language offer a way to make online platforms safer by identifying hate speech in web text autonomously. However, the main difficulty is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and builds a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation, dubbed the Map of Hate, in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in another). However, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performances when only a limited amount of supervision is available. These methods and insights hold the potential for safer social media and reduce the need to expose human moderators and annotators to distressing online messaging.
dc.format.pagerange: 1081
dc.format.pagerange: 1101
dc.identifier.eissn: 2432-2725
dc.identifier.jour-issn: 2432-2717
dc.identifier.olddbid: 204154
dc.identifier.oldhandle: 10024/187181
dc.identifier.uri: https://www.utupub.fi/handle/11111/52298
dc.identifier.url: https://doi.org/10.1007/s42001-023-00224-9
dc.identifier.urn: URN:NBN:fi-fe2025082786335
dc.language.iso: en
dc.okm.affiliatedauthor: Suominen, Hanna
dc.okm.discipline: 113 Computer and information sciences (en_GB)
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet (fi_FI)
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Springer
dc.publisher.country: United Kingdom (en_GB)
dc.publisher.country: Britannia (fi_FI)
dc.publisher.country-code: GB
dc.relation.doi: 10.1007/s42001-023-00224-9
dc.relation.ispartofjournal: Journal of computational social science
dc.relation.issue: 2
dc.relation.volume: 6
dc.source.identifier: https://www.utupub.fi/handle/10024/187181
dc.title: Transfer learning for hate speech detection in social media
dc.year.issued: 2023

Files

Name: s42001-023-00224-9.pdf
Size: 4.04 MB
Format: Adobe Portable Document Format