Transfer learning for hate speech detection in social media

Yuan Lanqin; Wang Tianyu; Ferraro Gabriela; Suominen Hanna; Rizoiu Marian-Andrei

dc.contributor.author	Yuan Lanqin
dc.contributor.author	Wang Tianyu
dc.contributor.author	Ferraro Gabriela
dc.contributor.author	Suominen Hanna
dc.contributor.author	Rizoiu Marian-Andrei
dc.contributor.organization	fi=tietotekniikan laitos\|en=Department of Computing\|
dc.contributor.organization-code	1.2.246.10.2458963.20.85312822902
dc.converis.publication-id	181952431
dc.converis.url	https://research.utu.fi/converis/portal/Publication/181952431
dc.date.accessioned	2025-08-27T23:32:31Z
dc.date.available	2025-08-27T23:32:31Z
dc.description.abstract	Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content, such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language offer a way to make online platforms safer by identifying hate speech in web text autonomously. However, the main difficulty is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and builds a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation, dubbed the Map of Hate, in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in another). However, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performances when only a limited amount of supervision is available. These methods and insights hold the potential for safer social media and reduce the need to expose human moderators and annotators to distressing online messaging.
dc.format.pagerange	1101
dc.identifier.eissn	2432-2725
dc.identifier.jour-issn	2432-2717
dc.identifier.olddbid	204154
dc.identifier.oldhandle	10024/187181
dc.identifier.uri	https://www.utupub.fi/handle/11111/52298
dc.identifier.url	https://doi.org/10.1007/s42001-023-00224-9
dc.identifier.urn	URN:NBN:fi-fe2025082786335
dc.language.iso	en
dc.okm.affiliatedauthor	Suominen, Hanna
dc.okm.discipline	113 Computer and information sciences	en_GB
dc.okm.discipline	113 Tietojenkäsittely ja informaatiotieteet	fi_FI
dc.okm.internationalcopublication	international co-publication
dc.okm.internationality	International publication
dc.okm.type	A1 ScientificArticle
dc.publisher	Springer
dc.publisher.country	United Kingdom	en_GB
dc.publisher.country	Britannia	fi_FI
dc.publisher.country-code	GB
dc.relation.doi	10.1007/s42001-023-00224-9
dc.relation.ispartofjournal	Journal of computational social science
dc.relation.issue	2
dc.relation.volume	6
dc.source.identifier	https://www.utupub.fi/handle/10024/187181
dc.title	Transfer learning for hate speech detection in social media
dc.year.issued	2023

Files

Name: s42001-023-00224-9.pdf
Size: 4.04 MB
Format: Adobe Portable Document Format

Collections

Self-archived publications