Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words

dc.contributor.author: Hakala, Tero
dc.contributor.author: Lindh-Knuutila, Tiina
dc.contributor.author: Hulten, Annika
dc.contributor.author: Lehtonen, Minna
dc.contributor.author: Salmelin, Riitta
dc.contributor.organization: fi=logopedia|en=Speech-Language Pathology|
dc.contributor.organization-code: 1.2.246.10.2458963.20.46679761984
dc.converis.publication-id: 458252538
dc.converis.url: https://research.utu.fi/converis/portal/Publication/458252538
dc.date.accessioned: 2025-08-27T23:34:45Z
dc.date.available: 2025-08-27T23:34:45Z
dc.description.abstract: This study extends the idea of decoding word-evoked brain activations using a corpus-semantic vector space to multimorphemic words in the agglutinative Finnish language. The corpus-semantic models are trained on word segments, and decoding is carried out with word vectors that are composed of these segments. We tested several alternative vector-space models using different segmentations: no segmentation (whole word), linguistic morphemes, statistical morphemes, random segmentation, and character-level 1-, 2- and 3-grams, and paired them with recorded MEG responses to multimorphemic words in a visual word recognition task. For all variants, the decoding accuracy exceeded the standard word-label permutation-based significance thresholds at 350-500 ms after stimulus onset. However, the critical segment-label permutation test revealed that only those segmentations that were morphologically aware reached significance in the brain decoding task. The results suggest that both whole-word forms and morphemes are represented in the brain and show that neural decoding using corpus-semantic word representations derived from compositional subword segments is applicable also for multimorphemic word forms. This is especially relevant for languages with complex morphology, because a large proportion of word forms are rare and it can be difficult to find statistically reliable surface representations for them in any large corpus.
dc.format.pagerange: 844
dc.format.pagerange: 863
dc.identifier.eissn: 2641-4368
dc.identifier.jour-issn: 2641-4368
dc.identifier.olddbid: 204231
dc.identifier.oldhandle: 10024/187258
dc.identifier.uri: https://www.utupub.fi/handle/11111/52362
dc.identifier.url: https://doi.org/10.1162/nol_a_00149
dc.identifier.urn: URN:NBN:fi-fe2025082786361
dc.language.iso: en
dc.okm.affiliatedauthor: Lehtonen, Minna
dc.okm.discipline: 515 Psychology [en_GB]
dc.okm.discipline: 515 Psykologia [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: MIT PRESS
dc.publisher.country: United States [en_GB]
dc.publisher.country: Yhdysvallat (USA) [fi_FI]
dc.publisher.country-code: US
dc.publisher.place: CAMBRIDGE
dc.relation.doi: 10.1162/nol_a_00149
dc.relation.ispartofjournal: Neurobiology of language
dc.relation.issue: 4
dc.relation.volume: 5
dc.source.identifier: https://www.utupub.fi/handle/10024/187258
dc.title: Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words
dc.year.issued: 2024

Files

Name: nol_a_00149.pdf
Size: 2.57 MB
Format: Adobe Portable Document Format