Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words
| dc.contributor.author | Hakala, Tero | |
| dc.contributor.author | Lindh-Knuutila, Tiina | |
| dc.contributor.author | Hulten, Annika | |
| dc.contributor.author | Lehtonen, Minna | |
| dc.contributor.author | Salmelin, Riitta | |
| dc.contributor.organization | fi=logopedia|en=Speech-Language Pathology| | |
| dc.contributor.organization-code | 1.2.246.10.2458963.20.46679761984 | |
| dc.converis.publication-id | 458252538 | |
| dc.converis.url | https://research.utu.fi/converis/portal/Publication/458252538 | |
| dc.date.accessioned | 2025-08-27T23:34:45Z | |
| dc.date.available | 2025-08-27T23:34:45Z | |
| dc.description.abstract | This study extends the idea of decoding word-evoked brain activations using a corpus-semantic vector space to multimorphemic words in the agglutinative Finnish language. The corpus-semantic models are trained on word segments, and decoding is carried out with word vectors that are composed of these segments. We tested several alternative vector-space models using different segmentations: no segmentation (whole word), linguistic morphemes, statistical morphemes, random segmentation, and character-level 1-, 2- and 3-grams, and paired them with recorded MEG responses to multimorphemic words in a visual word recognition task. For all variants, the decoding accuracy exceeded the standard word-label permutation-based significance thresholds at 350-500 ms after stimulus onset. However, the critical segment-label permutation test revealed that only those segmentations that were morphologically aware reached significance in the brain decoding task. The results suggest that both whole-word forms and morphemes are represented in the brain and show that neural decoding using corpus-semantic word representations derived from compositional subword segments is also applicable to multimorphemic word forms. This is especially relevant for languages with complex morphology, because a large proportion of word forms are rare and it can be difficult to find statistically reliable surface representations for them in any large corpus. | |
| dc.format.pagerange | 844 | |
| dc.format.pagerange | 863 | |
| dc.identifier.eissn | 2641-4368 | |
| dc.identifier.jour-issn | 2641-4368 | |
| dc.identifier.olddbid | 204231 | |
| dc.identifier.oldhandle | 10024/187258 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/52362 | |
| dc.identifier.url | https://doi.org/10.1162/nol_a_00149 | |
| dc.identifier.urn | URN:NBN:fi-fe2025082786361 | |
| dc.language.iso | en | |
| dc.okm.affiliatedauthor | Lehtonen, Minna | |
| dc.okm.discipline | 515 Psychology | en_GB |
| dc.okm.discipline | 515 Psykologia | fi_FI |
| dc.okm.internationalcopublication | international co-publication | |
| dc.okm.internationality | International publication | |
| dc.okm.type | A1 ScientificArticle | |
| dc.publisher | MIT PRESS | |
| dc.publisher.country | United States | en_GB |
| dc.publisher.country | Yhdysvallat (USA) | fi_FI |
| dc.publisher.country-code | US | |
| dc.publisher.place | CAMBRIDGE | |
| dc.relation.doi | 10.1162/nol_a_00149 | |
| dc.relation.ispartofjournal | Neurobiology of language | |
| dc.relation.issue | 4 | |
| dc.relation.volume | 5 | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/187258 | |
| dc.title | Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words | |
| dc.year.issued | 2024 |