Clustering Nursing Sentences - Comparing Three Sentence Embedding Methods

dc.contributor.author: Moen Hans
dc.contributor.author: Suhonen Henry
dc.contributor.author: Salanterä Sanna
dc.contributor.author: Salakoski Tapio
dc.contributor.author: Peltonen Laura-Maria
dc.contributor.organization: fi=hoitotieteen laitos|en=Department of Nursing Science|
dc.contributor.organization: fi=matemaattis-luonnontieteellinen tiedekunta|en=Faculty of Science|
dc.contributor.organization: fi=matematiikan ja tilastotieteen laitos|en=Department of Mathematics and Statistics|
dc.contributor.organization: fi=tyks, vsshp|en=tyks, varha|
dc.contributor.organization-code: 1.2.246.10.2458963.20.27201741504
dc.contributor.organization-code: 1.2.246.10.2458963.20.36798383026
dc.contributor.organization-code: 2607400
dc.converis.publication-id: 178641781
dc.converis.url: https://research.utu.fi/converis/portal/Publication/178641781
dc.date.accessioned: 2025-08-28T02:30:25Z
dc.date.available: 2025-08-28T02:30:25Z
dc.description.abstract: In health sciences, high-quality text embeddings may augment qualitative data analysis of large amounts of text by enabling, e.g., searching and clustering of health information. This study aimed to evaluate three different sentence-level embedding methods in clustering sentences in nursing narratives from individual patients' hospital care episodes. Two of these embeddings are generated from language models based on the BERT framework, and the third on the Sent2Vec method. These embedding methods were used to cluster sentences from 20 patient care episodes and the results were manually evaluated. Findings suggest that the best clusters were produced by the embeddings from a BERT model fine-tuned for the proxy task of predicting subject headings for nursing text.
dc.format.pagerange: 854
dc.format.pagerange: 858
dc.identifier.eisbn: 978-1-64368-285-3
dc.identifier.isbn: 978-1-64368-284-6
dc.identifier.issn: 0926-9630
dc.identifier.olddbid: 209211
dc.identifier.oldhandle: 10024/192238
dc.identifier.uri: https://www.utupub.fi/handle/11111/40784
dc.identifier.url: https://ebooks.iospress.nl/doi/10.3233/SHTI220606
dc.identifier.urn: URN:NBN:fi-fe2023022128021
dc.language.iso: en
dc.okm.affiliatedauthor: Suhonen, Henry
dc.okm.affiliatedauthor: Salanterä, Sanna
dc.okm.affiliatedauthor: Salakoski, Tapio
dc.okm.affiliatedauthor: Peltonen, Laura-Maria
dc.okm.affiliatedauthor: Dataimport, tyks, vsshp
dc.okm.affiliatedauthor: Dataimport, Matematiikan ja tilastotieteen lait yht
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 316 Nursing [en_GB]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.discipline: 316 Hoitotiede [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A4 Conference Article
dc.publisher.country: Netherlands [en_GB]
dc.publisher.country: Alankomaat [fi_FI]
dc.publisher.country-code: NL
dc.relation.conference: Medical Informatics Europe
dc.relation.doi: 10.3233/SHTI220606
dc.relation.ispartofjournal: Medical informatics Europe
dc.relation.ispartofseries: Studies in Health Technology and Informatics
dc.relation.volume: 294
dc.source.identifier: https://www.utupub.fi/handle/10024/192238
dc.title: Clustering Nursing Sentences - Comparing Three Sentence Embedding Methods
dc.title.book: Challenges of Trustable AI and Added-Value on Health
dc.year.issued: 2022

Files

Name: SHTI-294-SHTI220606.pdf
Size: 161.64 KB
Format: Adobe Portable Document Format