Machine learning for survival outcome in head and neck squamous cell carcinoma: a multicenter validation study

dc.contributor.authorAlabi, Rasheed Omobolaji
dc.contributor.authorGuntinas-Lichius, Orlando
dc.contributor.authorElmusrati, Mohammed
dc.contributor.authorAlmangush, Alhadi
dc.contributor.authorTiblom Ehrsson, Ylva
dc.contributor.authorLaurell, Göran
dc.contributor.authorMäkitie, Antti A.
dc.contributor.organizationfi=biolääketieteen laitos|en=Institute of Biomedicine|
dc.contributor.organization-code1.2.246.10.2458963.20.77952289591
dc.converis.publication-id505677559
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/505677559
dc.date.accessioned2026-01-21T14:32:52Z
dc.date.available2026-01-21T14:32:52Z
dc.description.abstract<p>Most head and neck squamous cell carcinoma (HNSCC) cases are diagnosed late, with an increased risk of recurrence and distant metastasis. In recent years, there has been a surge in the development of prognostic and predictive machine learning (ML) models for personalized treatment planning. However, only a small number of these have been externally validated. This study aimed to build a prognostic system that combines clinicopathological parameters and treatment-related factors as integrative inputs to an ML model, using data from the Surveillance, Epidemiology, and End Results (SEER, United States) program. We further validated the developed model using multicenter data obtained from the Thuringian Cancer Registry (Germany) and data from a multicenter prospective observational study obtained from the Uppsala University Hospital (Sweden) to estimate the overall survival (OS) of patients with HNSCC. Additionally, we explored the complementary prognostic potential of these input parameters for predicting OS using permutation feature importance (PFI). The model was developed on a total of 40,164 patients with HNSCC from the SEER database and validated with 3,950 cases obtained from the Thuringian Cancer Registry and 323 cases recruited from three university hospitals in Sweden. The voting ensemble ML algorithm gave an area under the receiver operating characteristic curve (AUC) of 0.76 and an accuracy of 70.0%. Independent external validation of the developed model with data from the Thuringian Cancer Registry and the Uppsala University Hospital gave AUCs of 0.68 and 0.76, respectively, with decreased performance accuracy in both cohorts. 
The PFI analysis of the base model showed that age at diagnosis, T stage, tumor site, marital status, and surgical treatment were the most important parameters for the model's ability to predict OS. External independent geographic validation is important for performance reproducibility and model generalization before recommending the model for further clinical evaluation. Such validation may not necessarily increase the performance accuracy. However, it can reveal how the model performs outside the development data. A generalizable ML model can lead to individualized risk-based therapeutic decision-making. While independently validating the model may be possible during model development, data privacy and security-related issues may prevent making it a prerequisite in the ML model development pipeline.</p>
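The abstract reports AUC values and a permutation feature importance (PFI) ranking. As a rough illustration of how those two quantities relate, the following minimal sketch computes AUC by pairwise ranking and estimates PFI as the mean drop in AUC when one input column is shuffled. It is not the study's pipeline: the data are synthetic, and `toy_model`, `auc`, and `permutation_importance` are hypothetical names introduced here, with a single-feature scorer standing in for the voting ensemble.

```python
import random


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Return baseline AUC and, per feature, the mean AUC drop after shuffling that column."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - auc(y, [predict(row) for row in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic toy data (assumption): the outcome depends on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] + 0.1 * data_rng.random() > 0.55 else 0 for row in X]


def toy_model(row):
    """Hypothetical stand-in for a trained model: scores a case by feature 0 alone."""
    return row[0]


baseline_auc, importances = permutation_importance(toy_model, X, y)
print(f"baseline AUC: {baseline_auc:.2f}")
print("mean AUC drop per feature:", [round(v, 3) for v in importances])
```

In this toy setup, shuffling the feature the model relies on lowers the AUC, while shuffling the ignored feature leaves it unchanged. The same idea applies to an external cohort: scoring a separate dataset with the unchanged model and recomputing `auc` gives the kind of external AUC the abstract reports.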
dc.identifier.eissn2045-2322
dc.identifier.olddbid213380
dc.identifier.oldhandle10024/196398
dc.identifier.urihttps://www.utupub.fi/handle/11111/55248
dc.identifier.urlhttps://doi.org/10.1038/s41598-025-29295-6
dc.identifier.urnURN:NBN:fi-fe202601215505
dc.language.isoen
dc.okm.affiliatedauthorAlmangush, Alhadi
dc.okm.discipline3111 Biomedicineen_GB
dc.okm.discipline3111 Biolääketieteetfi_FI
dc.okm.internationalcopublicationinternational co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA1 ScientificArticle
dc.publisherSpringer Nature
dc.publisher.countryUnited Kingdomen_GB
dc.publisher.countryBritanniafi_FI
dc.publisher.country-codeGB
dc.relation.articlenumber254
dc.relation.doi10.1038/s41598-025-29295-6
dc.relation.ispartofjournalScientific Reports
dc.relation.volume16
dc.source.identifierhttps://www.utupub.fi/handle/10024/196398
dc.titleMachine learning for survival outcome in head and neck squamous cell carcinoma: a multicenter validation study
dc.year.issued2026

Files

Name:
s41598-025-29295-6.pdf
Size:
1.73 MB
Format:
Adobe Portable Document Format