Empirical investigation of multi-source cross-validation in clinical ECG classification

Leinonen, Tuija; Wong, David; Vasankari, Antti; Wahab, Ali; Nadarajah, Ramesh; Kaisti, Matti; Airola, Antti

dc.contributor.author	Leinonen, Tuija
dc.contributor.author	Wong, David
dc.contributor.author	Vasankari, Antti
dc.contributor.author	Wahab, Ali
dc.contributor.author	Nadarajah, Ramesh
dc.contributor.author	Kaisti, Matti
dc.contributor.author	Airola, Antti
dc.contributor.organization	fi=terveysteknologia|en=Health Technology|
dc.contributor.organization-code	1.2.246.10.2458963.20.28696315432
dc.converis.publication-id	458889243
dc.converis.url	https://research.utu.fi/converis/portal/Publication/458889243
dc.date.accessioned	2025-08-27T21:40:47Z
dc.date.available	2025-08-27T21:40:47Z
dc.description.abstract	Traditionally, machine learning-based clinical prediction models have been trained and evaluated on patient data from a single source, such as a hospital. Cross-validation methods can be used to estimate the accuracy of such models on new patients originating from the same source, by repeated random splitting of the data. However, such estimates tend to be highly overoptimistic when compared to the accuracy obtained from deploying models to sources not represented in the dataset, such as a new hospital. The increasing availability of multi-source medical datasets provides new opportunities for obtaining more comprehensive and realistic evaluations of expected accuracy through source-level cross-validation designs. In this study, we present a systematic empirical evaluation of standard K-fold cross-validation and leave-source-out cross-validation methods in a multi-source setting. We consider the task of electrocardiogram-based cardiovascular disease classification, combining and harmonizing the openly available PhysioNet/CinC Challenge 2021 and the Shandong Provincial Hospital datasets for our study. Our results show that K-fold cross-validation, on both single-source and multi-source data, systematically overestimates prediction performance when the end goal is to generalize to new sources. Leave-source-out cross-validation provides more reliable performance estimates, with close to zero bias, though with larger variability. The evaluation highlights the danger of obtaining misleading cross-validation results on medical data and demonstrates how this issue can be mitigated when multi-source data are available.
dc.identifier.eissn	1879-0534
dc.identifier.jour-issn	0010-4825
dc.identifier.olddbid	200869
dc.identifier.oldhandle	10024/183896
dc.identifier.uri	https://www.utupub.fi/handle/11111/47251
dc.identifier.url	https://doi.org/10.1016/j.compbiomed.2024.109271
dc.identifier.urn	URN:NBN:fi-fe2025082785161
dc.language.iso	en
dc.okm.affiliatedauthor	Leinonen, Tuija
dc.okm.affiliatedauthor	Vasankari, Antti
dc.okm.affiliatedauthor	Kaisti, Matti
dc.okm.affiliatedauthor	Airola, Antti
dc.okm.discipline	113 Computer and information sciences	en_GB
dc.okm.internationalcopublication	international co-publication
dc.okm.internationality	International publication
dc.okm.type	A1 ScientificArticle
dc.publisher	Elsevier
dc.publisher.country	United States	en_GB
dc.publisher.country	Yhdysvallat (USA)	fi_FI
dc.publisher.country-code	US
dc.relation.articlenumber	109271
dc.relation.doi	10.1016/j.compbiomed.2024.109271
dc.relation.ispartofjournal	Computers in Biology and Medicine
dc.relation.volume	183
dc.source.identifier	https://www.utupub.fi/handle/10024/183896
dc.title	Empirical investigation of multi-source cross-validation in clinical ECG classification
dc.year.issued	2024
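
The abstract contrasts standard K-fold cross-validation with leave-source-out cross-validation. The sketch below illustrates only the difference in how the two designs split data; it is not the authors' implementation. The source labels (`hospital_A` and so on), the toy sample counts, and the helper names `k_fold_splits`, `leave_source_out_splits`, and `shared_sources` are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for standard K-fold.

    Samples are shuffled and split at random, ignoring which source
    they came from, so a source can appear on both sides of a split.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def leave_source_out_splits(sources):
    """Yield (held_out_source, train_idx, test_idx) triples.

    Each source is held out in turn; the model would be trained on
    all remaining sources and tested on the unseen one.
    """
    by_source = defaultdict(list)
    for idx, src in enumerate(sources):
        by_source[src].append(idx)
    for held_out in sorted(by_source):
        test = by_source[held_out]
        train = [i for i, s in enumerate(sources) if s != held_out]
        yield held_out, train, test


# Hypothetical toy data: 12 recordings from 3 made-up sources.
sources = ["hospital_A"] * 4 + ["hospital_B"] * 4 + ["hospital_C"] * 4


def shared_sources(train, test):
    """Return the sources that occur in both the training and the test set."""
    return {sources[i] for i in train} & {sources[i] for i in test}


kfold_splits = list(k_fold_splits(len(sources), k=3))
lso_splits = list(leave_source_out_splits(sources))

for held_out, train, test in lso_splits:
    print(held_out, "shared sources:", shared_sources(train, test))
for train, test in kfold_splits:
    print("K-fold shared sources:", sorted(shared_sources(train, test)))
```

In the leave-source-out splits the test source never appears in the training set, which mimics deployment to a new hospital. Random K-fold splits usually place recordings from the same source on both sides, which is the setting the abstract reports as overoptimistic.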

Files

Name: 1-s2.0-S0010482524013568-main.pdf
Size: 2.14 MB
Format: Adobe Portable Document Format

Collections

Rinnakkaistallenteet (self-archived publications)