Has machine learning over-promised in healthcare?: A critical analysis and a proposal for improved evaluation, with evidence from Parkinson's disease

dc.contributor.author: Ge Wenbo
dc.contributor.author: Lueck Christian
dc.contributor.author: Suominen Hanna
dc.contributor.author: Apthorp Deborah
dc.contributor.organization: fi=tietotekniikan laitos|en=Department of Computing|
dc.contributor.organization-code: 1.2.246.10.2458963.20.85312822902
dc.converis.publication-id: 179260325
dc.converis.url: https://research.utu.fi/converis/portal/Publication/179260325
dc.date.accessioned: 2025-08-27T23:00:36Z
dc.date.available: 2025-08-27T23:00:36Z
dc.description.abstract: Adoption of artificial intelligence (AI) by the medical community has long been anticipated, endorsed by a stream of machine learning literature showcasing AI systems that yield extraordinary performance. However, many of these systems are likely over-promising and will under-deliver in practice. One key reason is the community’s failure to acknowledge and address the presence of inflationary effects in the data. These simultaneously inflate evaluation performance and prevent a model from learning the underlying task, thus severely misrepresenting how that model would perform in the real world. This paper investigated the impact of these inflationary effects on healthcare tasks, as well as how these effects can be addressed. Specifically, we defined three inflationary effects that occur in medical data sets and allow models to easily reach small training losses and prevent skillful learning. We investigated two data sets of sustained vowel phonation from participants with and without Parkinson’s disease, and revealed that published models which have achieved high classification performances on these were artificially enhanced due to the inflationary effects. Our experiments showed that removing each inflationary effect corresponded with a decrease in classification accuracy, and that removing all inflationary effects reduced the evaluated performance by up to 30%. Additionally, the performance on a more realistic test set increased, suggesting that the removal of these inflationary effects enabled the model to better learn the underlying task and generalize. Source code is available at https://github.com/Wenbo-G/pd-phonation-analysis under the MIT license.
dc.identifier.eissn: 1873-2860
dc.identifier.jour-issn: 0933-3657
dc.identifier.olddbid: 203213
dc.identifier.oldhandle: 10024/186240
dc.identifier.uri: https://www.utupub.fi/handle/11111/29134
dc.identifier.url: https://www.sciencedirect.com/science/article/pii/S0933365723000386
dc.identifier.urn: URN:NBN:fi-fe2023042037770
dc.language.iso: en
dc.okm.affiliatedauthor: Suominen, Hanna
dc.okm.discipline: 113 Computer and information sciences (en_GB)
dc.okm.discipline: 217 Medical engineering (en_GB)
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet (fi_FI)
dc.okm.discipline: 217 Lääketieteen tekniikka (fi_FI)
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Elsevier B.V.
dc.publisher.country: Netherlands (en_GB)
dc.publisher.country: Alankomaat (fi_FI)
dc.publisher.country-code: NL
dc.relation.articlenumber: 102524
dc.relation.doi: 10.1016/j.artmed.2023.102524
dc.relation.ispartofjournal: Artificial Intelligence in Medicine
dc.relation.volume: 139
dc.source.identifier: https://www.utupub.fi/handle/10024/186240
dc.title: Has machine learning over-promised in healthcare?: A critical analysis and a proposal for improved evaluation, with evidence from Parkinson's disease
dc.year.issued: 2023

Files

Name: 1-s2.0-S0933365723000386-main(1).pdf
Size: 1023.23 KB
Format: Adobe Portable Document Format