Explainable discovery of disease biomarkers: The case of ovarian cancer to illustrate the best practice in machine learning and Shapley analysis

dc.contributor.author: Huang Weitong
dc.contributor.author: Suominen Hanna
dc.contributor.author: Liu Tommy
dc.contributor.author: Rice Gregory
dc.contributor.author: Salomon Carlos
dc.contributor.author: Barnard Amanda S
dc.contributor.organization: fi=tietotekniikan laitos|en=Department of Computing|
dc.contributor.organization-code: 1.2.246.10.2458963.20.85312822902
dc.converis.publication-id: 179594009
dc.converis.url: https://research.utu.fi/converis/portal/Publication/179594009
dc.date.accessioned: 2025-08-27T23:37:52Z
dc.date.available: 2025-08-27T23:37:52Z
dc.description.abstract: Objective: Ovarian cancer is a significant health issue with lasting impacts on the community. Recent advances in surgical, chemotherapeutic and radiotherapeutic interventions have had only marginal impact because of an inability to identify biomarkers at an early stage. Biomarker discovery is challenging, yet essential for improving drug discovery and clinical care. Compared to conventional methods, machine learning (ML) techniques are invaluable for recognising complex patterns in biomarkers, yet they can lack physical insights into diagnosis. eXplainable Artificial Intelligence (XAI) can provide deeper insights into the decision-making of complex ML algorithms, increasing their applicability. We aim to introduce best practice for combining ML and XAI techniques for biomarker validation tasks. Methods: We focus on classification tasks and a game-theoretic approach based on Shapley values to build and evaluate models and visualise results. We describe the workflow and apply the pipeline in a case study using the CDAS PLCO Ovarian Biomarkers dataset to demonstrate the potential for accuracy and utility. Results: The case study results demonstrate the efficacy of the ML pipeline, its consistency, and its advantages compared to conventional statistical approaches. Conclusion: The resulting guidelines provide a general framework for the practical application of XAI in medical research that can inform clinicians and validate and explain cancer biomarkers.
dc.identifier.eissn: 1532-0480
dc.identifier.jour-issn: 1532-0464
dc.identifier.olddbid: 204321
dc.identifier.oldhandle: 10024/187348
dc.identifier.uri: https://www.utupub.fi/handle/11111/52501
dc.identifier.url: https://doi.org/10.1016/j.jbi.2023.104365
dc.identifier.urn: URN:NBN:fi-fe2023052648320
dc.language.iso: en
dc.okm.affiliatedauthor: Suominen, Hanna
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 3111 Biomedicine [en_GB]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.discipline: 3111 Biolääketieteet [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Elsevier
dc.publisher.country: United States [en_GB]
dc.publisher.country: Yhdysvallat (USA) [fi_FI]
dc.publisher.country-code: US
dc.relation.articlenumber: 104365
dc.relation.doi: 10.1016/j.jbi.2023.104365
dc.relation.ispartofjournal: Journal of Biomedical Informatics
dc.relation.volume: 141
dc.source.identifier: https://www.utupub.fi/handle/10024/187348
dc.title: Explainable discovery of disease biomarkers: The case of ovarian cancer to illustrate the best practice in machine learning and Shapley analysis
dc.year.issued: 2023

Files

Name:
1-s2.0-S1532046423000862-main.pdf
Size:
1.45 MB
Format:
Adobe Portable Document Format