Post-hoc Interpretation of Deep Learning-based Survival Model on Colorectal Cancer Using Hematoxylin and Eosin Slides

dc.contributor.author: Le, Phuong
dc.contributor.department: fi=Biolääketieteen laitos|en=Institute of Biomedicine|
dc.contributor.faculty: fi=Lääketieteellinen tiedekunta|en=Faculty of Medicine|
dc.contributor.studysubject: fi=Biomedical Imaging|en=Biomedical Imaging|
dc.date.accessioned: 2026-06-15T19:31:43Z
dc.date.issued: 2026-04-29
dc.description.abstract: Background: Colorectal cancer (CRC) prognosis relies primarily on the Tumor-Node-Metastasis (TNM) staging system, yet tumor heterogeneity limits its accuracy. Numerous new pathological features and biomarkers have therefore been developed to complement this system, but they have proven too complicated to implement clinically. Deep learning (DL) models show promise in addressing this problem by extracting prognostic information from routine hematoxylin and eosin (H&E)-stained whole slide images (WSIs). However, their clinical adoption remains limited because most models lack explanations, whereas clinical pathologists require interpretability.
Objective: The purpose of this thesis was to interpret a multiple-instance learning (MIL)-based survival model for CRC by addressing three research questions: (1) model performance, (2) the histopathological patterns within the image patches that most strongly influence the model's predictions, and (3) sources of model failure.
Methods: A Clustering-constrained Attention Multiple Instance Learning (CLAM) model using UNI2-h foundation model embeddings predicted 5-year survival status from WSIs in a single-center, retrospective cohort. Because non-disease-related deaths within 5 years were excluded, this endpoint approximates disease-specific survival (DSS). Post-hoc interpretation combined attention heatmaps, K-means clustering (k=75) of high-confidence tile embeddings, forward stepwise selection of prognostic clusters, and pathologist review of representative tiles from the 10 most informative feature clusters.
Results: The UNI2-h–CLAM survival model achieved robust discrimination of 5-year survival status in CRC, with a patient-level AUC of 0.79 ± 0.034 and a c-index of 0.738. Kaplan-Meier analysis showed clear survival stratification by CLAM predictions, with a hazard ratio (HR) of 4.27 (95% CI, 3.22–5.66; log-rank p < 0.001). The model mostly focused on the tumor epithelium in alive-predicted cases, whereas invasion zones and peritumoral stroma were important in dead-predicted cases. Dead-predicted tiles showed high-grade atypia, poorly differentiated clusters, immature/myxoid desmoplastic stroma, and a stroma-high pattern, while alive-predicted tiles displayed preserved glands and inflammatory infiltrates. The top 10 clusters explained 73.9% of model variance (R²=0.739). Errors involved non-representative WSI selection, intra-patient heterogeneity, artifacts, and tissue microarray core loss.
Conclusion: This multiple-instance learning-based DL model predicts CRC survival effectively. The model independently learned prognostic features that align with WHO/AJCC-recognized markers (poor differentiation, tumor budding-like clusters, immature desmoplasia). This data-driven interpretation framework bridges the gap between deep learning models and clinical pathology, supporting tumor-stroma ratio and desmoplastic reaction as robust, artificial intelligence-discoverable biomarkers for risk stratification.
dc.format.extent: 78
dc.identifier.uri: https://www.utupub.fi/handle/11111/61921
dc.identifier.urn: URN:NBN:fi-fe2026061569350
dc.language.iso: eng
dc.rights: fi=Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty.|en=This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.|
dc.rights.accessrights: open
dc.subject: Survival Prediction
dc.subject: Deep learning
dc.subject: Colorectal Cancer
dc.subject: Digital Pathology
dc.title: Post-hoc Interpretation of Deep Learning-based Survival Model on Colorectal Cancer Using Hematoxylin and Eosin Slides
dc.type.ontasot: fi=Pro gradu -tutkielma|en=Master's thesis|
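
The abstract mentions forward stepwise selection of prognostic clusters and reports the share of model variance explained by the top 10 clusters as an R² value. The record does not say which regression, features, or target the thesis used, so the following is only a minimal sketch of the general technique under stated assumptions: each patient is described by a hypothetical vector of 75 cluster proportions (`X`), the target is a hypothetical model risk score (`y`), the fit is ordinary least squares, and all data are synthetic. The function names `r_squared` and `forward_stepwise` are illustrative and not taken from the thesis.

```python
import numpy as np


def r_squared(X, y):
    """R² of an ordinary least-squares fit of y on X (intercept included)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def forward_stepwise(X, y, n_select):
    """Greedily add the feature that most increases R² until n_select are chosen."""
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(n_select):
        # Score every candidate when added to the features already selected.
        scores = {j: r_squared(X[:, selected + [j]], y) for j in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        history.append(scores[best])
    return selected, history


# Synthetic stand-in data (assumption): 200 patients, 75 cluster features,
# of which only three actually drive the simulated risk score.
rng = np.random.default_rng(0)
n_patients, n_clusters = 200, 75
X = rng.random((n_patients, n_clusters))
true_weights = np.zeros(n_clusters)
true_weights[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ true_weights + rng.normal(scale=0.1, size=n_patients)

selected, history = forward_stepwise(X, y, n_select=10)
print("selected clusters:", selected)
print("cumulative R² after 10 clusters:", round(history[-1], 3))
```

In this sketch the last entry of `history` plays the role of the reported "variance explained by the top 10 clusters"; on real data the value would depend entirely on how the cluster features and the target are defined.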

Files

Name: Le_Phuong_Thesis.pdf
Size: 3.4 MB
Format: Adobe Portable Document Format