Enhancing Cross-Hospital Generalizability of Deep Learning Models in ECG Classification: A Comparative Study
Bektanov, Aituar (2025-06-04)
This publication is subject to copyright provisions. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025061267387
Abstract
Deep learning models have demonstrated excellent performance in electrocardiogram (ECG) classification tasks. However, because of domain shift caused by differences in patient demographics, recording devices, and labeling procedures, their generalizability to data from other hospitals remains limited. This thesis therefore examines whether selected domain generalization methods can improve the cross-hospital generalizability of ECG classification models.
The thesis investigates two domain generalization methods: multi-source domain-adversarial training (DANN) and MixStyle. Both are implemented within an SE-ResNet model architecture. The models are trained, validated, and tested on a multi-source dataset comprising five publicly available ECG data sources. The macro-averaged area under the ROC curve (AUROC) is used to evaluate model performance on unseen domains.
According to the experimental results, the DANN-based model modestly outperforms the baseline model in several cases, particularly when the CPSC and CPSC-Extra and the SPH domains are used as test sets. The MixStyle-based model, on the other hand, does not improve generalization. The results also suggest that standard hyperparameter selection using the validation set may not work well for domain generalization in this context, because the validation sets do not contain data from unseen domains.
Overall, the results of this thesis show how complex the domain generalization problem in ECG classification is. They also indicate that, although multi-source domain-adversarial training (DANN) may help improve the generalization performance of deep learning models in ECG classification, it is not a standalone solution. Further work on this topic is needed, including the evaluation of other domain generalization methods and of data from new domains.
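The evaluation metric named in the abstract, macro-averaged AUROC, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `binary_auroc` and `macro_auroc` are hypothetical, a multi-label setting (one binary label per class for each recording) is assumed, and skipping classes whose AUROC is undefined on a given test domain is an assumption as well.

```python
from typing import List, Optional, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """AUROC for one class: the probability that a randomly chosen positive
    receives a higher score than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        # Undefined when the class has no positives or no negatives.
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of the per-class AUROC values.

    Assumption: classes with an undefined AUROC are left out of the mean.
    """
    n_classes = len(y_true[0])
    per_class = []
    for c in range(n_classes):
        auc = binary_auroc([row[c] for row in y_true],
                           [row[c] for row in y_score])
        if auc is not None:
            per_class.append(auc)
    if not per_class:
        raise ValueError("AUROC is undefined for every class")
    return sum(per_class) / len(per_class)
```

Because every class contributes equally to the mean, rare diagnoses weigh as much as common ones. The pairwise loop is quadratic and meant only for illustration; in practice a library routine such as scikit-learn's `roc_auc_score` with `average="macro"` would likely be used.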