Multimodal Artificial Intelligence Model for Wound Care
Pörhö, Benjamin (2025-08-29)
This publication is protected by copyright. The work may be read and printed for personal use. Commercial use is prohibited.
open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025091195458
Abstract
Chronic wounds, particularly diabetic wounds, represent a significant clinical and economic burden due to prolonged healing times and high rates of complications. Artificial intelligence (AI) is a promising approach to support wound assessment and guide treatment decisions. However, most existing models rely on single-modality inputs. This thesis presents a proof-of-concept (POC) multimodal AI model that combines RGB images, thermal images, and wound area measurements to predict treatment responses in wounds treated with nanofibrillated cellulose (NFC) hydrogel.
Biological mechanisms of wound healing, conventional and advanced wound care, and prior AI-based methods in wound care were reviewed. The hypothesis was defined: a multimodal AI model integrating RGB images, thermal images, and wound area measurements would succeed in classifying the postoperative day (POD) of an input wound. The contribution of the modalities was expected to vary, with RGB images providing the most predictive value and thermal images adding confusion due to their lower quality and homogeneous appearance in the later PODs; still, the multimodal model would achieve higher accuracy in predicting treatment outcomes than the unimodal approach. The specific aims were to design a three-branch neural network combining convolutional feature extraction for RGB and thermal images with a fully connected branch for wound area data, to evaluate the contribution of each modality through ablation experiments and fusion strategies such as concatenation, attention-based fusion, and weighted static fusion, and, finally, to apply Grad-CAM interpretability techniques to visualize the model's decision-making.
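As a rough illustration, a minimal TensorFlow/Keras sketch of the three-branch architecture described above is shown below. The input resolutions, layer sizes, number of POD classes, and the concatenation fusion head are illustrative assumptions, not the configuration actually used in the thesis.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 10  # assumed number of POD classes; not stated in the abstract

def conv_branch(shape, name):
    """Small convolutional feature extractor for one image modality."""
    inp = layers.Input(shape=shape, name=f"{name}_input")
    x = layers.Conv2D(16, 3, activation="relu")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    return inp, x

# Convolutional branches for the two image modalities (resolutions assumed)
rgb_in, rgb_feat = conv_branch((128, 128, 3), "rgb")
thermal_in, thermal_feat = conv_branch((64, 64, 1), "thermal")

# Fully connected branch for the scalar wound-area measurement
area_in = layers.Input(shape=(1,), name="area_input")
area_feat = layers.Dense(8, activation="relu")(area_in)

# Concatenation fusion, the simplest of the three strategies compared
fused = layers.Concatenate()([rgb_feat, thermal_feat, area_feat])
x = layers.Dense(64, activation="relu")(fused)
out = layers.Dense(NUM_CLASSES, activation="softmax", name="pod")(x)

model = Model(inputs=[rgb_in, thermal_in, area_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The weighted static fusion variant would, presumably, replace the concatenation with a fixed weighted combination of the branch features, with weights set per modality rather than learned per sample.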
Dataset details, preprocessing techniques, and the cross-validation strategy were covered, along with performance metrics including accuracy, precision, recall, and F1-score. The model was implemented using TensorFlow, and hyperparameter optimization was performed with KerasTuner. Classes were defined by POD, although this label introduced impracticalities because individual healing trajectories vary.
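A minimal KerasTuner sketch of what the described hyperparameter optimization could look like follows; the search space, feature dimensions, and model body are assumptions, as the abstract does not specify them.

```python
import tensorflow as tf
import keras_tuner as kt

NUM_FEATURES = 16   # assumed flattened feature size, for illustration only
NUM_CLASSES = 10    # assumed number of POD classes

def build_model(hp):
    """Hypothetical search space; the actual ranges tuned in the thesis are not given."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(NUM_FEATURES,)),
        tf.keras.layers.Dense(hp.Int("dense_units", 32, 128, step=32),
                              activation="relu"),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy",
                        max_trials=10, overwrite=True,
                        directory="tuning", project_name="wound_pod")
# In a nested LOGO CV setup, the search would run inside each outer fold, with
# validation folds also split by wound to avoid subject leakage, e.g.:
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=20)
```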
Results demonstrated that models trained without strict subject-level splitting showed inflated performance, underscoring the necessity of partitioning by subject (wound). In an ablation study, the RGB-only model achieved a classification accuracy of 0.70; adding wound area measurements improved accuracy to 0.75, indicating that explicit size context improves predictions. Adding thermal images to form a three-input model slightly decreased overall accuracy (~0.73), suggesting that thermal data is most informative during the early days but may introduce noise later in the timeframe. Grad-CAM visualizations confirmed that the model focused on healing-relevant features rather than noise or image-specific artifacts. Exploring different fusion strategies revealed that the weighted static fusion model provided the highest accuracy, reaching a test accuracy of ~0.85 with the same subject held out as in the ablation study. A KerasTuner hyperparameter tuning process involving nested Leave-One-Group-Out cross-validation (LOGO CV) reached an average accuracy of 0.71. Key limitations included the small, homogeneous dataset and the use of POD as the main label for healing stage, which did not align uniformly with biological wound progression. Future work should use clinically relevant labels and focus on high-quality RGB images, precise sizing, and an integration of thermal cues careful enough not to confuse the model. This multimodal POC model introduced a scalable framework for AI-driven wound assessment, highlighting best practices in data handling, model architecture, and interpretability. By addressing these considerations, AI-based wound care tools may achieve greater reliability and clinical relevance when extended to human datasets.
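A minimal sketch of the subject-level (wound-level) splitting that the results call for, using scikit-learn's LeaveOneGroupOut; the sample counts, features, labels, and wound IDs below are placeholders.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Toy data: 12 samples from 3 wounds; features and POD labels are placeholders.
X = np.random.rand(12, 4)          # placeholder features
y = np.array([0, 1, 2, 3] * 3)     # placeholder POD labels
groups = np.repeat([1, 2, 3], 4)   # wound (subject) IDs

# Splitting by group guarantees that all images of a given wound land entirely
# in either the training set or the test set, never both, which prevents the
# inflated performance seen without strict subject-level splitting.
for fold, (train_idx, test_idx) in enumerate(
        LeaveOneGroupOut().split(X, y, groups)):
    print(f"fold {fold}: train wounds {sorted(set(groups[train_idx]))}, "
          f"held-out wound {groups[test_idx][0]}")
```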