Testing human-hand segmentation on in-distribution and out-of-distribution data in human–robot interactions using a deep ensemble model

dc.contributor.author: Jalayer, Reza
dc.contributor.author: Chen, Yuxin
dc.contributor.author: Jalayer, Masoud
dc.contributor.author: Orsenigo, Carlotta
dc.contributor.author: Tomizuka, Masayoshi
dc.contributor.organization: fi=materiaalitekniikka|en=Materials Engineering|
dc.contributor.organization-code: 1.2.246.10.2458963.20.80931480620
dc.converis.publication-id: 499135968
dc.converis.url: https://research.utu.fi/converis/portal/Publication/499135968
dc.date.accessioned: 2025-08-27T22:27:57Z
dc.date.available: 2025-08-27T22:27:57Z
dc.description.abstract: Reliable detection and segmentation of human hands are critical for enhancing safety and facilitating advanced interactions in human–robot collaboration. Current research predominantly evaluates hand segmentation on in-distribution (ID) data, which reflects the training data of deep learning (DL) models, but fails to address the out-of-distribution (OOD) scenarios that often arise in real-world human–robot interactions. In this work, we make three key contributions. First, we assess the generalization of DL models for hand segmentation under both ID and OOD scenarios, using a newly collected industrial dataset that captures a wide range of real-world conditions, including simple and cluttered backgrounds with industrial tools, varying numbers of hands (0 to 4), gloves, rare gestures, and motion blur. Second, we consider both egocentric and static viewpoints: models trained on four datasets, namely EgoHands and Ego2Hands (egocentric mobile camera) and HADR and HAGS (static fixed viewpoint), are tested with both egocentric (head-mounted) and static cameras, enabling robustness evaluation from multiple points of view. Third, we introduce an uncertainty-analysis pipeline based on the predictive entropy of the predicted hand pixels, which flags unreliable segmentation outputs by applying thresholds established during validation, automatically identifying and filtering untrustworthy predictions and significantly improving segmentation reliability in OOD scenarios. For segmentation, we use a deep ensemble composed of UNet and RefineNet as base learners. Our experiments demonstrate that models trained on the industrial datasets (HADR, HAGS) outperform those trained on non-industrial datasets, both in segmentation accuracy and in their ability to flag unreliable outputs via uncertainty estimation. These findings underscore the necessity of domain-specific training data and show that our uncertainty-analysis pipeline can provide a practical safety layer for real-world deployment.
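The abstract's uncertainty pipeline — predictive entropy of predicted hand pixels from a deep ensemble, thresholded to flag unreliable outputs — can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names, the pixel-level threshold `pixel_tau`, and the image-level fraction threshold `image_tau` are hypothetical stand-ins for thresholds the paper establishes during validation.

```python
import numpy as np

def predictive_entropy(member_probs):
    """Binary predictive entropy per pixel.

    member_probs: array of shape (n_members, H, W), each entry the
    hand-class probability predicted by one ensemble member
    (e.g. UNet and RefineNet, as in the paper's deep ensemble).
    """
    p = np.mean(member_probs, axis=0)  # ensemble mean hand probability
    eps = 1e-12                        # avoid log(0)
    return -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))

def flag_unreliable(member_probs, pixel_tau=0.5, image_tau=0.2):
    """Flag a segmentation as unreliable when too many predicted-hand
    pixels exceed the entropy threshold.

    pixel_tau and image_tau are illustrative values; in practice they
    would be calibrated on a validation set.
    """
    p = np.mean(member_probs, axis=0)
    entropy = predictive_entropy(member_probs)
    hand_mask = p > 0.5                # pixels predicted as hand
    if not hand_mask.any():            # no hand predicted: nothing to flag
        return False
    high_entropy_frac = np.mean(entropy[hand_mask] > pixel_tau)
    return bool(high_entropy_frac > image_tau)
```

When the ensemble members agree (all near 0.9), per-pixel entropy stays low and the prediction passes; when members disagree, the mean probability sits near 0.5, entropy approaches its maximum (ln 2 ≈ 0.693), and the output is flagged for filtering.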
dc.identifier.jour-issn: 0957-4158
dc.identifier.olddbid: 202215
dc.identifier.oldhandle: 10024/185242
dc.identifier.uri: https://www.utupub.fi/handle/11111/46318
dc.identifier.url: https://www.sciencedirect.com/science/article/pii/S0957415825000741
dc.identifier.urn: URN:NBN:fi-fe2025082785648
dc.language.iso: en
dc.okm.affiliatedauthor: Jalayer, Masoud
dc.okm.discipline: 216 Materials engineering (en_GB)
dc.okm.discipline: 216 Materiaalitekniikka (fi_FI)
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 Scientific Article
dc.publisher: Elsevier BV
dc.publisher.country: Netherlands (en_GB)
dc.publisher.country: Alankomaat (fi_FI)
dc.publisher.country-code: NL
dc.relation.articlenumber: 103365
dc.relation.doi: 10.1016/j.mechatronics.2025.103365
dc.relation.ispartofjournal: Mechatronics
dc.relation.volume: 110
dc.source.identifier: https://www.utupub.fi/handle/10024/185242
dc.title: Testing human-hand segmentation on in-distribution and out-of-distribution data in human–robot interactions using a deep ensemble model
dc.year.issued: 2025

Files

Name: 1-s2.0-S0957415825000741-main.pdf
Size: 3.24 MB
Format: Adobe Portable Document Format