A generative multimodal network for facial expression recognition

dc.contributor.author: Zhao, Yue
dc.contributor.author: Song, Mingjian
dc.contributor.author: Zhang, Qi
dc.contributor.author: Yang, Jiawei
dc.contributor.author: Yoshigoe, Kenji
dc.contributor.author: Tian, Chunwei
dc.contributor.organization: fi=terveysteknologia|en=Health Technology|
dc.contributor.organization-code: 1.2.246.10.2458963.20.28696315432
dc.converis.publication-id: 523237226
dc.converis.url: https://research.utu.fi/converis/portal/Publication/523237226
dc.date.accessioned: 2026-05-22T20:15:38Z
dc.description.abstract: Deep networks with strong feature extraction abilities have been extensively employed in facial expression recognition (FER). However, they focus on structural information derived from data dependencies rather than on facial attributes, which limits the robustness of the resulting FER models. In this paper, we propose a generative multimodal network (GMNet) for FER. First, GMNet generates and aligns multimodal face images according to facial asymmetry and the mirror-imaging principle. Second, it uses parallel networks to learn diverse information from the original and the generated multimodal face images, respectively, and merges the results to obtain reliable facial expression information. Third, a sparse mechanism further refines these richer facial features to obtain more accurate facial expression information and reduce training costs. Finally, a cross loss applies a cross-domain restriction to guarantee the reliability of the multimodal face images and thereby improve FER performance. Experimental results show that GMNet outperforms other popular FER methods. The code for GMNet is available at https://github.com/hellloxiaotian/GMNet.
dc.embargo.lift: 2027-03-26
dc.identifier.eissn: 1873-5142
dc.identifier.jour-issn: 0031-3203
dc.identifier.uri: https://www.utupub.fi/handle/11111/61039
dc.identifier.url: https://doi.org/10.1016/j.patcog.2026.113518
dc.identifier.urn: URN:NBN:fi-fe2026052252390
dc.language.iso: en
dc.okm.affiliatedauthor: Yang, Jiawei
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Elsevier
dc.publisher.country: United Kingdom [en_GB]
dc.publisher.country: Britannia [fi_FI]
dc.publisher.country-code: GB
dc.relation.articlenumber: 113518
dc.relation.doi: 10.1016/j.patcog.2026.113518
dc.relation.ispartofjournal: Pattern Recognition
dc.relation.issue: Part A
dc.relation.volume: 179
dc.title: A generative multimodal network for facial expression recognition
dc.year.issued: 2026
