Efficient Search Algorithms for Identifying Synergistic Associations in High-Dimensional Datasets

dc.contributor.author: Hourican, Cillian
dc.contributor.author: Li, Jie
dc.contributor.author: Mishra, Pashupati P.
dc.contributor.author: Lehtimäki, Terho
dc.contributor.author: Mishra, Binisha H.
dc.contributor.author: Kähönen, Mika
dc.contributor.author: Raitakari, Olli T.
dc.contributor.author: Laaksonen, Reijo
dc.contributor.author: Keltikangas-Järvinen, Liisa
dc.contributor.author: Juonala, Markus
dc.contributor.author: Quax, Rick
dc.contributor.organization: fi=InFLAMES Lippulaiva|en=InFLAMES Flagship|
dc.contributor.organization: fi=sisätautioppi|en=Internal Medicine|
dc.contributor.organization: fi=sydäntutkimuskeskus|en=Cardiovascular Medicine (CAPC)|
dc.contributor.organization: fi=tyks, vsshp|en=tyks, varha|
dc.contributor.organization: fi=väestötutkimuskeskus|en=Centre for Population Health Research (POP Centre)|
dc.contributor.organization-code: 1.2.246.10.2458963.20.35734063924
dc.contributor.organization-code: 1.2.246.10.2458963.20.40502528769
dc.contributor.organization-code: 1.2.246.10.2458963.20.42471027641
dc.contributor.organization-code: 1.2.246.10.2458963.20.68445910604
dc.converis.publication-id: 477015493
dc.converis.url: https://research.utu.fi/converis/portal/Publication/477015493
dc.date.accessioned: 2025-08-28T01:13:31Z
dc.date.available: 2025-08-28T01:13:31Z
dc.description.abstract: In recent years, there has been a notably increased interest in the study of multivariate interactions and emergent higher-order dependencies. This is particularly evident in the context of identifying synergistic sets, which are defined as combinations of elements whose joint interactions result in the emergence of information that is not present in any individual subset of those elements. The scalability of frameworks such as partial information decomposition (PID) and those based on multivariate extensions of mutual information, such as O-information, is limited by combinatorial explosion in the number of sets that must be assessed. In order to address these challenges, we propose a novel approach that utilises stochastic search strategies in order to identify synergistic triplets within datasets. Furthermore, the methodology is extensible to larger sets and various synergy measures. By employing stochastic search, our approach circumvents the constraints of exhaustive enumeration, offering a scalable and efficient means to uncover intricate dependencies. The flexibility of our method is illustrated through its application to two epidemiological datasets: The Young Finns Study and the UK Biobank Nuclear Magnetic Resonance (NMR) data. Additionally, we present a heuristic for reducing the number of synergistic sets to analyse in large datasets by excluding sets with overlapping information. We also illustrate the risks of performing a feature selection before assessing synergistic information in the system.
dc.identifier.eissn: 1099-4300
dc.identifier.olddbid: 207228
dc.identifier.oldhandle: 10024/190255
dc.identifier.uri: https://www.utupub.fi/handle/11111/50928
dc.identifier.url: http://doi.org/10.3390/e26110968
dc.identifier.urn: URN:NBN:fi-fe2025082787604
dc.language.iso: en
dc.okm.affiliatedauthor: Raitakari, Olli
dc.okm.affiliatedauthor: Juonala, Markus
dc.okm.affiliatedauthor: Dataimport, tyks, vsshp
dc.okm.discipline: 3121 Internal medicine [en_GB]
dc.okm.discipline: 3121 Sisätaudit [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: MDPI AG
dc.publisher.country: Switzerland [en_GB]
dc.publisher.country: Sveitsi [fi_FI]
dc.publisher.country-code: CH
dc.relation.articlenumber: 968
dc.relation.doi: 10.3390/e26110968
dc.relation.ispartofjournal: Entropy
dc.relation.issue: 11
dc.relation.volume: 26
dc.source.identifier: https://www.utupub.fi/handle/10024/190255
dc.title: Efficient Search Algorithms for Identifying Synergistic Associations in High-Dimensional Datasets
dc.year.issued: 2024
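
The abstract describes using stochastic search instead of exhaustive enumeration to find synergistic triplets. The sketch below is an illustration of that general idea only, not the authors' implementation. It assumes jointly Gaussian variables, so that O-information can be computed from a sample covariance matrix, and it uses a plain random-restart hill climb as the stochastic search. The function names (`gaussian_entropy`, `o_information`, `stochastic_triplet_search`) and all parameters are hypothetical.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)


def o_information(cov, idx):
    """O-information of the variables `idx`, assuming Gaussian data.

    Omega = (n - 2) H(X) + sum_j [H(X_j) - H(X_without_j)].
    Negative values indicate a synergy-dominated set.
    """
    idx = list(idx)
    n = len(idx)
    sub = cov[np.ix_(idx, idx)]
    total = (n - 2) * gaussian_entropy(sub)
    for j in range(n):
        rest = [k for k in range(n) if k != j]
        total += gaussian_entropy(sub[j, j])
        total -= gaussian_entropy(sub[np.ix_(rest, rest)])
    return total


def stochastic_triplet_search(data, n_restarts=30, n_steps=200, seed=0):
    """Random-restart hill climb for the triplet with the lowest O-information.

    `data` has shape (samples, variables). This is an assumed search
    strategy chosen for brevity, not the one used in the paper.
    """
    rng = np.random.default_rng(seed)
    cov = np.cov(data, rowvar=False)
    p = cov.shape[0]
    best, best_score = None, np.inf
    for _ in range(n_restarts):
        current = [int(i) for i in rng.choice(p, size=3, replace=False)]
        score = o_information(cov, current)
        for _ in range(n_steps):
            new = int(rng.integers(p))
            if new in current:
                continue
            # Propose swapping one randomly chosen member for `new`.
            candidate = current.copy()
            candidate[int(rng.integers(3))] = new
            cand_score = o_information(cov, candidate)
            if cand_score < score:  # accept only strict improvements
                current, score = candidate, cand_score
        if score < best_score:
            best, best_score = tuple(sorted(current)), score
    return best, best_score
```

Each restart evaluates at most `n_steps + 1` triplets, so the cost is set by the search budget rather than by the number of possible triplets, which grows cubically with the number of variables. A greedy climb like this can stall in local minima, which is why the sketch restarts from several random triplets.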

Files

Name: entropy-26-00968-v2.pdf
Size: 2.22 MB
Format: Adobe Portable Document Format