EpiSmokEr2: a robust epigenetic classifier for smoking status inference using Illumina EPIC methylation data

dc.contributor.author: Zhu, Tianyu
dc.contributor.author: Faragó, Teodóra
dc.contributor.author: Bollepalli, Sailalitha
dc.contributor.author: Heikkinen, Aino
dc.contributor.author: Hukkanen, Mikaela
dc.contributor.author: Raitakari, Olli
dc.contributor.author: Lehtimäki, Terho
dc.contributor.author: Korhonen, Tellervo
dc.contributor.author: Kaprio, Jaakko
dc.contributor.author: Fang, Fang
dc.contributor.author: Lawrence, Kaitlyn G.
dc.contributor.author: Sandler, Dale P.
dc.contributor.author: Roberts Spildrejorde, Mari
dc.contributor.author: Gervin, Kristina
dc.contributor.author: Pan, Yanyu
dc.contributor.author: Costeira, Ricardo
dc.contributor.author: Bell, Jordana T.
dc.contributor.author: Ollikainen, Miina
dc.contributor.organization: fi=tyks, vsshp|en=tyks, varha|
dc.contributor.organization: fi=väestötutkimuskeskus|en=Centre for Population Health Research (POP Centre)|
dc.contributor.organization: fi=InFLAMES Lippulaiva|en=InFLAMES Flagship|
dc.contributor.organization-code: 1.2.246.10.2458963.20.42471027641
dc.contributor.organization-code: 1.2.246.10.2458963.20.68445910604
dc.converis.publication-id: 515686691
dc.converis.url: https://research.utu.fi/converis/portal/Publication/515686691
dc.date.accessioned: 2026-04-24T17:32:52Z
dc.description.abstract: Aim: Tobacco smoking induces persistent DNA methylation (DNAm) changes in blood that can serve as long-term biomarkers for smoking exposure. We aimed to develop and validate a DNAm classifier of smoking status using Illumina EPIC array data. Methods: We built Epigenetic Smoking status Estimator2 (EpiSmokEr2), a Least Absolute Shrinkage and Selection Operator (LASSO) regression-based DNAm classifier using 511 CpGs from Illumina Infinium MethylationEPIC array (EPIC) data. The model was trained on 1343 samples from the Young Finns Study cohort and validated across six independent datasets from four cohorts and two array platforms (EPIC and EPICv2). Results: EpiSmokEr2 achieved an average sensitivity of 0.87 and specificity of 0.86 in distinguishing current from never smokers. Predicted smoking status correlated strongly with established DNAm smoking scores and GrimAge, indicating its ability to capture biologically relevant smoking effects. Simulation analysis showed EpiSmokEr2 was robust for up to 10% missing CpGs. Conclusion: EpiSmokEr2 provides a reliable DNAm-based estimator of smoking status. It is available as an open-source R package on GitHub, facilitating broad use in epidemiological and clinical research.
dc.format.pagerange: 215
dc.format.pagerange: 205
dc.identifier.eissn: 1750-192X
dc.identifier.jour-issn: 1750-1911
dc.identifier.uri: https://www.utupub.fi/handle/11111/58983
dc.identifier.url: https://doi.org/10.1080/17501911.2026.2630841
dc.identifier.urn: URN:NBN:fi-fe2026042332981
dc.language.iso: en
dc.okm.affiliatedauthor: Raitakari, Olli
dc.okm.affiliatedauthor: Dataimport, tyks, vsshp
dc.okm.discipline: 1184 Genetics, developmental biology, physiology [en_GB]
dc.okm.discipline: 1184 Genetiikka, kehitysbiologia, fysiologia [fi_FI]
dc.okm.discipline: 3142 Public health care science, environmental and occupational health [en_GB]
dc.okm.discipline: 3142 Kansanterveystiede, ympäristö ja työterveys [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Future Medicine Ltd.
dc.publisher.country: United Kingdom [en_GB]
dc.publisher.country: Britannia [fi_FI]
dc.publisher.country-code: GB
dc.relation.doi: 10.1080/17501911.2026.2630841
dc.relation.ispartofjournal: Epigenomics
dc.relation.issue: 2
dc.relation.volume: 18
dc.title: EpiSmokEr2: a robust epigenetic classifier for smoking status inference using Illumina EPIC methylation data
dc.year.issued: 2026

Files

Name: EpiSmokEr2 a robust epigenetic classifier for smoking status inference using Illumina EPIC methylation data.pdf
Size: 5.54 MB
Format: Adobe Portable Document Format