Predicting gas-particle partitioning coefficients of atmospheric molecules with machine learning

dc.contributor.author: Lumiaro Emma
dc.contributor.author: Todorovic Milica
dc.contributor.author: Kurten Theo
dc.contributor.author: Vehkamäki Hanna
dc.contributor.author: Rinke Patrick
dc.contributor.organization: fi=materiaalitekniikka|en=Materials Engineering|
dc.contributor.organization-code: 1.2.246.10.2458963.20.80931480620
dc.converis.publication-id: 67222683
dc.converis.url: https://research.utu.fi/converis/portal/Publication/67222683
dc.date.accessioned: 2022-10-28T13:29:55Z
dc.date.available: 2022-10-28T13:29:55Z
dc.description.abstract: The formation, properties, and lifetime of secondary organic aerosols in the atmosphere are largely determined by gas-particle partitioning coefficients of the participating organic vapours. Since these coefficients are often difficult to measure and to compute, we developed a machine learning model to predict them given molecular structure as input. Our data-driven approach is based on the dataset by Wang et al. (2017), who computed the partitioning coefficients and saturation vapour pressures of 3414 atmospheric oxidation products from the Master Chemical Mechanism using the COSMOtherm programme. We trained a kernel ridge regression (KRR) machine learning model on the saturation vapour pressure (P-sat) and on two equilibrium partitioning coefficients: between a water-insoluble organic matter phase and the gas phase (K-WIOM/G) and between an infinitely dilute solution with pure water and the gas phase (K-W/G). For the input representation of the atomic structure of each organic molecule to the machine, we tested different descriptors. We find that the many-body tensor representation (MBTR) works best for our application, but the topological fingerprint (TopFP) approach is almost as good and computationally cheaper to evaluate. Our best machine learning model (KRR with a Gaussian kernel + MBTR) predicts P-sat and K-WIOM/G to within 0.3 logarithmic units and K-W/G to within 0.4 logarithmic units of the original COSMOtherm calculations. This is equal to or better than the typical accuracy of COSMOtherm predictions compared to experimental data (where available). We then applied our machine learning model to a dataset of 35 383 molecules that we generated based on a carbon-10 backbone functionalized with zero to six carboxyl, carbonyl, or hydroxyl groups to evaluate its performance for polyfunctional compounds with potentially low P-sat. The resulting saturation vapour pressure and partitioning coefficient distributions were physico-chemically reasonable, for example, in terms of the average effects of the addition of single functional groups. The volatility predictions for the most highly oxidized compounds were in qualitative agreement with experimentally inferred volatilities of, for example, alpha-pinene oxidation products with as yet unknown structures but similar elemental compositions.
dc.format.pagerange: 13227
dc.format.pagerange: 13246
dc.identifier.eissn: 1680-7324
dc.identifier.jour-issn: 1680-7316
dc.identifier.olddbid: 182505
dc.identifier.oldhandle: 10024/165599
dc.identifier.uri: https://www.utupub.fi/handle/11111/39735
dc.identifier.urn: URN:NBN:fi-fe2021093048517
dc.language.iso: en
dc.okm.affiliatedauthor: Todorovic, Milica
dc.okm.discipline: 1171 Geosciences (en_GB)
dc.okm.discipline: 1172 Environmental sciences (en_GB)
dc.okm.discipline: 1171 Geotieteet (fi_FI)
dc.okm.discipline: 1172 Ympäristötiede (fi_FI)
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: COPERNICUS GESELLSCHAFT MBH
dc.publisher.country: Germany (en_GB)
dc.publisher.country: Saksa (fi_FI)
dc.publisher.country-code: DE
dc.relation.doi: 10.5194/acp-21-13227-2021
dc.relation.ispartofjournal: Atmospheric Chemistry and Physics
dc.relation.issue: 17
dc.relation.volume: 21
dc.source.identifier: https://www.utupub.fi/handle/10024/165599
dc.title: Predicting gas-particle partitioning coefficients of atmospheric molecules with machine learning
dc.year.issued: 2021
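
The abstract in this record describes kernel ridge regression (KRR) with a Gaussian kernel, trained on molecular descriptors to predict logarithmic partitioning coefficients. The following is a minimal sketch of that general technique, not the authors' implementation: the feature matrix is random synthetic data standing in for a descriptor such as MBTR or TopFP, the target is an arbitrary smooth function standing in for a log-scale property, and the function names and the hyperparameter values GAMMA and ALPHA are hypothetical choices made for illustration.

```python
import numpy as np

GAMMA = 0.1   # hypothetical Gaussian kernel width parameter
ALPHA = 1e-3  # hypothetical ridge regularization strength


def gaussian_kernel(A, B, gamma):
    """Return k(a, b) = exp(-gamma * ||a - b||^2) for every row pair of A and B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def krr_fit(X, y, gamma, alpha):
    """Solve (K + alpha * I) w = y for the regression weights w."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def krr_predict(X_train, weights, X_new, gamma):
    """Predict targets for X_new as kernel-weighted sums over the training set."""
    return gaussian_kernel(X_new, X_train, gamma) @ weights


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error, here in the (placeholder) logarithmic units of y."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic stand-in data: 200 "molecules" with 5 descriptor components each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]  # placeholder for a log10 property

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

weights = krr_fit(X_train, y_train, GAMMA, ALPHA)
mae = mean_absolute_error(y_test, krr_predict(X_train, weights, X_test, GAMMA))
print(f"test MAE (placeholder log units): {mae:.3f}")
```

In practice the kernel width and regularization strength would be chosen by cross-validation, and the reported error would depend on the descriptor and dataset rather than on the toy values used here.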

Files

Name: acp-21-13227-2021.pdf
Size: 1.53 MB
Format: Adobe Portable Document Format
Description: Publisher's PDF