TY - JOUR
T1 - Assessing the Nonlinear Effect of Atmospheric Variables on Primary and Oxygenated Organic Aerosol Concentration Using Machine Learning
AU - Qin, Yiming
AU - Ye, Jianhuai
AU - Ohno, Paul
AU - Liu, Pengfei
AU - Wang, Junfeng
AU - Fu, Pingqing
AU - Zhou, Liyuan
AU - Li, Yong Jie
AU - Martin, Scot T.
AU - Chan, Chak K.
N1 - Generated from Scopus record by KAUST IRTS on 2023-07-06
PY - 2022/4/21
Y1 - 2022/4/21
N2 - Organic aerosol (OA) accounts for a significant fraction of atmospheric particulate matter. The OA concentration in the atmosphere is of high variability and depends on factors such as emission, the atmospheric oxidation process, meteorology, and transport. Due to the complex interactions among the numerous factors, accurate estimation of the effects of target variables on OA concentration is often challenging. Herein, a random forest machine learning algorithm successfully predicted the concentrations of primary and oxygenated organic aerosol (POA and OOA) at urban and rural sites in Hong Kong. The random forest model explained more than 80% of the observed traffic-POA, cooking-POA, and OOA. In contrast, a multiple linear regression model only explained 30-50% of these OA concentrations. In the random forest model training process, NOx was also the most important variable for traffic-POA and cooking-POA. For OOA, multiple parameters were equally crucial in the model prediction, including NOx, O3, and relative humidity (RH). The dependence of OA concentrations on atmospheric conditions (e.g., various NOx and O3 concentrations and meteorological conditions) was calculated via the partial dependence algorithm. The results suggested that the dependence of OA concentrations on atmospheric conditions was nonlinear and depended on different condition regimes. The partial dependence algorithm provides insights into the POA source and OOA formation mechanisms under a complex environment.
AB - Organic aerosol (OA) accounts for a significant fraction of atmospheric particulate matter. The OA concentration in the atmosphere is of high variability and depends on factors such as emission, the atmospheric oxidation process, meteorology, and transport. Due to the complex interactions among the numerous factors, accurate estimation of the effects of target variables on OA concentration is often challenging. Herein, a random forest machine learning algorithm successfully predicted the concentrations of primary and oxygenated organic aerosol (POA and OOA) at urban and rural sites in Hong Kong. The random forest model explained more than 80% of the observed traffic-POA, cooking-POA, and OOA. In contrast, a multiple linear regression model only explained 30-50% of these OA concentrations. In the random forest model training process, NOx was also the most important variable for traffic-POA and cooking-POA. For OOA, multiple parameters were equally crucial in the model prediction, including NOx, O3, and relative humidity (RH). The dependence of OA concentrations on atmospheric conditions (e.g., various NOx and O3 concentrations and meteorological conditions) was calculated via the partial dependence algorithm. The results suggested that the dependence of OA concentrations on atmospheric conditions was nonlinear and depended on different condition regimes. The partial dependence algorithm provides insights into the POA source and OOA formation mechanisms under a complex environment.
UR - https://pubs.acs.org/doi/10.1021/acsearthspacechem.1c00443
UR - http://www.scopus.com/inward/record.url?scp=85126598054&partnerID=8YFLogxK
U2 - 10.1021/acsearthspacechem.1c00443
DO - 10.1021/acsearthspacechem.1c00443
M3 - Article
SN - 2472-3452
VL - 6
SP - 1059
EP - 1066
JO - ACS Earth and Space Chemistry
JF - ACS Earth and Space Chemistry
IS - 4
ER -