Microsoft and Tsinghua University jointly propose the DeepRSM model to help control air pollution with AI
Over the past few decades, rapid industrial and agricultural development, population growth, and the harmful gases produced by human activities have caused serious air pollution that endangers human health. Beyond causing lung disease and discomfort, studies have shown that the fatality rate associated with air pollution has ranked first among all environmental problems over the past 25 years [1].
Reducing pollutant emissions, through measures such as factory shutdowns, motor vehicle restrictions, and limits on residential coal use, is the most important way to curb air pollution. However, it is difficult to deduce how air quality will respond simply by carrying out an emission reduction, because the chemical process by which reactant gases are emitted into the air, react in the atmosphere, and finally form pollutants is very complex and nonlinear.
Characterizing the relationship between emissions and pollutant concentration
Environmentalists have developed a complex numerical model, the chemical transport model (CTM), to calculate the final concentration of pollutants in the air (air quality) from emissions. The basic principle is to use numerical functions to simulate the entire chemical reaction process of air pollutants while combining light, geographic, and meteorological information to obtain the final concentration of pollutants in the air.
Although the CTM can satisfactorily simulate the relationship between air quality and emissions, an accurate simulation of one particular emission reduction plan is far from enough. For policy makers, what matters most is quickly identifying the plan with the best emission reduction effect and the lowest cost. A fine-grained, high-precision emission-pollutant concentration response surface model (RSM) is therefore necessary. A common method is the polynomial function RSM (pfRSM): randomly sample multiple emission reduction schemes (each corresponding to different reductions of different pollutants) and use the CTM to simulate the air quality for each one. The simulated air quality at these points on the response surface is then used to estimate the coefficients of the polynomial function by least squares or interpolation. The cost of this approach is O(N) CTM simulations, where N is the number of sampled schemes.
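The pfRSM procedure described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_ctm` is a made-up quadratic function standing in for an expensive CTM run, the two precursors (labelled NOx and VOC) and the polynomial degree are arbitrary choices, and none of the names come from the paper.

```python
import itertools
import numpy as np


def toy_ctm(r):
    """Hypothetical stand-in for one expensive CTM run (not a real model).

    r holds emission reduction coefficients for two precursors,
    where 1.0 means baseline emissions. Returns a made-up concentration.
    """
    r_nox, r_voc = r
    return 40.0 * r_nox + 25.0 * r_voc - 10.0 * r_nox * r_voc + 5.0 * r_nox ** 2


def poly_features(R, degree=2):
    """Build all monomials of the columns of R up to `degree`, plus a constant."""
    R = np.atleast_2d(np.asarray(R, dtype=float))
    n_vars = R.shape[1]
    cols = [np.ones(R.shape[0])]
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(range(n_vars), d):
            cols.append(np.prod(R[:, list(combo)], axis=1))
    return np.column_stack(cols)


def fit_pf_rsm(n_samples=30, degree=2, seed=0):
    """Sample N reduction schemes, 'simulate' each, and fit by least squares."""
    rng = np.random.default_rng(seed)
    R = rng.uniform(0.0, 1.0, size=(n_samples, 2))
    # In practice each of these N evaluations is a full CTM simulation.
    y = np.array([toy_ctm(r) for r in R])
    coef, *_ = np.linalg.lstsq(poly_features(R, degree), y, rcond=None)
    return coef


def predict(coef, r, degree=2):
    """Evaluate the fitted response surface at one reduction scheme r."""
    return float(poly_features(r, degree)[0] @ coef)


if __name__ == "__main__":
    coef = fit_pf_rsm()
    scheme = [0.5, 0.8]
    print("fitted surface:", predict(coef, scheme))
    print("toy CTM       :", toy_ctm(scheme))
```

Once the coefficients are fitted, evaluating a new scheme costs only a polynomial evaluation. The expensive part is the loop that generates `y`, which is why the number of sampled schemes dominates the overall cost.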
However, the CTM is computationally very expensive (it solves partial differential equations by continuous integration over the time dimension) and is time-consuming and labor-intensive. Simulating one day’s data often takes 1-2 weeks on a supercomputer cluster, so its timeliness is also very poor. More importantly, there may be ten or more types of harmful gas emissions, so many different emission reduction plans must be sampled and simulated with the CTM to obtain an accurate approximation of the response surface. This is not practical in actual policy-making.
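To give a rough sense of why the number of required samples grows quickly with the number of emission types, the short sketch below counts the coefficients of a full polynomial in n variables of degree d, which is C(n + d, d). A least-squares fit needs at least that many sampled schemes. The polynomial degrees used here are illustrative assumptions, not values from the paper.

```python
from math import comb


def min_samples(n_pollutants, degree):
    """Number of coefficients in a full polynomial of the given degree.

    This is also the minimum number of sampled schemes (CTM runs)
    needed to determine all coefficients.
    """
    return comb(n_pollutants + degree, degree)


if __name__ == "__main__":
    for degree in (2, 3, 4):
        print(f"10 emission types, degree {degree}: "
              f"at least {min_samples(10, degree)} CTM runs")
```

With ten emission types, the lower bound rises from 66 runs at degree 2 to 1001 runs at degree 4, and each run is itself a full CTM simulation.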
DeepRSM: More accurate “emission-pollutant concentration” response surface model with significant cost reduction
How is artificial intelligence helping environmentalists solve this problem?
Recently, researchers from Microsoft Research Asia and Tsinghua University jointly published a paper titled “Deep Learning for Prediction of the Air Quality Response to Emission Changes” in the leading environmental science journal Environmental Science & Technology [2]. They proposed the DeepRSM model and explained in detail how artificial intelligence can help environmentalists obtain more accurate and fine-grained response surface models at O(1) complexity while reducing the cost of a single modeling run by more than 90%.
First, if a grid cell in the CTM is regarded as a black box, the rate of change of pollutant concentration over time can be expressed as:
Where [P] is the concentration of pollutants (such as PM2.5 or O3); f_i is a numerical equation describing the process i (such as transport, chemical reaction, or deposition) that contributes to the concentration of pollutants; k_i is a variable related to geography and meteorology; and [I_s] represents the concentration of reactants in the air. If the reactants that come from emitted harmful gases and go on to produce pollutants are separated out as independent variables, their concentration is proportional to the emissions E_p, so E1 can be expressed as:
The CTM encodes environmentalists’ understanding of a large number of complex chemical reaction processes, and simply approximating it with artificial intelligence algorithms cannot guarantee accuracy. Therefore, E2 is divided further so that the artificial intelligence algorithm does not have to approximate the complex chemical reaction processes. The emission amount E_p can be further expressed as the baseline emission amount E_(p_0) multiplied by an emission reduction coefficient r_p.
When external conditions do not change, the chemical reaction will reach equilibrium. At this point, the concentration of the product (pollutant) is only related to the concentration of the reactant. A relationship function R( ∙ ) can be used to express the relationship between the reactant, the product, and the external conditions. We will further split E3:
In E4, the CTM simulation is retained for the baseline scenario E_(p_0), and a deep learning model is used to characterize the relationship function R( ∙ ). Because R( ∙ ) does not involve any complex chemical reaction processes and is only concerned with the relationship between points on the response surface, the deep learning model can achieve very low approximation error.
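The chain of steps E1 through E4 can be sketched in LaTeX as follows. This is a reconstruction from the symbol definitions in the text, not a copy of the paper's equations. The additive form over processes, the symbol [I_{s'}] for the remaining reactants, the symbol [P]_0 for the baseline CTM output, and the exact arguments of R are all assumptions.

```latex
\begin{align*}
% E1: rate of change of the pollutant concentration in one grid cell,
%     assumed to be a sum over the contributing processes i
\frac{d[P]}{dt} &= \sum_i f_i\bigl(k_i,\,[I_s]\bigr) \\
% E2: emission-driven reactants separated out; their concentration is
%     proportional to E_p, and [I_{s'}] (assumed symbol) is everything else
\frac{d[P]}{dt} &= \sum_i f_i\bigl(k_i,\,E_p,\,[I_{s'}]\bigr) \\
% E3: emissions written as baseline emissions times a reduction coefficient
E_p = r_p\,E_{p_0}
\;\Rightarrow\;
\frac{d[P]}{dt} &= \sum_i f_i\bigl(k_i,\,r_p E_{p_0},\,[I_{s'}]\bigr) \\
% E4: at equilibrium, a relationship function R maps the baseline CTM
%     result [P]_0 (assumed symbol) and r_p to the pollutant concentration
[P] &\approx R\bigl(r_p,\,[P]_0\bigr),
\qquad [P]_0 = \mathrm{CTM}\bigl(E_{p_0},\,k_i\bigr)
\end{align*}
```

Read this way, the expensive chemistry is evaluated once by the CTM for the baseline scenario, and only R, which relates points on the response surface, is left for the deep learning model to learn.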
The researchers validated the model on air quality response surface data from four regions at different scales: mainland China (CN27), the North China Plain (NCP), the Fenwei Plain (FWP), and the Sichuan-Chongqing Region (CYR). On all four data sets, DeepRSM modeled the response surface significantly better than previous methods and reduced the required sampling time and numerical simulation costs by more than 90%.
In addition to using artificial intelligence to help control natural environmental pollution, Microsoft is also paying close attention to many important issues related to the sustainable development of mankind and is exploring the application of AI technology in fields such as energy conservation and emission reduction, medicine and health, and smart city construction, continuing to use its latest technologies to promote sustainable development.
References
[1] Cohen AJ, Brauer M, Burnett R, et al. Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the Global Burden of Diseases Study 2015[J]. The Lancet, 2017, 389(10082): 1907-1918.
[2] Xing J, Zheng S, Ding D, et al. Deep learning for prediction of the air quality response to emission changes[J]. Environmental Science & Technology, 2020.