It is springtime in Eastern Washington, USA, and the temperature is slightly above freezing. A farmer is preparing to fertilize his fields of wheat and lentils as winter runoff and frost are nearly finished. The plants are susceptible to fertilizer at freezing temperatures, so the farmer checks forecasts from the local weather station, which is about 50 miles away. The three-day outlook shows temperatures above freezing. The farmer rents equipment and starts fertilizing the farm. But at night, the temperature in parts of the fields drops below freezing and kills around 20% of the crops. This is unfortunately a common situation, since climatic parameters can vary over short distances and even between sections of the farm.
To address this problem and others, we developed DeepMC, a framework for predicting micro-climates, or the accumulation of climatic parameters formed around a relatively small, homogeneous region. Micro-climate predictions are beneficial in agriculture, forestry, architecture, urban design, ecology conservation, maritime and other domains. DeepMC predicts various micro-climate parameters with over 90% accuracy at IoT sensor locations deployed around the world.
This work is part of a Microsoft Research initiative, Research for Industry, which aims to address challenges including climate change, pandemics, and food security through technological breakthroughs. To learn more about the work Microsoft is doing to enable data-driven farming, check out the FarmBeats: AI, Edge, and IoT agriculture project page.
Prediction meets practical decision making
Climatic parameters are stochastic (randomly occurring) in nature, making them difficult to model for prediction tasks. The methodology used to build the prediction model must meet four significant challenges:
- Accuracy: The scarcity of labelled datasets, heterogeneity of features and nonstationarity of input features make it difficult to generate highly accurate results.
- Reliability: Nonstationarity of the climatic time series data makes it difficult to reliably characterize the input-output relationships. Each input feature affects the output variable at a different temporal scale. For example, the effect of precipitation on soil moisture is instantaneous while the effect of temperature on soil moisture accumulates over time.
- Replicability: Any system for micro-climate predictions is expected to perform across varied terrains, geographies and climate conditions, where high-quality labelled data may not be available. Smarter techniques are required to transfer models learned in one domain to another domain with few paired labelled datasets.
- Adaptability: Various factors influence the trend of a particular climatic parameter. For example, soil moisture predictions are correlated with climatic parameters such as ambient temperature, humidity, precipitation and soil temperature, while ambient humidity is correlated with ambient temperature, wind speed and precipitation. A machine learning system must be able to accept vectors of varying dimensions as input to replicate predictions for different use cases.
DeepMC is designed to satisfy each of these requirements, which we discuss below, along with building the appropriate architecture. We also present scenarios where DeepMC is being used today and examine its potential impact on environmental sustainability and broader industrial applications. For deeper details, read our paper titled “Micro-climate Prediction – Multi Scale Encoder-decoder based Deep Learning Framework,” which was published in the Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
Data requirements
DeepMC builds on top of the Azure FarmBeats platform to predict micro-climatic parameters in real-time, with inputs from weather station forecasts and IoT sensors. The parameters collected depend on the predicted variable of interest. We can collect current data as well as forecasts for ambient temperature, ambient pressure, humidity, soil moisture, soil temperature, radiation, precipitation, wind speed and wind direction.
Methodology
The DeepMC learning framework, shown in Figure 1, is based on a sequence-to-sequence encoder-decoder framework consisting of five distinct parts: 1) pre-processor, 2) forecast error computer, 3) wavelet packet decomposition, 4) multi-scale deep learning, and 5) attention mechanism. The decoder combines a multi-layer long short-term memory (LSTM) network with fully connected layers. Each component is described in the following subsections with some implementation details for the sake of reproducibility.
Sensor data is received using IoT sensors deployed on the farm. The raw data is usually noisy, with missing data and varying temporal resolution. We standardize the temporal resolution using average values for the data collected.
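The averaging step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the DeepMC implementation: the function name `resample_mean`, the one-hour bin width, and the choice to mark empty bins as `None` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def resample_mean(readings, step=timedelta(hours=1)):
    """Average irregular (timestamp, value) readings into fixed-width bins.

    Bins start at the earliest timestamp. Bins with no usable reading are
    returned as None so gaps stay visible to a later imputation step
    (imputation is not shown here).
    """
    if not readings:
        return []
    start = min(t for t, _ in readings)
    buckets = defaultdict(list)
    for t, value in readings:
        if value is None:  # skip dropped sensor samples
            continue
        buckets[(t - start) // step].append(value)
    last = max((t - start) // step for t, _ in readings)
    return [
        (start + i * step, mean(buckets[i]) if buckets[i] else None)
        for i in range(last + 1)
    ]


# Hypothetical soil-temperature readings at uneven intervals.
readings = [
    (datetime(2021, 4, 1, 0, 0), 1.0),
    (datetime(2021, 4, 1, 0, 20), 3.0),
    (datetime(2021, 4, 1, 0, 50), None),  # missing sample
    (datetime(2021, 4, 1, 2, 10), 5.0),
]
hourly = resample_mean(readings)
for timestamp, value in hourly:
    print(timestamp.isoformat(), value)
```

Keeping empty bins explicit, rather than silently dropping them, makes the gaps in the sensor record easy to spot before the data reaches the model.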
Figure 1. DeepMC architecture for the multi-scale encoder-decoder deep learning system. The architecture consists of six distinct parts: A) the preprocessor, B) forecast error computation, C) wavelet packet decomposition, D) multi-scale deep learning, E) attention mechanism, F) decoder.
DeepMC uses weather station forecasts of the predicted variable to learn better models for micro-climate predictions. Instead of predicting the climatic parameter directly, we predict the error between the nearest commercial weather station forecast and the local micro-climate forecast. This is based on the hypothesis that hyperlocalization of weather station forecasts is more efficient than learning the relationships of the predicted climatic parameter y with the other parameters z and auto-relationship of the y with itself at earlier times.
One artifact of using the forecast error as the predictor signal is that it does not inherently capture the effect of the distance between the weather station and the point of interest. For this purpose, we include a Relative Latitude (RLat) and Relative Longitude (RLong) as additional features.
RLat = Lat(Weather Station)−Lat(Micro−region),
RLong = Long(Weather Station)−Long(Micro−region).
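As a rough sketch of how the error target and the two relative-position features might be assembled, consider the following. The function names, the sign convention for the error (station forecast minus local observation), and the sample coordinates are assumptions for illustration; the article does not specify them.

```python
def forecast_error_target(station_forecast, sensor_observed):
    """Per-timestep error: station forecast minus local observation.

    The sign convention is an assumption; only consistency matters.
    """
    if len(station_forecast) != len(sensor_observed):
        raise ValueError("series must be aligned and of equal length")
    return [f - o for f, o in zip(station_forecast, sensor_observed)]


def relative_position(station_latlon, region_latlon):
    """RLat and RLong as defined above: station minus micro-region."""
    rlat = station_latlon[0] - region_latlon[0]
    rlong = station_latlon[1] - region_latlon[1]
    return rlat, rlong


def localize(station_forecast, predicted_error):
    """Turn a predicted error back into a local forecast."""
    return [f - e for f, e in zip(station_forecast, predicted_error)]


# Hypothetical temperatures in degrees Celsius.
station = [2.0, 1.5, 1.0]      # weather station forecast
observed = [1.0, 0.0, -1.5]    # what the farm sensor recorded
error = forecast_error_target(station, observed)

# Hypothetical coordinates (latitude, longitude).
rlat, rlong = relative_position((47.0, -117.5), (46.5, -117.0))

print(error, rlat, rlong)
print(localize(station, error))  # recovers the sensor values
```

At prediction time the model would supply `predicted_error`, and `localize` applies it to the station forecast to obtain the micro-climate forecast.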
Wavelet Packet Decomposition (WPD) is a classical signal processing method built on wavelet analysis, which gives an efficient way to decompose time series from the time domain to the scale domain. It localizes the change across time within different scales of the original signal. Decomposing the original time series using WPD gives signals with multiple levels of trends and details. In the context of climatic data, this corresponds to variations such as long-term trends, yearly variation, seasonal variation, daily variations, etc.
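To make the idea concrete, here is a small wavelet packet decomposition written from scratch. It assumes the Haar wavelet and a signal whose length is divisible by 2 to the power of the number of levels; the article does not say which wavelet or depth DeepMC uses, so treat this as an illustration of the technique only.

```python
import math


def haar_step(signal):
    """Split an even-length signal into approximation and detail halves."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet(signal, levels):
    """Return the 2**levels leaf nodes of a full Haar wavelet packet tree.

    Unlike the plain wavelet transform, both the approximation and the
    detail branch are split again at every level.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes


# Hypothetical hourly temperature readings.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
leaves = wavelet_packet(signal, levels=2)
for index, node in enumerate(leaves):
    print(index, [round(x, 3) for x in node])

# The Haar transform is orthonormal, so total energy is preserved.
energy_in = sum(x * x for x in signal)
energy_out = sum(x * x for node in leaves for x in node)
```

The first leaf holds the smoothest, slowest-moving component of the series, while the remaining leaves capture progressively finer detail.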
Once we have prepared the output data from WPD in the previous step, \(o_{WPD}^{(n,m)}, \forall n,m \in [1,N]\), we input it into the deep learning network. We separate the data into long-scale (\(n\) or \(m = 1\)), medium-scale (\(n, m \in [2, N-1]\)) and short-scale (\(n\) or \(m = N\)) signals. The long-scale signals pass through a CNN-LSTM stack. The medium-scale and the short-scale signals pass through a multi-layered CNN stack. For the data with short-term dependencies (medium- and short-scale data), the CNN layer has similar performance and faster computing speed when compared to the LSTM recurrent layer. Thus we use CNN network layers for the medium- and short-scale data. For the long-scale data, the CNN network layers extract the deep features of the time series and the LSTM layer sequentially processes the temporal data with long-term and short-term dependence. Therefore, a CNN-LSTM architecture is used for long-scale data.
DeepMC uses two levels of attention models, similar to those used in vision-to-language tasks. First is a long-range, guided attention model which is used with the CNN-LSTM output and memorizes the long-term dynamics of the input time series. We use a position-based content attention model described by Cinar et al. 2017 for this level.
The second-level attention model is a scale-guided attention model and is used to capture the respective weighting of different scales. The scale-guided attention model uses an additive attention mechanism. The outputs of the multi-scale model (including the output of the long-range guided attention mechanism on the CNNLSTM stack) are represented as \(o^{(m,n)}, m,n \in [1,N]\).
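A generic additive attention step over per-scale feature vectors could look like the sketch below. It is not the exact variant used in DeepMC: the parameter names `W`, `b` and `v`, their shapes, and the hand-picked values are assumptions, and in a real model these parameters would be learned during training.

```python
import math


def additive_attention(features, W, b, v):
    """Weight per-scale feature vectors with additive (tanh) scoring.

    features: list of S vectors of length d, one per scale.
    W: k x d matrix, b: length-k bias, v: length-k scoring vector.
    Returns (weights, context); the weights sum to 1.
    """
    scores = []
    for h in features:
        hidden = [
            math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)
        ]
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, hidden)))
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(a * h[j] for a, h in zip(weights, features))
        for j in range(len(features[0]))
    ]
    return weights, context


# Three hypothetical scale outputs, each a 2-dimensional feature vector.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # hand-picked, stands in for learned weights
b = [0.0, 0.0]
v = [1.0, 1.0]
weights, context = additive_attention(features, W, b, v)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The context vector is a weighted blend of the scale outputs, so scales that score higher contribute more to what the decoder sees.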
Figure 3. Micro-climate wind speed prediction comparisons at the 24th hour with a resolution of one hour over a 10-day period.
The DeepMC decoder uses an LSTM to generate a sequence of L outputs, where L is the number of future timesteps to be predicted. The decoder LSTM layer receives a multivariate encoded time series and produces a vector for each step of prediction. Each output of the LSTM is connected to two time-distributed, fully connected layers.
Real-world deployments
DeepMC is deployed across many different regions of the world on top of Azure FarmBeats. In this section, we present four real-world scenarios in agriculture and energy where weather conditions impact operations. We also compare the results with common models used for prediction tasks and with some variations on DeepMC.
Comparison: micro-wind speed predictions
Figure 3 shows the wind speed predictions at the 24th hour over a period of 10 days with one-hour resolution. Figure 4 plots the RMSE (Root Mean Squared Error) for each hourly prediction and compares it with other models. DeepMC shows significantly better performance and is more likely to follow the details and trends of the time series data. Other models used for comparison (in this case for wind speed) are the CNNLSTM model, a modified CNNLSTM with LSTM decoder, a regular convolutional network with LSTM decoder, a vanilla LSTM-based forecaster, and a vanilla CNN-based forecaster. Of interest: the performance of all models decreases as the prediction horizon increases, which is to be expected, as predicting the next immediate hour is easier than forecasting the 24th hour. Figure 5 plots the RMSE, MAE (Mean Absolute Error) and MAPE (Mean Absolute Percentage Error) values for micro-wind speed predictions using various models, averaged over all 24-step predictions.
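For reference, the three error metrics named above can be computed as in this short sketch. The sample wind speeds are made up for illustration and are not taken from the DeepMC experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical wind speeds in metres per second.
actual = [4.0, 5.0, 8.0, 10.0]
predicted = [5.0, 5.0, 6.0, 10.0]

print(round(rmse(actual, predicted), 3))
print(mae(actual, predicted))
print(mape(actual, predicted))
```

RMSE penalizes large misses more heavily than MAE, which is why the two can rank models differently when a forecaster occasionally misses a sharp gust.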
Solar Farm: micro-radiation predictions
Micro-radiation predictions are required to estimate the electricity produced at commercial solar farms. These predictions are fed into an optimization model to fulfill price and energy commitments by the utility company in the energy markets. Radiation received at the solar panel is sensitive to seasons of high overcast or rain. Figure 6 plots the predictions across months during the overcast season and after. The predictions attain a high accuracy for the month after the monsoon in July, with scores MASE = 1.86, MAE = 65.14, RMSE = 116.30. Table 1 compares DeepMC’s RMSE, MAE and MASE scores with other commonly used models.
Metric | DeepMC | CNN | LSTM | CNNLSTM | ARIMA |
---|---|---|---|---|---|
RMSE | 124.5 | 167.4 | 192.3 | 155.6 | 530.60 |
MAE | 68.15 | 111.77 | 130.99 | 90.02 | 397.45 |
MASE | 1.95 | 3.20 | 3.75 | 2.89 | 11.39 |
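MASE is commonly read as Mean Absolute Scaled Error: the forecast MAE divided by the MAE of a naive baseline. The sketch below uses the common one-step naive baseline computed on a training series. The article does not state which baseline it uses, so this is an assumption, and values produced this way are not directly comparable to Table 1.

```python
def mase(actual, predicted, training):
    """Forecast MAE scaled by the MAE of a one-step naive forecast.

    The naive forecast predicts each training value with the previous
    one. This baseline choice is an assumption for illustration.
    """
    forecast_mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    naive_errors = [abs(training[t] - training[t - 1]) for t in range(1, len(training))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant; MASE is undefined")
    return forecast_mae / scale


# Hypothetical radiation values, arbitrary units.
training = [0.0, 2.0, 4.0, 6.0]
actual = [8.0, 9.0]
predicted = [7.0, 11.0]
score = mase(actual, predicted, training)
print(score)
```

Because the error is scaled by a baseline on the same series, MASE can be compared across sites with very different radiation levels, which raw MAE cannot.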
Phenotyping research: micro-soil-moisture predictions
Vine tomatoes are susceptible to rot if they sit too close to soil with high moisture values. Growers use trellises to raise the vines and provide structural stability, but this adds challenges. Growing tomatoes without trellises critically requires accurate predictions of local soil moisture values. The farmer uses DeepMC to analyze micro-soil-moisture conditions using data from IoT sensors, with ambient temperature, ambient humidity, precipitation, wind speed, soil moisture and soil temperature as predictors, together with historical soil moisture data from the weather station. The results are shown in Figure 8, with a recorded RMSE of 3.11 and a MAPE of 14.03% (implying an 85.97% accuracy). Soil moisture values increase rapidly during times of heavy rainfall and slowly decrease during extended dry periods, which is observed in Figure 8. DeepMC tracks these sharp changes fairly accurately, and much better than the weather station forecasts, which demonstrates the robustness of the model.
Discussion, sustainability, and conclusion
DeepMC generates micro-climate predictions using relatively affordable IoT sensors, helping farmers apply chemicals with better timing and effectiveness, which saves money and improves sustainability.
“The ability to quickly apply the results that AI models produce is a great advantage,” says Andrew Nelson, who has deployed FarmBeats on his farm in eastern Washington.
“And the future predictions that AI provides help us maximize our investment of time and money, with larger scale testing of different techniques that have improved profitability, sustainability, and sometimes both.”
In a labor-intensive business like farming, data can help make decisions that would otherwise be too complicated and time consuming, helping farmers optimize their resources and their productivity.
“During busy seasons, we are already working during all available sunlight,” Nelson says. “Any time savings means more time to tend to the crops, which usually leads to higher yields.”
DeepMC also helps make commercial renewable energy production more efficient. Energy utility companies can better fulfill their power and price commitments if they can successfully predict radiation and wind speed at their solar and wind farms.
“Renewable forecasting and decision-making under uncertainty forms the AI foundation for the future of deep decarbonization of electricity grids involving high levels of renewables integration,” says Shivkumar Kalyanaraman, Chief Technology Officer, Energy & Mobility, Azure Global, Microsoft India.
“The DeepMC forecasting engine has demonstrated its accuracy and versatility in handling different types of renewables, such as wind and solar, as well as different configurations, and different geographies. The combination of accuracy, robustness, flexibility and scalability is important to help the renewables industry evolve toward a software-and-AI-driven future.”
DeepMC achieved compelling results on multiple micro-climate prediction tasks. To the best of our knowledge, this is the most versatile study and framework for micro-climate prediction for multiple climatic parameters and multiple geographical conditions. Still, we found many opportunities for further improvement in reliability, robustness, and accuracy. Specifically, the model is brittle on transfer learning. We observe that it requires careful hyper-parameter tuning and initialization to achieve good performance. Additional work using GANs can be explored to increase the transferability of the DeepMC framework. Nonetheless, DeepMC is being used to improve decisions on many farms today.