Poverty rate prediction using multi-modal survey and earth observation data
- Simone Fobi Nsutezo
- Manuel Cardona
- Elliott Collins
- Caleb Robinson
- Anthony Ortiz
- Tina Sederholm
- Rahul Dodhia
- Juan M. Lavista Ferres
COMPASS '23: ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies
Published by Association for Computing Machinery
This work presents an approach for combining household demographic and living standards survey questions with features derived from satellite imagery to predict the poverty rate of a region. Our approach uses visual features obtained from a single-step featurization method applied to freely available 10 m/px Sentinel-2 surface reflectance satellite imagery. These visual features are combined with ten survey questions in a proxy means test (PMT) to estimate whether a household is below the poverty line. We show that including visual features reduces the mean error in poverty rate estimates from 4.09% to 3.88% over a nationally representative out-of-sample test set. In addition to including satellite imagery features in proxy means tests, we propose an approach for selecting a subset of survey questions that are complementary to the visual features extracted from satellite imagery. Specifically, we design a survey variable selection approach guided by the full survey and image features, and use it to determine the most relevant small set of survey questions to include in a PMT. We validate this choice in a downstream task: predicting the poverty rate using the small set of questions. This approach yields the best performance, with errors in poverty rate decreasing from 4.09% to 3.71%. Finally, we show that the extracted visual features encode geographic and urbanization differences between regions.
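The pipeline described in the abstract (household-level PMT, then aggregation to a regional poverty rate, then mean error over regions) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data is synthetic, the PMT is modeled as a plain logistic regression fitted by gradient descent, the image features are random per-region vectors standing in for Sentinel-2 features, and the regional rate is taken as the mean of predicted household probabilities. The function names (`fit_logistic`, `regional_poverty_rate`, and so on) are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=500, l2=1e-3):
    """Fit logistic regression by gradient descent; returns weights with bias last."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w

def predict_proba(X, w):
    """Predicted probability that each household is below the poverty line."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def regional_poverty_rate(values, region_ids, n_regions):
    """Average household-level values (labels or probabilities) within each region."""
    sums = np.bincount(region_ids, weights=values, minlength=n_regions)
    counts = np.bincount(region_ids, minlength=n_regions)
    return sums / np.maximum(counts, 1)  # regions with no households get rate 0

def mean_abs_error(true_rate, pred_rate):
    """Mean absolute difference between true and predicted regional rates."""
    return float(np.mean(np.abs(true_rate - pred_rate)))

# Synthetic stand-in data (assumption): 10 binary survey answers per household,
# and 8 image features shared by all households in the same region.
rng = np.random.default_rng(0)
n_regions, n_households = 20, 2000
region_ids = rng.integers(0, n_regions, size=n_households)
survey = rng.integers(0, 2, size=(n_households, 10)).astype(float)
image = rng.normal(size=(n_regions, 8))[region_ids]
logit = survey @ rng.normal(size=10) + image @ rng.normal(size=8) - 0.5
is_poor = (rng.random(n_households) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Random household-level split into training and held-out test sets.
train = rng.random(n_households) < 0.7
test = ~train

def evaluate(X):
    """Train on the training households, report regional rate error on the test set."""
    w = fit_logistic(X[train], is_poor[train])
    pred = predict_proba(X[test], w)
    true_rate = regional_poverty_rate(is_poor[test], region_ids[test], n_regions)
    pred_rate = regional_poverty_rate(pred, region_ids[test], n_regions)
    return mean_abs_error(true_rate, pred_rate)

err_survey = evaluate(survey)
err_combined = evaluate(np.hstack([survey, image]))
print(f"survey only:    {err_survey:.4f}")
print(f"survey + image: {err_combined:.4f}")
```

The two printed numbers compare a survey-only PMT against one that also sees the per-region image features, mirroring the comparison the abstract reports. Because the data here is synthetic, the values say nothing about the paper's actual results.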