Estimating health outcomes at the neighborhood scale is important for promoting urban health, yet doing so is costly and time-consuming. In this paper, we present a machine-learning-enabled approach to predicting the prevalence of six common non-communicable chronic diseases at the census tract level. We apply our approach to the City of Austin and show that it can yield fairly accurate predictions. In searching for the best predictive models, we experiment with eight machine learning algorithms and 60 predictor variables that characterize the social environment, the physical environment, and the aspects and degrees of neighborhood disorder. Our analysis suggests that (a) sociodemographic and socioeconomic variables are the strongest predictors of tract-level health outcomes and (b) historical records of 311 service requests can be a useful complementary data source, as the information distilled from the 311 data often helps improve the models' performance. The machine learning models produced in this study can help the public and city officials evaluate future scenarios and understand how changes in neighborhood conditions can lead to changes in health outcomes. By analyzing where the largest discrepancies between the predicted and actual values occur, we can also identify areas of best practice and areas in need of greater investment or policy intervention.