Abstract

With the increasing use of machine learning models in computational socioeconomics, methods for explaining these models and understanding the underlying causal connections are gaining importance. In this work, we advocate the use of an explanatory framework from cooperative game theory augmented with do-calculus, namely causal Shapley values. Using causal Shapley values, we analyze socioeconomic disparities that have a causal link to the spread of COVID-19 in the USA. We study several phases of the disease spread to show how the causal connections change over time. We also perform a causal analysis using random effects models and discuss the correspondence between the two methods to verify our results. We show the distinct advantages that non-linear machine learning models have over linear models when performing a multivariate analysis, especially since the machine learning models can map out non-linear correlations in the data. In addition, causal Shapley values allow the causal structure to be included in the variable importance computed for the machine learning model.
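To make the game-theoretic part of this framework concrete, the following is a minimal sketch of an exact Shapley value computation over a small set of features. It is an illustration only, not the authors' implementation: the function names `shapley_values` and `toy_value`, the feature names, and the numbers in the value function are all hypothetical. In a causal Shapley analysis, the value of a coalition S would typically be an interventional expectation of the model output, such as E[f(X) | do(X_S = x_S)], estimated from data and an assumed causal graph; here a hand-written toy function stands in for it.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function.

    `value` maps a frozenset of players to a float. The cost is exponential
    in the number of players, so this only suits small examples.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i when joining coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_value(coalition):
    """Hypothetical stand-in for an interventional value function.

    Two features contribute additively and share one interaction term;
    the third feature has no effect.
    """
    v = 0.0
    if "income" in coalition:
        v += 2.0
    if "density" in coalition:
        v += 1.0
    if "income" in coalition and "density" in coalition:
        v += 1.0  # non-linear interaction, split evenly by the Shapley weights
    return v


features = ["income", "density", "age"]  # hypothetical feature names
phi = shapley_values(features, toy_value)
```

The attributions sum to the value of the full coalition (the efficiency property of Shapley values), and the interaction term is shared equally between the two features that produce it, which is one way such attributions can reflect non-linear structure that a purely additive linear model would miss.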

Tannista Banerjee et al., Causal connections between socioeconomic disparities and COVID-19 in the USA, Sci Rep