Identification and Semiparametric Estimation of Conditional Means from Aggregate Data
https://arxiv.org/pdf/2509.20194
Ecological inference is the problem of estimating subgroup behavior (conditional means) from aggregate data alone, such as geographic averages. This paper introduces a semiparametric estimator based on debiased #machineLearning. It formalizes the identifying assumptions and can condition on many covariates to reduce bias. It also provides tools for sensitivity analysis, to check how #robust results are when those assumptions are violated, and for unit-level estimation. In applications to voting and pollution data, the authors report that the method outperforms traditional models in precision and speed.
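To make the setup concrete, here is a minimal sketch of the classic Goodman ecological regression, one of the traditional baselines, not the paper's estimator and not the seine API. It assumes two groups whose means are the same in every area, so each area's average is a share-weighted mix of the two group means. The data are simulated, and all names (`goodman_regression`, `share`, `outcome`) are hypothetical.

```python
import numpy as np

def goodman_regression(group_share, aggregate_mean):
    """Estimate two group means from area-level data.

    Assumes aggregate_mean_i = share_i * mean_A + (1 - share_i) * mean_B + noise,
    with mean_A and mean_B constant across areas (a strong assumption).
    """
    x = np.asarray(group_share, dtype=float)
    y = np.asarray(aggregate_mean, dtype=float)
    # No intercept: the two columns are the group shares, which sum to 1.
    design = np.column_stack([x, 1.0 - x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [estimated mean for group A, estimated mean for group B]

# Simulated areas (illustrative values only, not from the paper).
rng = np.random.default_rng(0)
n_areas = 500
share = rng.uniform(0.1, 0.9, size=n_areas)   # share of group A in each area
true_means = np.array([0.7, 0.3])             # unobserved group-level means
outcome = (share * true_means[0]
           + (1.0 - share) * true_means[1]
           + rng.normal(0.0, 0.02, size=n_areas))

estimates = goodman_regression(share, outcome)
print(estimates)
```

When group means vary across areas in ways related to the group shares, this regression is biased, which is the kind of problem that conditioning on covariates and running sensitivity analysis is meant to address.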
#Rstats package: https://corymccartan.com/seine/
#ecologicalinference #machinelearning #statistics #econometrics