Empirical network studies typically analyze partial samples of the population of interest. However, partially sampled network data systematically bias the measured properties of observed networks and give rise to a non-classical measurement-error problem when those properties are used as regressors. This paper analyzes the statistical issues that arise when networks are sampled non-randomly. Combining theory, numerical experiments, and empirical applications, we illustrate the biases in both network statistics and estimates of network effects under non-random sampling. We then propose a methodology that adapts post-stratification weighting to networked contexts; it recovers several network-level statistics and reduces the bias in the estimated effects of these statistics on individual- and network-level outcomes. The proposed methodology outperforms the corrections proposed in the literature that assume random sampling.
Joint work with Chih-Sheng Hsieh (Chinese University of Hong Kong), Stanley I.M. Ko (University of Macau), and Trevon Logan (Ohio State University & NBER).