Abstract
We propose a procedure to identify latent group structures in nonlinear panel data models where some regression coefficients are heterogeneous across groups but homogeneous within a group, and where both the number of groups and the group membership are unknown. To identify the group structures, we consider the order statistics of preliminary unconstrained consistent estimates of the regression coefficients and translate the classification problem into a break detection problem. We then extend the sequential binary segmentation algorithm of Bai (1997) for break detection from the time series setup to the panel data framework. We demonstrate that our method identifies the true latent group structures with probability approaching one and that the post-classification estimators are oracally efficient. In addition, our method is easy to implement in comparison with competing methods in the literature, which is especially desirable for nonlinear panel data models. To improve the finite sample performance of our method, we also consider an alternative version based on the spectral decomposition of a certain estimated matrix and link our group identification problem to the community detection problem in the network literature. Simulations show that our method has good finite sample performance. We apply our method to explore how individuals’ portfolio choices respond to their financial status and other characteristics using Netherlands household panel data from 1993 to 2015, and find two latent groups.
JEL Classification: C33, C38, C51.
Keywords: Binary segmentation algorithm, clustering, community detection, network, oracle estimator, panel structure model, parameter heterogeneity, singular value decomposition.
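As an illustration of the idea summarized in the abstract, the following minimal Python sketch sorts scalar preliminary estimates and applies recursive binary segmentation to the ordered sequence, treating each accepted break as a group boundary. It is a simplified sketch under stated assumptions, not the procedure developed in the paper: the function names (`best_split`, `segment`, `classify`), the restriction to a single scalar coefficient per individual, the sum-of-squared-residuals criterion, and the fixed `threshold` stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np

def best_split(x):
    """Return (gain, k) for the split x[:k] | x[k:] that most reduces the
    sum of squared residuals (SSR) around segment means; k is None if no
    split reduces the SSR."""
    total = np.sum((x - x.mean()) ** 2)
    best_gain, best_k = 0.0, None
    for k in range(1, len(x)):
        left, right = x[:k], x[k:]
        ssr = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        gain = total - ssr
        if gain > best_gain:
            best_gain, best_k = gain, k
    return best_gain, best_k

def segment(x, lo, hi, threshold, breaks):
    """Recursively split the sorted segment x[lo:hi], recording break points."""
    if hi - lo < 2:
        return
    gain, k = best_split(x[lo:hi])
    # Hypothetical stopping rule: accept a break only if the SSR reduction
    # exceeds a user-chosen threshold.
    if k is None or gain <= threshold:
        return
    breaks.append(lo + k)
    segment(x, lo, lo + k, threshold, breaks)
    segment(x, lo + k, hi, threshold, breaks)

def classify(beta_hat, threshold):
    """Assign a group label to each individual from its preliminary estimate."""
    order = np.argsort(beta_hat, kind="stable")   # order statistics
    sorted_beta = beta_hat[order]
    breaks = []
    segment(sorted_beta, 0, len(sorted_beta), threshold, breaks)
    edges = np.array(sorted(breaks), dtype=int)
    # Label of a sorted position = number of break points at or before it.
    labels_sorted = np.searchsorted(edges, np.arange(len(sorted_beta)), side="right")
    labels = np.empty(len(beta_hat), dtype=int)
    labels[order] = labels_sorted                 # map back to original order
    return labels

# Toy preliminary estimates clustered around -1 and 1 (made-up numbers).
beta_hat = np.array([1.02, -0.98, 0.97, -1.05, 1.01, -1.00])
labels = classify(beta_hat, threshold=0.5)
print(labels)
```

In this toy example the estimates cluster around two values, so a single break is accepted and two groups are recovered. In practice the stopping rule would need to account for the estimation error in the preliminary estimates, which this sketch ignores.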