A persistent issue confounding many empirical legal studies that use count data is the "zeros problem," which arises especially when the count data require a log transformation, since the log of zero is undefined. Debate over how best to deal with this problem persists. Popular approaches include standard OLS regression models that add one to each count before taking the log, so that cases with a zero count are not lost in the transformation, and Poisson regression models.
While empirical assessments of these two approaches are relatively rare, a recent paper compares them by replicating five recent corporate patenting papers. In Count Data in Finance, Jonathan Cohn (Texas--Business) et al. find that one OLS regression approach (LOG1PLUS) "using log-transformed outcomes can produce biased and incorrectly signed estimates of economic relationships and provide guidance for future research in finance-related applications involving zero-bounded count data." By comparison, while the Poisson model estimates may "lose efficiency if the model’s conditional mean-variance equality restriction is not satisfied in the data, they remain unbiased and consistent as long as the standard conditional mean independence assumption holds." The paper's abstract follows.
"This paper examines the use of count data-based outcome variables such as corporate patents in empirical corporate finance research. We demonstrate that the common practice of regressing the log of one plus the count on covariates (“LOG1PLUS” regression) produces biased and inconsistent estimates of objects of interest and lacks meaningful interpretation. Poisson regressions have simple interpretations and produce unbiased and consistent estimates under standard exogeneity assumptions, though they lose efficiency if the count data is overdispersed. Replicating several recent papers on corporate patenting, we find that LOG1PLUS and Poisson regressions frequently produce meaningfully different estimates and that bias in LOG1PLUS regressions is likely large."
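For readers who want to see the contrast for themselves, here is a minimal simulation sketch. It is not taken from the paper: the data-generating process, the parameter values (`true_b0`, `true_b1`), the sample size, and the helper names `fit_log1plus` and `fit_poisson` are all illustrative assumptions. It draws counts whose conditional mean is exp(b0 + b1·x), with a mean low enough that zeros are common, and then fits both a LOG1PLUS regression and a Poisson regression (the latter by a hand-rolled Newton's method rather than a statistics package).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical data-generating process (an assumption, not from the paper):
# E[y | x] = exp(b0 + b1 * x), with a low mean so that zeros are common.
n = 20_000
true_b0, true_b1 = -1.0, 0.5
x = rng.normal(size=n)
y = rng.poisson(np.exp(true_b0 + true_b1 * x))

X = np.column_stack([np.ones(n), x])  # design matrix with an intercept


def fit_log1plus(X, y):
    """OLS of log(1 + y) on X, i.e. the LOG1PLUS approach."""
    coef, *_ = np.linalg.lstsq(X, np.log1p(y), rcond=None)
    return coef


def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Poisson regression with a log link, fit by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)             # fitted conditional means
        grad = X.T @ (y - mu)             # score of the Poisson log-likelihood
        hess = X.T @ (X * mu[:, None])    # negative Hessian (information matrix)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


log1plus_coef = fit_log1plus(X, y)
poisson_coef = fit_poisson(X, y)
share_zeros = np.mean(y == 0)

print(f"share of zero counts: {share_zeros:.2f}")
print(f"true slope:           {true_b1:.3f}")
print(f"LOG1PLUS slope:       {log1plus_coef[1]:.3f}")
print(f"Poisson slope:        {poisson_coef[1]:.3f}")
```

Under these assumed settings the Poisson slope should land close to the true value while the LOG1PLUS slope should fall well short of it. The size of that gap depends on the chosen parameters, so the sketch illustrates the mechanism rather than the magnitudes reported in the paper's replications.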