The presence of "too many zeros" is a common challenge in empirical legal research. For example, most cases are never appealed, and many civil trials end in an outcome (e.g., a finding of no liability) that generates zero damages. As a result, the distributions of outcome variables of interest are often skewed, with a large mass at zero, and this skew warrants attention.
How to address this common feature of empirical legal data was the subject of one of Ted Eisenberg's (1947-2014) final scholarly papers (co-written with his son, Thomas, and with Marty Wells and Min Zhang). "Addressing the Zeros Problem: Regression Models for Outcomes with a Large Proportion of Zeros, with an Application to Trial Outcomes," recently published in JELS (12:1, Mar. 2015), reviews various empirical strategies and compares results from Tobit, Heckman selection, and two-part models. An excerpted abstract follows.
"Tobit models are often applied to deal with the excess number of zeros, but these are more appropriate in cases of true censoring (e.g., when all negative values are recorded as zeros) and less appropriate when zeros are in fact often observed as the amount awarded. Heckman selection models are another methodology that is applied in this setting, yet they were developed for potential outcomes rather than actual ones. Two‐part models account for actual outcomes and avoid the collinearity problems that often attend selection models. A two‐part hierarchical model is developed here that accounts for both the skewed, zero‐inflated nature of damages data and the fact that punitive damage awards may be correlated within case type, jurisdiction, or time. Inference is conducted using a Markov chain Monte Carlo sampling scheme. Tobit models, selection models, and two‐part models are fit to two punitive damage awards data sets and the results are compared. We illustrate that the nonsignificance of coefficients in a selection model can be a consequence of collinearity, whereas that does not occur with two‐part models."
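For readers who want a feel for the two-part idea, here is a minimal sketch in plain Python. It is not the authors' model: the paper develops a hierarchical two-part model fit by Markov chain Monte Carlo, whereas this illustration fits a simple, non-hierarchical version by maximum likelihood and least squares. The data are simulated, and the single covariate `x`, the coefficient values, and the function names are all hypothetical. Part one models whether any award is made (a logistic regression on all cases); part two models the log of the award using only the cases with a positive award.

```python
import math
import random


def fit_logistic(x, z, iters=25):
    """Fit P(z=1 | x) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, zi in zip(x, z):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += zi - p              # gradient, intercept
            g1 += (zi - p) * xi       # gradient, slope
            h00 += w                  # 2x2 information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def fit_ols(x, y):
    """Simple least-squares line y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def fit_two_part(x, y):
    """Part 1: any award? (all cases). Part 2: log award (positive cases only)."""
    z = [1 if yi > 0 else 0 for yi in y]
    pos = [(xi, math.log(yi)) for xi, yi in zip(x, y) if yi > 0]
    return {
        "logit": fit_logistic(x, z),
        "log_ols": fit_ols([p[0] for p in pos], [p[1] for p in pos]),
    }


# Simulated data (hypothetical): many true zeros, skewed positive awards.
rng = random.Random(0)
x, y = [], []
for _ in range(2000):
    xi = rng.uniform(-2, 2)
    p_award = 1.0 / (1.0 + math.exp(-(-1.0 + 1.5 * xi)))
    if rng.random() < p_award:
        y.append(math.exp(10 + 0.8 * xi + rng.gauss(0, 1)))
    else:
        y.append(0.0)
    x.append(xi)

fit = fit_two_part(x, y)
print(fit)  # (intercept, slope) for each part
```

Because the zeros here are actual outcomes rather than censored values, each part is estimated on the data it describes, and the expected award for a case combines the two: the probability of any award times the expected award given that one is made.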