At the recent American Law and Economics Association Annual Meeting held in Berkeley, John Donohue and Justin Wolfers presented a compelling analysis of the deterrence effect of the death penalty -- or rather, the lack of evidence thereof. Comparing execution rates with homicide rates, as well as using the natural experiment of the Furman abolition period, and contrasting U.S. trends to Canadian trends, Donohue and Wolfers cast considerable doubt on whether the death penalty has any deterrence effect. But Donohue and Wolfers look only at whether the death penalty deters crime, not whether the death penalty affects other criminal law matters, such as encouraging defendants in murder trials to accept plea bargains with harsher terms than they otherwise would, as Ilyana Kuziemko recently showed.
But more interesting than Donohue and Wolfers' substantive case -- they ultimately conclude that it is entirely unclear whether the death penalty causes more or fewer murders -- is their methodology. They replicate the analyses of a handful of central studies that report a deterrence effect, and run a variety of robustness checks on each. As such, their paper provides a comprehensive re-examination of the primary data on the deterrence effect of the death penalty on crime.
The results they provide are startling and concerning. Donohue and Wolfers suggest that many studies report only those results produced by the most supportive robustness checks. Running a range of robustness checks on each study shows that the results vary wildly. Similarly, there appears to be a self-serving selectivity in the time periods examined: the considerable variance across historical phases often leads to opposing results.
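To make the point about time-period selectivity concrete, here is a minimal sketch in Python. It uses entirely synthetic numbers (the variable names `executions` and `homicide_rate` and the window boundaries are assumptions for illustration, not data or specifications from any of the studies discussed). It fits the same simple regression over three sample windows and shows that the estimated slope can flip sign depending on which window is reported.

```python
# Illustrative only: synthetic numbers, not data from any study discussed here.

def ols_slope(xs, ys):
    """Return the slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical panel of 40 "years": x is an execution measure, y a homicide rate.
# The relationship is constructed to be positive early and negative late.
years = list(range(40))
executions = [year % 5 for year in years]
homicide_rate = [
    10 + 0.5 * x if year < 20 else 10 - 0.5 * x
    for year, x in zip(years, executions)
]

# Same model, three different (hypothetical) sample windows.
windows = {"early": (0, 20), "late": (20, 40), "full": (0, 40)}
slopes = {
    name: ols_slope(executions[start:end], homicide_rate[start:end])
    for name, (start, end) in windows.items()
}

for name, slope in slopes.items():
    print(f"{name:>5} window: estimated slope = {slope:+.2f}")
```

A researcher who reported only the "late" window would claim deterrence, one who reported only the "early" window would claim the opposite, and the full sample shows no relationship at all. Real studies involve far richer models, but the sensitivity to the chosen window works the same way.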
Some of these results are unsurprising. The great difficulty for any empirical test of the deterrence effect of the death penalty is that any such work necessarily relies on a data set made up of a very strange collection of cases: those few cases that reach completion -- that is, result in an execution -- despite the labyrinthine Supreme Court jurisprudence on the topic. The data set is necessarily skewed to begin with.
However, Donohue and Wolfers go further: their results suggest that some of the authors they analyze may have introduced distortions of their own into the empirics. Some studies, once corrected for coding errors, collinearity, etc., actually showed the opposite of the result claimed.
Putting aside the controversies concerning any intent to mislead, this comprehensive replication of others’ data is a largely new addition to empirical legal studies. The approach is taken from medical fields, where such aggregation and re-examination of prior findings is more common. It is very time-intensive, but in medicine the greater dangers of spurious conclusions make it necessary.
This approach may have come late to the law because of the somewhat lower stakes, but it is an exciting new development. The one unintended downside is that scholars who remain antagonistic to the introduction of empirical scholarship into the legal field have pounced on this showing of the apparent malleability of statistical analysis to reject the empirical endeavor altogether. Unfortunately, this effect may be difficult to overcome, given that there is unlikely to be a publishing market for these comprehensive studies when the results merely confirm the pre-existing orthodoxy. This is not to discourage the practice, merely to note that there are dangers that may result, beyond some bruised academic egos.

I think Gary King is right; we ought to either publish the exact probabilities for estimates or their confidence intervals and ditch the whole idea of conventional probability standards. This is especially the case for the many, many studies that are not based on datasets drawn from a probability sample to begin with. For them, reporting confidence intervals based on bootstrapped estimates would be much better.
And we'll probably get that right after the Revolution and not before. The difficulty seems to be that journal editorial boards are (usually) reluctant to accept papers using these methods, and authors are reluctant to use them as a result. I find myself faced with this dilemma fairly often and I almost always end up opting for - yep - conventional standards and conventional probability estimates. OK, ok, I hear you: physician, heal thyself! But that dog won't hunt when you want to publish in an increasingly competitive environment; why give the reviewers an excuse for ditching your work? I think what we need is a commitment by editors to actually prefer more exact standards.
Posted by: Tracy Lightcap | May 17, 2006 at 01:14 PM
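For readers unfamiliar with the bootstrapped confidence intervals suggested in the comment above, the following is a minimal sketch of a percentile bootstrap in Python. The function name `bootstrap_ci`, its parameters, and the sample values are all assumptions made for illustration; this is the basic percentile method, not the procedure of any particular study or author mentioned here.

```python
import random

def bootstrap_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data).

    Resamples the data with replacement n_resamples times, recomputes the
    statistic each time, and returns the alpha/2 and 1 - alpha/2 percentiles
    of the resulting estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]

def mean(values):
    return sum(values) / len(values)

# Hypothetical observations, invented purely for illustration.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 2.5]
lower, upper = bootstrap_ci(sample, mean)
print(f"sample mean = {mean(sample):.2f}")
print(f"95% bootstrap interval = [{lower:.2f}, {upper:.2f}]")
```

The interval is reported directly, with no appeal to a .05 cutoff, which is the spirit of the suggestion. The resampling still assumes the observations are exchangeable, so it is not a cure for a badly skewed data set.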
The .05 standard is purely a product of convention; there is nothing objective about .05 being the standard, rather than, say, .01 or .10. Most people agree that you have to draw the line somewhere, and .05 strikes a balance between avoiding type 1 and type 2 errors. But the suggestion here is that perhaps you do not need to draw the line: let the market determine for itself which articles get cited. I suspect that the .05 standard is now so entrenched that it would act as a powerful focal point anyway -- most academics would still value only those works that satisfy the .05 standard.
But the suggestion made by William of relying more on working papers, and forgetting about formal publication, has benefits beyond escaping the external imposition of the artificial .05 standard, once again by letting the market determine the value of an academic contribution. This may be particularly apt for the law field, where almost everyone has a story of woe about their experiences with student editors. Certainly economics seems to be heading in that direction: papers are still published in final form, but most of the action happens when working papers are posted. But perhaps economists just have particularly high discount rates, and less patience for the lengthy publication period.
Posted by: Tonja Jacobi | May 16, 2006 at 03:38 PM
Tonja,
This is a very nice post. It relates to a post by Bill Ford on the de minimis market for studies that don't hit the conventional .05 p-value. See http://www.elsblog.org/the_empirical_legal_studi/2006/05/journal_of_spur.html
What inference do we draw from (a) the unwillingness of journals to publish studies that do not hit 5%, and (b) the reality that many published studies may not stand up very well to rigorous robustness tests?
I almost think we would be better off just publishing everything as working papers. The brass ring would then be building knowledge rather than obtaining a good placement.
Posted by: William Henderson | May 15, 2006 at 10:07 PM
The first article on the death penalty by Professors Donohue and Wolfers was published by the Stanford Law Review in response to Professors Cass Sunstein and Adrian Vermeule's article on the death penalty. The issue also includes an article by Carol Steiker and a response by Sunstein and Vermeule:
Cass R. Sunstein & Adrian Vermeule, Is Capital Punishment Morally Required? Acts, Omissions, and Life-Life Tradeoffs, 58 STAN. L. REV. 703 (2005), available at http://lawreview.stanford.edu/content/issue3/sunstein1.pdf.
Carol S. Steiker, No, Capital Punishment Is Not Morally Required: Deterrence, Deontology, and the Death Penalty, 58 STAN. L. REV. 751 (2005), available at http://lawreview.stanford.edu/content/issue3/steiker.pdf.
John J. Donohue & Justin Wolfers, Uses and Abuses of Empirical Evidence in the Death Penalty Debate, 58 STAN. L. REV. 791 (2005), available at http://lawreview.stanford.edu/content/issue3/donohue.pdf.
Cass R. Sunstein & Adrian Vermeule, Deterring Murder: A Reply, 58 STAN. L. REV. 847 (2005), available at http://lawreview.stanford.edu/content/issue3/sunstein2.pdf.
Posted by: Stanford Law Review | May 15, 2006 at 09:38 PM