Over at PrawfsBlawg, John Pfaff (Fordham) provides a cautionary reminder that most empiricists cannot hear often enough. Redefining key variables can influence results, and redefined variables are frequently difficult to detect, particularly in large, complex, longitudinal datasets. Too often, secondary analyses are undertaken without the necessary due diligence on the underlying data. Among Pfaff's takeaways:
"If nothing else, this is a strong warning against casually running empirical models, a growing problem in legal scholarship. Legal academics shouldn’t just get their IT departments to install Stata on their computers, download some data, and then start running some regressions. It can take years to fully understand what a dataset looks like, what it is really measuring, its strengths and weaknesses. People who just run some quick regressions and then send them off to a law review are likely moving knowledge backwards, not forwards, since the risk of bad results is too great."