One wonderful thing about this blog is the opportunity to learn about the various methodological approaches available for data analysis. The workshops on methodologies and statistics (e.g., Northwestern's) are enormously helpful, as are the references for teaching and catching up on statistics. Experimental work is very important, and relatively uncommon approaches such as MDS, network analysis, propensity score analysis, and others can give good insight into what is going on with both conventional and unconventional data.
But I'd like to highlight one approach that cuts across many of these, and in my view is one of the most important contributions ELS can make. I'm a big fan of meta-analysis--of the quantitative synthesis of existing research studies--and advocate its use by ELS scholars, courts, agencies, practitioners, ice cream truck vendors, etc.
I've made some of my case for meta-analysis in a piece coming out in Temple Law Review, so I will be brief here. First, descriptively, meta-analysis is just that--analysis one level up. That is, primary research uses individual units (persons, cases, courts, etc.) as the unit of analysis. The individual units that meta-analysis involves, though, are the empirical studies themselves from a particular body of research. Synthesizing the results of each study, the goals of meta-analysis are (1) to identify the presence or absence of an effect in an existing empirical literature; (2) to evaluate the strength of that effect, for instance by summarizing the average effect across a set or subset of studies; and (3) to identify moderator variables--elements of the various studies that might have reliably affected their outcomes. Thus, a good meta-analysis will identify and synthesize every study in a research area; find the average effect across all those studies in order to summarize the current state of knowledge; and then systematically compare and contrast across studies, in order to identify methodological and substantive aspects of the various studies in a discipline that might be associated with the effect sizes they report.
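To make goals (1) and (2) concrete, here is a minimal sketch in Python of the standard fixed-effect approach--the studies and numbers are invented for illustration, and I assume each study reports a standardized effect size and its sampling variance:

```python
import math

# Hypothetical studies: (effect size d, sampling variance of d) -- invented numbers
studies = [(0.42, 0.04), (0.15, 0.02), (0.30, 0.05), (0.51, 0.03)]

# Goals (1) and (2): inverse-variance weighted mean effect across studies
weights = [1.0 / v for _, v in studies]
mean_effect = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))  # standard error of the summary effect
z = mean_effect / se                # is the pooled effect reliably nonzero?

print(f"Summary effect: {mean_effect:.3f} (SE {se:.3f}, z = {z:.2f})")
```

Goal (3) would then ask whether coded features of the studies correlate with the effect sizes they report.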
Meta-analysis has a number of benefits, especially relative to the traditional narrative lit review: it is more comprehensive; it better avoids subjective judgments about what to include in a review; it is better at identifying "small" effects; it avoids over-emphasis on misleading statistical significance and p-values (focusing on effect sizes instead); it emphasizes moderator variables; and, in my view, it gives a better grounding for policy inferences.
Meta-analyses, unfortunately, are still rare in law reviews (Chris Guthrie and Dan Orr had one recently, but few others appear; they are cited sometimes, though). They are used in court, usually in mass tort cases, but are often misunderstood by courts and experts. My hope is that the use of meta-analyses in ELS will increase--they serve well to summarize a body of research; they allow (or force) researchers and policy-makers to speak a "common language" by making cherry-picking from existing studies more difficult; and, by identifying moderator variables, they can prompt further research questions.
Indeed, there's more that we agree about than disagree about. One small note: you mention that it's better to use effect sizes instead of p-values--but wouldn't you agree that it's exceedingly rare for original research to report effect sizes (irrespective of the supposed requirement by many professional organizations, such as the APA, that effect sizes be reported)?
Posted by: Steve | 01 June 2007 at 02:05 PM
I think we agree about more than is easily conveyed over blog postings. A few more thoughts:
First, although it’s true that more data can more easily lead to spurious findings of statistical significance, that’s exactly the reason MA focuses (should focus) on effect sizes rather than p-values. I’d go so far as to say I’m not sure combining p-values in a MA is useful, because, as you say, it’s so easy to come up with overall significant results across a large pool of studies. Rather, we should work with the effect sizes from those studies.
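A toy illustration of the danger (invented numbers, using Fisher's method, one standard way of combining p-values): thirty individually weak studies can produce a "significant" combined p even though the effect behind them is trivially small.

```python
import math
from scipy.stats import chi2

# Thirty hypothetical studies, each individually non-significant (p = 0.20)
# and each reporting a trivially small effect (d = 0.05) -- invented numbers
pvals = [0.20] * 30
effects = [0.05] * 30

# Fisher's method: -2 * sum(ln p) follows a chi-square with 2k degrees of freedom
stat = -2 * sum(math.log(p) for p in pvals)
combined_p = chi2.sf(stat, df=2 * len(pvals))

print(f"Combined p-value: {combined_p:.4f}")                   # "significant" overall
print(f"Mean effect size: {sum(effects) / len(effects):.2f}")  # but substantively tiny
```

The combined p comes out well below .05, yet no sensible reading of a d of 0.05 treats it as an important effect--which is why the effect sizes, not the p-values, should carry the analysis.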
Second, what I meant about weighting is allowing certain studies to “count” more when, for instance, evaluating the average effect--based on features identified a priori. So, as a facile instance, if a meta-analyst is working with experimental studies, random assignment to condition is crucial. A study that used it would be rated high on that feature of study quality; one that didn’t would be rated low. Those studies that did use random assignment would be weighted more heavily in subsequent analyses—calculating a summary statistic in particular. The criteria for evaluating such quality should be made explicit, and, I think, it’s up to the “consumer” to assess whether the meta-analyst was fair. I actually think both weighted and unweighted analyses should be provided, and, ideally, not only should the studies a meta-analyst used be identified, but an Appendix or other source could be provided that includes the data used.
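Here is a minimal sketch of that kind of weighting, with hypothetical effect sizes and quality ratings (a real analysis would typically combine quality weights with inverse-variance weights):

```python
# Hypothetical studies: (effect size, a priori quality rating) -- invented values;
# e.g., a study rates 1.0 if it used random assignment, 0.5 if it did not
studies = [(0.40, 1.0), (0.10, 0.5), (0.35, 1.0), (0.60, 0.5)]

unweighted = sum(d for d, _ in studies) / len(studies)
weighted = sum(d * q for d, q in studies) / sum(q for _, q in studies)

# Report both, as suggested above, so the "consumer" can judge the weighting's impact
print(f"Unweighted mean effect:       {unweighted:.3f}")
print(f"Quality-weighted mean effect: {weighted:.3f}")
```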
And I think you weren’t suggesting that the MA process is unethical, just the potential for it to be so used. This is, of course, a problem in all empirical (and even non-empirical) research. The transparency of providing objective inclusion and quality criteria, as well as even providing the data, IMHO, does as much as possible to address this.
Last, I do think a MA gives a better sense of what is known than individual studies. More important, I think, is my “common language” and “cherry-picking” point—if researchers (courts, agencies, practitioners) work from the same set of studies, it can be easier to avoid focusing on only the studies that help one side or the other.
I appreciate your helping me clarify my thoughts here.
Posted by: Jeremy A. Blumenthal | 01 June 2007 at 09:12 AM
Great post, Jeremy. Before I respond, I just want to make it clear that I'm not trashing MA--I think they can be great studies, but there are limitations.
First, I'm not sure more data is always better. As you know, the more data one has, the more likely it is that a significant relationship will be found BY CHANCE. In the world of empirical research, quality is better than quantity. I've never heard of an MA using weighted variables to let "better" studies count for more, quantifying their quality. Perhaps I've misunderstood your point. Nonetheless, that still gives the author a lot of leeway in determining what is good vs. bad quality.
And along that vein, I wasn't suggesting that authors are unethical when using MA to get results (although the authors of the NEJM article clearly were); rather, I was pointing out that a crucial step in conducting an MA is that the authors must make a decision about what the inclusion criteria are--and this is always problematic because no criteria are perfect. Of course, no empirical study is perfect either, but since few people actually read and critically think about the inclusion criteria, MA has the risk of presenting an issue as settled when it may not be. Just my 2 cents.
Posted by: Steve | 01 June 2007 at 08:48 AM
These are good points, and certainly a couple of the more oft-levied concerns about MA. One facile response--though, I think, nevertheless accurate--is that these problems are even worse with traditional narrative reviews. More narrowly, though:
On the GIGO concern: first, I think that as a general proposition when reviewing the state of research, having more data is better than having less, for several reasons: increasing power in the MA's subsequent analyses; reporting most accurately what is known in an area; not "wasting" data; avoiding (in part) criticisms of selection bias. . . . Second, I'm not sure I agree that a MA can't control for poor methodology, in this sense: again, an advantage of MA is the opportunity to look at what variables of the primary studies are correlated with their outcomes. A meta-analyst can quantify study quality as one factor that might affect the results. Finding that there is a relationship allows quantification of that influence, and also gives justification for weighting higher-quality studies more heavily in subsequent analyses. (Of course, finding that there is NO relationship between study quality and observed effect size is helpful as well; there is some evidence that in the typical MA, there is little such relationship.)
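For instance, here is a sketch of treating quality as a moderator--the quality scores and effect sizes are invented, and a fuller analysis would use meta-regression:

```python
from scipy.stats import pearsonr

# Hypothetical per-study data: checklist quality score (0-10) and effect size
quality = [3, 5, 8, 9, 4, 7, 10, 6]
effect = [0.55, 0.40, 0.22, 0.20, 0.48, 0.30, 0.18, 0.35]

# Is study quality associated with the size of the reported effect?
r, p = pearsonr(quality, effect)
print(f"quality/effect correlation: r = {r:.2f}, p = {p:.3f}")
# A strong negative r suggests weaker studies inflate the effect, justifying
# heavier weighting of high-quality studies; r near zero is reassuring.
```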
That approach gets at the second issue you raise, the selection concern. Although I understand your point about authors' ethics--and that point applies to primary researchers and meta-analysts alike--dealing with such behavior is not limited to MA; any secondary user must decide how far to go in evaluating whether primary research (empirical or not) was conducted appropriately. As to your substantive point about variation in study quality and approaches, though, I would respond as above: encourage a meta-analyst to use study quality as a moderator variable.
How? There are at least two ways. One traditional approach is for coders to evaluate a study's methodology and weight each study by that rating. This has been criticized as introducing overly subjective factors, though I think that having multiple coders and reliability checks addresses that. A more objective approach is to develop an a priori checklist of what makes a good study (e.g., random assignment in an experiment), then rate the studies on each checklist feature. In both cases we can see whether the features correlate with the study's outcome.
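A sketch of the checklist approach, with invented studies and features, might look like this:

```python
# A priori checklist (1 = feature present, 0 = absent) for six hypothetical studies
# Features: random assignment, blinded coding, validated outcome measure
checklist = [(1, 1, 1), (1, 0, 1), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 0, 0)]
effects = [0.25, 0.30, 0.50, 0.45, 0.28, 0.55]
features = ["random assignment", "blinded coding", "validated measure"]

# For each checklist feature, compare mean effects in studies with vs. without it
for j, name in enumerate(features):
    with_f = [e for row, e in zip(checklist, effects) if row[j] == 1]
    without_f = [e for row, e in zip(checklist, effects) if row[j] == 0]
    print(f"{name}: mean d with = {sum(with_f) / len(with_f):.2f}, "
          f"without = {sum(without_f) / len(without_f):.2f}")
```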
Of course, in any such approach--and, when reporting what studies were included or excluded or any such decision or judgment a meta-analyst makes--the author should report the basis for those ratings and decisions.
Thanks for raising these issues--I don't mean that MA is a panacea, but I do think it's a very valuable but under-used methodological tool.
Posted by: Jeremy A. Blumenthal | 31 May 2007 at 02:12 PM
Prof. Blumenthal,
I'd be curious about your opinions of the limitations of meta-analysis. As you probably know, the prevailing criticisms of this method center on two issues.
First, the results of meta-analysis are only as good as the quality of the original research. The notion of "garbage in, garbage out" is a noted weakness of this method. Since meta-analysis takes reported outcome values from original research to arrive at an effect size, it's quite easy for a meta-analysis to present inflated or invalid estimates. An original research article may do a poor job with its own methodology but arrive at impressive results. A subsequent meta-analysis has no way of controlling for the poor methodology from the original study.
Second, another major criticism revolves around inclusion criteria. How does one decide which studies to include? Most studies have substantial disparities in terms of sample characteristics, analysis, and variable construct validity. Authors of meta-analysis articles have wide discretion in determining which studies to include and which to exclude, and such decisions can have profound effects on the resulting effect sizes. Just look at the Vioxx article in the New England J. of Medicine. It later came out that the authors of that study removed data in order for the effect size to reach statistical significance (discovered through embedded data in the Word document). These seem like major limitations for meta-analysis.
Posted by: Steve | 30 May 2007 at 07:02 PM