One of the interesting questions of empirical analysis is whether to approach a research question from a macro-level or a micro-level perspective. Should we aggregate data, ignore relatively small differences, and look for the big picture? Or should we look at the individual level, take account of particulars, and try to leverage the differences between cases in order to increase the sophistication of our explanations?
The approach one takes, I suspect, has a lot to do with the type of question one hopes to address, but I’m particularly interested in the tradeoffs that exist between the two for those of us who are interested in formulating an empirical view of law and courts. Is one approach inherently preferable for what we study? Does one approach make quantification more problematic?
My impression is that those of us who do empirical work on the law have, on balance, been lumpers more often than splitters. Certainly my fellow political scientists do lots of convenient aggregating of data -- across judges, courts, states, and years, for example -- to test various notions about the intersection between law and politics. As a consequence, I think we have some very good evidence relating to, among other things, judicial decision making and the relationship between courts and other political actors.
In treating courts as political institutions, however, we are sometimes criticized for failing to consider more specific legal factors, such as doctrinal tests, legal customs, methods of statutory interpretation, and other variables that are regarded as significant for explaining what courts do. The consistency of these criticisms should, I think, give us some pause.
Given the growth of empirical legal scholarship among people with fairly diverse training and orientations, I am interested to see how both lumping and splitting will be brought to bear to offer some new insights. For instance, I can imagine that splitters, who see the need to test the kinds of individual differences that lumpers often ignore, will end up devising refined explanatory models with which the lumpers will have to come to terms. Indeed, an interesting body of work is already developing along those lines.
Obviously, I’ve oversimplified things a good deal, but my point, as I start off this week, is to ask others to think about which analytic approach will yield the biggest benefits and when. Which questions of law and courts are better handled from the macro-perspective, and which are more profitably addressed at the micro-level?
My thanks to Jason Czarnezki for inviting me to serve as a guest-blogger. I look forward to participating throughout the week.
May I suggest that we not choose between lumpers and splitters, and that we recognize instead that all forms of social science research, done with reasonable care, are likely to cast some light on judicial phenomena, and that no method in isolation is likely to resolve all the serious questions we ask about courts and law. The decision to be a lumper or a splitter, I suspect, is influenced as much by predisposition (I was always more interested in history than math) and graduate training as by any neutral analysis of which method explains the most (the attitudinal model of scholarly dispositions will work very well here). As long as we have lots of lumpers and splitters doing good work, I suspect our knowledge of law and courts will progressively improve.
Posted by: Mark A. Graber | 12 September 2006 at 02:14 PM
Kevin, you might want to look at my SSRN paper (if you haven't already).
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=922183
The problem with lumping is that the game of inference becomes too hazardous. For example, you can change the dependent-variable data significantly in an ideology model and still obtain statistically significant, seemingly robust results. You can make the values for liberalism polarized or squished, and the "lumped" model still seems to "work." Yet subtract the 10 most extreme of the lumped values, and you lose statistical significance completely. Lumping therefore seems to introduce instability into the models and, I think, fools researchers and their audiences into thinking they have better results than they really do.
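The kind of fragility described here can be sketched with a small simulation. This is purely illustrative and not drawn from the SSRN paper: the "ideology" and "outcome" variables, the group sizes, and the injected extreme cases are all hypothetical, constructed so that a handful of extreme values carries the entire apparent relationship.

```python
import numpy as np

rng = np.random.default_rng(42)

# 90 ordinary cases: outcome unrelated to ideology (hypothetical data)
ideology = rng.normal(0, 1, 90)
outcome = rng.normal(0, 1, 90)

# 10 extreme cases where the outcome tracks ideology strongly
ideo_ext = np.array([-3.0, -2.5, -2.0, 2.0, 2.5, 3.0, -2.2, 2.2, -2.8, 2.8])
out_ext = 3 * ideo_ext + rng.normal(0, 0.5, 10)

x = np.concatenate([ideology, ideo_ext])
y = np.concatenate([outcome, out_ext])

def t_stat(x, y):
    """OLS slope t-statistic for the bivariate regression of y on x."""
    x_c, y_c = x - x.mean(), y - y.mean()
    slope = (x_c @ y_c) / (x_c @ x_c)
    resid = y_c - slope * x_c
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (x_c @ x_c))
    return slope / se

t_full = t_stat(x, y)

# Drop the 10 most extreme outcome values and refit
keep = np.argsort(np.abs(y - y.mean()))[:-10]
t_trim = t_stat(x[keep], y[keep])

print(f"t, full sample:          {t_full:.2f}")
print(f"t, 10 extremes removed:  {t_trim:.2f}")
```

The full-sample t-statistic comfortably clears conventional significance thresholds, while the trimmed sample shows no reliable relationship at all: ten observations out of a hundred were doing all the work, which is the instability the comment describes.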
All of this makes me very suspicious of regression on grouped aggregates. I don't know of good methodologists outside of political science who are so fond of grouped-aggregate models, especially when the data are readily available at the level at which they are observed. We all know the pitfalls and risks of ecological inference; it makes the job of inference tricky.
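The ecological-inference pitfall can be shown in a few lines. The three "courts" below are hypothetical, constructed so that the within-group (individual-level) relationship between x and y is negative while the group means line up positively; a regression run on the aggregates then gets the sign of the individual-level relationship backwards.

```python
import numpy as np

rng = np.random.default_rng(7)

# Three hypothetical courts with staggered group means
group_means = [(0.0, 0.0), (2.0, 2.0), (4.0, 4.0)]

xs, ys = [], []
for mx, my in group_means:
    x = mx + rng.normal(0, 1, 50)
    # Within each court, y falls as x rises (slope -0.8)
    y = my - 0.8 * (x - mx) + rng.normal(0, 0.2, 50)
    xs.append(x)
    ys.append(y)

def slope(x, y):
    """OLS slope for the bivariate regression of y on x."""
    x_c, y_c = x - x.mean(), y - y.mean()
    return (x_c @ y_c) / (x_c @ x_c)

# Average within-court slope vs. slope fit to the three court means
within = np.mean([slope(x, y) for x, y in zip(xs, ys)])
aggregate = slope(np.array([x.mean() for x in xs]),
                  np.array([y.mean() for y in ys]))

print(f"within-court slope:  {within:.2f}")
print(f"court-mean slope:    {aggregate:.2f}")
```

The aggregate regression reports a strong positive slope even though every individual-level relationship is negative, which is exactly why inference from grouped aggregates to the units that compose them is hazardous.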
Also, remember that aggregated models analyze only the index of aggregates the modeler has created, not the things that make up each aggregate. This gets to be tricky when interpreting fit. Once again, see my paper. (If you have already read it or will read it, feel free to email me about your thoughts. I do not think my assertions are in error.)
Regards ... and nice post!
Posted by: Sean Wilson | 11 September 2006 at 01:35 PM