Continuing my discussion about data-gathering and variable coding in empirical study of judges and court decisions, I want today to touch briefly, and necessarily incompletely, on a few variables that have proven “tricky” for researchers in recent studies. By “tricky,” I mean either that finding sources of information has been problematic or that translating raw data into an objectively measured factor that well estimates a variable is a particularly difficult and controversial task.
While every researcher could compile his or her own list of troublesome variables, I’ve selected only three to discuss today, based on the heightened attention given to them in academic debates and on my informal poll of colleagues around the country. Those three variables are: ideology, reputation of educational institutions, and judicial involvement at various stages of litigation.
Ideology:
Because it is central to so many studies about judges, and is linked to age-old and hotly debated questions about whether judges decide cases according to legal constraints and principles or according to attitudes and preferences, finding an objective and efficient measurement of judicial ideology is the Holy Grail of empirical study of judicial decisionmaking. Especially for those who study the lower federal courts, much ink has been spilled on this question. (For the take that my co-author Michael Heise and I offer on the question of ideology in the context of lower federal judges, see Gregory C. Sisk & Michael Heise, Judges and Ideology: Public and Academic Debates About Statistical Measures, 99 Nw. U. L. Rev. 743 (2005).)
Rather than address this subject at great length in this forum, I want here only to note the continuing debate and offer my general sense of where it stands at the moment. The traditional proxy for the judicial ideology of lower federal court judges has long been the party of the appointing President, a measure that admittedly was crude and failed to account for the ideological diversity within the cohort of judges appointed by any single President. Based upon past and continuing experience, this surrogate for judicial ideology remains a generally effective variable, at least where ideology is a secondary element of an otherwise fully specified model and, of course, where partisan aspects are a legitimate focus of the study. But it appears to be fading in use as better alternatives emerge.
The alternative measure of judicial policy preferences that has received the most attention among researchers, and has been regularly validated in studies, is the common-space scores approach adapted for the study of judicial behavior by Professors Micheal Giles, Virginia Hettinger, and Todd Peppers, political scientists at Emory University, the University of Connecticut, and Roanoke College, respectively. (See Micheal Giles, Virginia A. Hettinger & Todd Peppers, Picking Federal Judges: A Note on Policy and Partisan Selection Agendas, 54 Pol. Res. Q. 623 (2001); Micheal Giles, Virginia A. Hettinger & Todd Peppers, Measuring the Preferences of Federal Judges: Alternatives to Party of the Appointing President (2002).) This measure is based upon “common space” ideological scores derived for Senators and Presidents. A federal judge appointed to fill a vacancy in a given state is assigned the ideological score of the home-state Senator when that Senator is of the same party as the President (on the assumption that senatorial courtesy would apply), and otherwise is assigned the ideological score of the President. Others have employed variations on this common-space measure. (See Susan W. Johnson & Donald R. Songer, The Influence of Presidential Versus Home State Senatorial Preference on the Policy Output of Judges on the United States District Courts, 36 L. & Soc’y Rev. 657, 661–65 (2002).)
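For readers who want the assignment rule in concrete form, here is a minimal sketch in Python. The function name, data layout, and all scores are hypothetical, and published versions of the measure include refinements (for example, how to treat two same-party home-state Senators) that the cited papers spell out:

```python
def ghp_score(president_party, president_score, home_senators):
    """Assign a common-space ideology score to a newly appointed
    federal judge, following the assignment rule described above.
    home_senators: list of (party, score) tuples for the judge's
    home-state Senators. All names and figures are hypothetical."""
    # Senatorial courtesy: use the score of any home-state Senator
    # who shares the appointing President's party.
    same_party = [score for party, score in home_senators
                  if party == president_party]
    if same_party:
        # If both home-state Senators share the President's party,
        # averaging their scores is one common refinement.
        return sum(same_party) / len(same_party)
    # No same-party Senator: fall back to the President's own score.
    return president_score


# Example: Republican President (score 0.55) with one Republican
# home-state Senator (0.42) -- the Senator's score controls.
print(ghp_score("R", 0.55, [("R", 0.42), ("D", -0.31)]))  # 0.42
```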
In my view, and based on my reading of the literature, the common-space score approach is very much in the ascendancy within the academy and is likely to become the standard proxy for judicial ideology in empirical research on the courts. At present, the primary focus appears to be less on the validity and efficiency of the measure than on regular updating and refinement of that basic measure. Micheal Giles has been most generous in sharing the data and keeping interested researchers apprised of developments, which not only has been of great assistance to researchers but has also encouraged the increased use of this alternative in ongoing empirical work.
Reputation of Educational Institutions:
The influence of educational background on judges traditionally has focused upon legal education. For example, researchers regularly have hypothesized that federal judges who graduated from elite law schools are more likely to reach a liberal outcome, a hypothesis that James Brudney and Corey Ditslear have described as “consistent with perceptions of elite law school faculties—and their graduates—as ideologically liberal and inclined to favor government regulation.” James J. Brudney & Corey Ditslear, Designated Diffidence: District Court Judges on the Court of Appeals, 35 Law & Soc’y Rev. 565, 598 (2001). Rankings of law schools have been a growth industry—initiated by the development of the notorious U.S. News & World Report annual ranking and continued by competing rankings, such as measures of the most prolific or most cited faculty. Thus, the problem here is not in finding information but in wisely selecting the most efficient measure among them. Moreover, because all law schools are included in the typical ranking, each source provides a comparative measure of every school within the entire universe.
By contrast, measures of prestige or reputation of undergraduate educational institutions are harder to find and less comparable when they are discovered. Although U.S. News also ranks other educational institutions, they are divided into separate categories and are not readily comparable across categories.
Following the lead of James Brudney, Sara Schiavoni, and Deborah Merritt, my collaborators and I have previously adopted the selectivity scale for undergraduate institutions calculated by educational sociologist Alexander Astin. See James J. Brudney, Sara Schiavoni & Deborah J. Merritt, Judicial Hostility Toward Labor Unions? Applying the Social Background Model to a Celebrated Concern, 60 Ohio St. L.J. 1675, 1696, 1700 (1999); Alexander W. Astin, Who Goes Where to College? 57–83 (1965). For purposes of many studies, the Astin scale, dated as it is from 1962, has been contemporaneous or nearly so with the period during which federal judges on the bench in the mid-1980s to mid-1990s would have received an undergraduate education. Moreover, continued adherence to the Astin scale could be justified on the ground that the selectivity or prestige of a college or university is likely to change only slowly over time. But the shelf-life of the Astin scale is clearly nearing an end, if it has not already reached it.
I would welcome thoughts from other researchers on how measurements or proxies for the reputation or exclusiveness of undergraduate institutions might be improved or updated.
Judicial Involvement at Various Stages of Litigation:
From those who study trial courts in particular, I have learned that it is difficult to find and systematically summarize information about how judicial time and attention are focused, and what effects such judicial actions may have, at various stages of litigation.
As researchers, we have easy (and, for the federal courts, fairly complete) access to information about the beginning and ending stages of trial litigation, but much less data about what happens along the way. At one end of the process, we know how many cases have been filed, together with general information about their subject matter and the kinds of parties involved. At the other end, we have information about dispositions, that is, the outcomes of cases. Aside from anecdotal information gleaned from conversations and interviews with judges, we know much less in a systematic way about the roles that judges play along the road to the eventual outcome of a lawsuit.
For example, we know that there is a much greater emphasis on encouraging settlement today than at points in the past, as evidenced by settlement conferences, settlement commissioners at some courts, etc. But how much do we know about how judges employ their judicial time and authority to promote settlement discussions (and do so both effectively and fairly)? As another example, the common understanding is that trial judges increasingly play a managerial role, intervening in discovery disputes earlier and more regularly, holding status conferences to remain apprised of developments and to resolve problems that arise at the earliest possible point. But is there a source of data that supports these conclusions and allows an empirical analysis of the managerial role of trial judges?
In sum, where could we find information that could be quantitatively summarized about how judges promote settlement, what happens in a status conference, how much judicial time is devoted to such activities, how judges decide when and how energetically to intervene in a case and how early, and what effect all of this has on the disposition?
For my fourth and final guest post, either tomorrow or the next day, I’ll share some thoughts based upon my own experience and that of other researchers who’ve responded to my inquiries about variables that have been neglected in recent empirical studies of the courts and that deserve to be considered as part of a model, or even as a central factor of inquiry, in future work.
Greg Sisk
Mine is a request for help rather than a comment. Please, Sir, help me do an 'EMPIRICAL REVIEW' on the determinants of security prices on the Nigerian Stock Exchange. Sir, your help will be of great assistance.
Thanks
Posted by: TOR,BARILEDUM YIRADEE | 05 August 2008 at 12:09 PM
I think that the discussion within the ELS community on measurement of ideology is focusing on the wrong question. Party-of-appointing-president and Giles-Hettinger-Peppers (GHP) scores are both useful proxies in a variety of situations, but there are other ways of studying ideology that are often preferable to using either of these variables.
Proxy variables are not causal variables; a judge's GHP score does not "cause" her to vote in a particular way. Proxy variables capture ideology (the true causal variable) with measurement error, which means that the regressions will yield biased estimators. In a linear regression, the ideology estimate will be biased toward zero, but in nonlinear regressions (as is usually the case for GHP scores), the bias can be in any direction. My concern is that the measurement error can in many instances be very large. This means that the estimates from a regression based on these proxies can be biased in any direction, and so can the standard errors. I have no idea how severe this problem is in typical applications, and I'm not aware of any papers that have given this question serious thought.
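To make the linear case concrete, here is a toy simulation (all numbers invented) of the attenuation that a noisy proxy produces:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

ideology = rng.normal(size=n)                     # true, unobserved
proxy = ideology + rng.normal(scale=1.0, size=n)  # proxy with error
outcome = 0.8 * ideology + rng.normal(size=n)     # driven by the truth

# Regressing on the truth recovers a slope near 0.8; regressing on
# the noisy proxy attenuates it toward zero, here to roughly
# 0.8 * Var(ideology) / (Var(ideology) + Var(error)) = 0.4.
print(np.polyfit(ideology, outcome, 1)[0])
print(np.polyfit(proxy, outcome, 1)[0])
```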
There is another way to measure ideology that is used quite regularly, but doesn't get as much attention as it should: just use dummy variables for each particular judge. This is done all the time in studies of the Supreme Court. We don't just look at Republicans vs. Democrats; we typically examine each justice's votes individually. This is also done occasionally in studies of district judges and administrative judges.
There are many advantages to this approach. We completely avoid the problem of measurement error. We can still examine differences by party, gender, prior experience, or other demographic variables. And sometimes there are interesting results about outlier judges that we might have missed with proxy variables.
When the ideology estimates aren't central to the study, we can treat the individual judges as fixed or random effects. There are many studies of sentencing, for instance, that treat judges as fixed effects. The random effects approach is rarer, but one example is Anderson, Kling, and Stith (JLE 1999), who model district judges as random effects in their study of the impact of the Sentencing Guidelines. In that paper, they are interested in the distribution of judge sentencing severity, not in the individual judges. There is no way they could have gotten the same results using proxy variables.
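As an entirely synthetic illustration of the dummy-variable approach (every variable name and number below is made up), here is a sketch using statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_judges, n_cases = 20, 500

# Hypothetical sentencing data: each judge has a baseline severity.
judge_effect = rng.normal(scale=6.0, size=n_judges)
df = pd.DataFrame({
    "judge_id": rng.integers(0, n_judges, size=n_cases),
    "offense_level": rng.integers(1, 10, size=n_cases),
})
df["sentence"] = (10 + 2.5 * df["offense_level"]
                  + judge_effect[df["judge_id"]]
                  + rng.normal(scale=3.0, size=n_cases))

# C(judge_id) expands into one dummy per judge -- the fixed-effects
# approach, with no ideology proxy and hence no measurement error.
fit = smf.ols("sentence ~ C(judge_id) + offense_level", data=df).fit()
print(fit.params["offense_level"])                   # recovers ~2.5
print(fit.params.filter(like="C(judge_id)").head())  # per-judge effects
```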
Empirical studies of circuit courts are a bit trickier because of the collegial nature of the decisions. If you are interested, take a look at my paper that addresses this question. I estimate individual judges' voting propensities in sex discrimination cases, accounting for a "cost" of dissent for both sides. I'm sure that there are other approaches that could be used as well for multi-member courts. You can find the current version of my paper here:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=912299
The estimates for individual judges at the end of the paper are derived from the judges' actual voting behavior, but I have found that they have strong predictive power for judges' votes in other areas of law (much better than party or GHP scores). So another possibility, which hasn't been exploited much so far, is to use measures derived from judges' voting records in other areas of law to proxy for ideology. This can't be used in all applications, and it still leads to problems of measurement error and biased estimators, but at least the problem will be less severe.
Still, there are times when party or GHP scores are the best measures of ideology that will be available. I agree that GHP scores seem to be in the ascendancy, but I don't think that they are always the superior measure. I have found situations where GHP scores outperform party, but I have also found situations where the opposite is true. There are also several advantages to using party-of-appointing-president. First, it is easily interpretable. The coefficient on the party variable is the difference between the average Republican and the average Democrat. Do you know off the top of your head what a coefficient of 0.4 means on a GHP score? Second, if all of the right-hand-side variables are discrete, you can use a linear probability model, which is easier to interpret (and you know the direction of the measurement error bias). GHP scores are seldom used in linear regressions, because they generally don't have a linear effect on votes.
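A toy example (invented vote rates) makes the first point concrete: with a single binary party regressor, the linear probability model's slope is exactly the difference between the Republican and Democratic mean vote rates:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000

republican = rng.integers(0, 2, size=n)
# Hypothetical pro-plaintiff vote rates: 0.35 for Democratic
# appointees, 0.55 for Republican appointees.
vote = (rng.random(n) < np.where(republican == 1, 0.55, 0.35)).astype(float)

slope = np.polyfit(republican, vote, 1)[0]
diff = vote[republican == 1].mean() - vote[republican == 0].mean()
print(slope, diff)  # equal up to floating-point error
```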
Posted by: Josh Fischman | 14 December 2006 at 09:55 PM
Quantifying what judges are doing during the course of cases is a real problem. The difficulty showed up regularly when I worked in judicial administration. Our office was charged with recommending which circuits would be eligible for new judgeships each year, a recommendation that went straight from us to the state judicial council and on to the legislature. Problem: how do you determine the actual workload burdens of judges in circuits that had differing jurisdictions? We had some general jurisdiction courts that heard traffic cases and some that didn't, ditto with some misdemeanors and juvenile cases. We had some general jurisdiction courts that were being dog-robbed of many of their civil cases by limited jurisdiction courts that had some concurrent jurisdiction. What's a hard-working court statistician to do?
We solved this problem by using a Delphi technique to get the judges to estimate how much time they spent on different kinds of case and disposition combinations. This required getting the judges to make individual estimates, then to compare their initial figures with the mean figures for other judges in their circuit or administrative district. The second-round corrections were then averaged and applied to the different case combinations (how many hours for a misdemeanor bench trial, how many for an uncontested divorce, etc.). We could then estimate the number of judge-years needed to dispose of the caseload in a circuit, based on a mutually agreed judge-year standard.
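The arithmetic at the end is simple once the Delphi averages exist; a sketch (every figure below is invented for illustration):

```python
# Hypothetical weighted-caseload calculation from Delphi estimates.
hours_per_disposition = {      # second-round Delphi averages
    "misdemeanor_bench_trial": 2.5,
    "uncontested_divorce": 0.75,
    "felony_jury_trial": 30.0,
}
annual_dispositions = {        # one circuit's yearly caseload
    "misdemeanor_bench_trial": 1200,
    "uncontested_divorce": 900,
    "felony_jury_trial": 85,
}
# Mutually agreed judge-year standard: 215 sitting days x 6 hours.
JUDGE_YEAR_HOURS = 215 * 6

workload = sum(hours_per_disposition[k] * annual_dispositions[k]
               for k in annual_dispositions)
print(round(workload / JUDGE_YEAR_HOURS, 2))  # judge-years needed
```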
Lot of work, right? True, but think of the alternative. And when we actually went out and surveyed the court records to see if the estimates were on track, lo and behold they were! (We were flabbergasted, btw.)
So how does this bear on Greg's post? We might, just might, be able to get some sort of national sample of judges using a three- or four-stage cluster design and convince the judges to make such estimates. Or, more likely, get a national sample of recently retired judges to do it. Result: estimates of judicial time spent on different phases of cases that are at least as good as the measures of judicial ideology mentioned above, and a big fat database with many research uses.
Well, just an idea.
Posted by: Tracy Lightcap | 14 December 2006 at 01:02 PM