
14 July 2006


Jeffrey Segal

Sean Wilson writes:
“But if you read the 1995 article, you will see this is not true. (Just go look at it).
What they did is put in place a construct that analyzed only index variance, but announced their findings as though they had analyzed voting variance.”

He’s right, of course. We wanted people to think we had analyzed voting variance rather than the percentaged index. That must have been what we tried to do when we wrote “Our dependent variables are the percentages of formally decided civil liberties and economics cases from the start of the Vinson Court . . . in which the justices took a liberal position.”

But in case anybody missed that, the fact that we provide the RMSEs of the regression and a scatterplot of actual versus predicted votes, and show only one observation per justice (not hundreds), would have made it clear to anyone that we were running OLS on the percentaged data that we provided in table 2.

So please, do what Wilson says, look at the article.

Sean Wilson


I see your point. But what I was trying to say was something more subtle. It's a technical point.

The unit of analysis in the aggregated model is the INDEX, not the votes. The votes in the data set are the binary observations. If you extract a percentage from these observations and use it as a dependent variable in an ecological regression, you have a couple of problems. One, the number of votes used to generate each percentage "dot" is not equal. (Goldberg has 153 votes; Rehnquist has over 2,000 -- yet each of them is rendered equal.) Two, you transform someone who doesn't affiliate with the empirically-observed trait (bias) into an "equal" of those who do affiliate, for purposes of goodness-of-fit analysis. The net result is that aggregate models with perfect prediction cannot distinguish between polarized and leptokurtic voting frequency distributions.
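The first problem is easy to see in a small Python sketch. The attitude scores, vote rates, and vote counts below are invented for illustration (not the actual Segal/Spaeth data); the point is only that an unweighted ecological regression never sees the vote counts behind each percentage dot:

```python
# Hypothetical attitude scores and liberal-vote rates for five justices.
attitudes = [0.0, 0.25, 0.5, 0.75, 1.0]
rates     = [0.10, 0.30, 0.50, 0.70, 0.90]

# Two very different vote counts could sit behind the same five dots:
counts_a = [153, 160, 170, 180, 2000]   # wildly unequal, Goldberg-vs-Rehnquist style
counts_b = [500, 500, 500, 500, 500]    # perfectly equal

def ols_slope(x, y):
    """Slope of an unweighted simple OLS regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# The counts never enter the computation, so both courts get the
# identical slope (and identical R-squared): each dot counts once.
slope = ols_slope(attitudes, rates)
print(slope)   # ≈ 0.8, regardless of whether a dot rests on 153 votes or 2,000
```

A weighted regression could at least address the unequal-counts problem; the unweighted version renders every justice's dot equal by construction.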

In response to this, Jeff has suggested that he was never really trying to analyze votes (the binary markings) apart from his analysis of the INDEX. The idea is that he was only talking about index variance, no matter that it has modeling flaws. That's why Bert is saying, "hey we always knew the votes had lower numbers." (Bert is trying to help him). But if you read the 1995 article, you will see this is not true. (Just go look at it).

What they did is put in place a construct that analyzed only index variance, but announced their findings as though they had analyzed voting variance. That's why they are backtracking now. It takes only 12% of the votes in Jeff's aggregate model to cause 41% of the variation in the index. There is a fallacy known in statistics as "the ecological fallacy." It occurs when you assume that variation in percentages is the same as variation in the things comprising the percentages. Anyone who treats an individual case as though it is like its average commits this fallacy. Averages are something different from the things that comprise them.
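A toy arithmetic example (the numbers here are invented, not the actual 12%/41% figures) shows how a small share of votes can account for all of an index's variance:

```python
def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Five hypothetical justices, 100 votes each. Start everyone at exactly 50%:
flat = [0.50, 0.50, 0.50, 0.50, 0.50]
print(variance(flat))      # 0.0 -- the index has no variance at all

# Now flip 15 votes for two justices (30 of 500 votes = 6% of all votes):
shifted = [0.35, 0.50, 0.50, 0.50, 0.65]
print(variance(shifted))   # ≈ 0.009 -- that 6% of the votes produces 100%
                           # of the index variance a model would "explain"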

So perhaps I was not clear in what I was saying. I just wanted it cleared up that Jeff indeed thought he was analyzing voting variance when, in fact, he was only analyzing index variance. The truth is that voting variance is about 1/3 of what he thought it was.

Bill Ford

"I don't understand how you can say that your models never tried to analyze votes.... Secondly, you know darn well that you guys thought you were analyzing votes."

Sean, whether they aggregated the votes or not, they are still analyzing votes. It's fine to point out the consequences of various modeling approaches, but I don't think the sentence, "we are analyzing votes," is ONLY valid/appropriate when one is analyzing votes at the individual case level.

frank cross

Dworkin didn't invent the common law.

Sean Wilson


1. Everyone knows that if you take either a logit or ecological model and add more statistically significant variables, the fit will go up. That's not the issue. The issue is whether a one-variable model of grouped aggregates artificially raises fit relative to binary data analyzed at the level it is observed. It clearly does, because aggregation transforms cases that resist the binary stimuli into good "line fitters."

2. Take a look at my most recent web entry. This should help you see the point. I have four models on the table: one where justice values are autonomous, two where they are dependent, and the one that exists in reality. Go look at each model. For the first three models, statistical significance is .000 and the R-squared is 1.0. In other words, the R-squared in the ecological regression can't distinguish between polarized and "squished" voting. It is a completely bogus statistic for assessing the fit of votes. Just take a look and see if you can understand why aggregation is, in effect, "cooking the books:"
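Since the linked models aren't reproduced here, a sketch with invented numbers makes the same point: a polarized court and a "squished" court can both produce a perfect ecological R-squared, even though their case-level fits differ by nearly two orders of magnitude.

```python
def ols_r2(x, y):
    """R-squared of a simple unweighted OLS regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

attitudes = [0.0, 0.25, 0.5, 0.75, 1.0]
polarized = [0.10, 0.30, 0.50, 0.70, 0.90]     # rates swing from 10% to 90%
squished  = [0.45, 0.475, 0.50, 0.525, 0.55]   # rates barely move at all

# Ecological regression: one dot per justice. Both courts "fit" perfectly,
# because both sets of rates happen to lie exactly on a line.
print(ols_r2(attitudes, polarized))  # ≈ 1.0
print(ols_r2(attitudes, squished))   # ≈ 1.0

# Case level: expand each rate into 200 binary votes and refit.
def expand(att, rates, n=200):
    xs, ys = [], []
    for a, r in zip(att, rates):
        k = round(r * n)
        xs += [a] * n
        ys += [1] * k + [0] * (n - k)
    return xs, ys

print(ols_r2(*expand(attitudes, polarized)))  # ≈ 0.32
print(ols_r2(*expand(attitudes, squished)))   # ≈ 0.005
```

The aggregate statistic is identical for both courts; only the vote-level regression registers that one court is polarized and the other is essentially flipping coins.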


3. I don't understand how you can say that your models never tried to analyze votes. You seem to think that analyzing a handful of percentages of votes excuses you from charges of modeling flaws. First, analyzing the percentages was the wrong way to analyze ANY aspect of voting behavior, variance or otherwise. Second, you know darn well that you guys thought you were analyzing votes. Just read your 1995 article right here (it has ecological fallacy written all over it):


4. On the issue of Dworkin, you said the following:

"I think we are better off conceiving of law as a potential and potentially measurable influence on judges’ behavior. To determine the impact of law is not much different than determining the impact of other social phenomena. Simply put, judges’ decisions should change -- not deterministically, but at the margins -- as law or legal arguments change, holding alternative phenomena constant."

Jeff, do you even understand Dworkin? He's the one who INVENTED the idea that you could validly have a concept of law that didn't entail only rules. It could involve standards and things that structure, rather than dictate, decision making. Anyone who shows that a justice uses a decision construct that is itself transformative of preference is doing work within a Dworkin paradigm. I call it "structuralism." You were for the idea a few days ago -- what happened?

Jeffrey Segal

1. Attitudes alone do not fit extremely well at the case level, but a) at the case level this is an incomplete account of what the attitudinal model expects, and b) I would nevertheless bet that attitudes alone explain the justices' individual level votes better than any other single variable.

3. I'm not sure what we overstated. Perhaps this claim is about individual versus aggregate levels of analysis. While we do state in SCAMR that "we use as our dependent variable the votes of all justices . . . in civil liberties cases," we also present the data that we use (table 8.2) and those data are in aggregate form. We also present the OLS slope coefficient and, for four of the justices, the root mean square error. While the sentence quoted above could have, in isolation, been ambiguous, no one with a basic knowledge of OLS could read the full passage as if we ran the analysis on the individual-level votes.

4. Don't know where we say that our model supports a Dworkin view of judging.

Finally, to the question "what would justices have to do on a career basis for Jeff's model not to support it?" Easy: vote inconsistently with their attitudes as measured prior to their coming on the Court.

Sean Wilson

You know what, Frank? I've been thinking about this, and I have become dumbfounded over something. This is like trying to chase a ghost. What Jeff (and unfortunately Bert, to some extent) have said is this: (1) yes, Jeff's work doesn't show an especially good model at the individual level of analysis (you will note that neither of them disputes this); (2) this is ok because aggregate analysis never attempted this as its goal; (3) we just overstated our case in a few pages in the book; and (4) oh, by the way, our model supports a Dworkin view of judging and perhaps other ones too.

The problem with this maneuver, from Jeff's point of view, is that it concedes the farm to pay the mortgage. For example, if the attitudinal model showed only that its ostensible competitor models were just as good, why is anyone even talking about this research? Why is it in the New York Times? Why do law students care about the model? Mark Graber recently said that Jeff created an empirical model of realism. Did Dworkin do that, too? How can Jeff be with Dworkin, the realists, John Austin, the critical legal studies people, and the pragmatists all at the same time? What would justices have to do on a career basis for Jeff's model not to support it?

You know, statistics and metaphysics really don't mix. You have the data at the level you observed it, and it says what it says. It really doesn't say anything else. And what it says is: "so-so."

frank cross

I think both approaches are perfectly legitimate, you just need to be transparent about what you are doing.

I think Sean's legitimate complaint is that the S/S results were sometimes misused in their description.

Sean Wilson

Jeff, let me offer now my reply to you, not Bert. First, your R-squared is manufactured by the fact that you aggregated a binary variable. That is quite clearly explained here:


In a binary model, what drives the fit downward are justices who do not systematically affiliate with your measure of bias. When you aggregate the data, you transform non-directional justices into optimally-biased justices. Hence, what you have done from the standpoint of model specification is to create a model that treats people who do not favor liberal outcomes (50% liberal) the same, in terms of bias, as those who do favor or disfavor such outcomes (those who vote 80% or 20% liberal). This is objectionable not only because it leaves justices no way to vote throughout their careers that avoids the claim of bias, but also because bias is supposed to be an empirically observed phenomenon, not a manufactured one. You have a binary variable that you have chosen not to analyze. That is very strange. Instead you analyze a summary statistic and pretend that you have a good-fitting model. In truth, your model has an anemic fit. This is demonstrated here:
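The point about the 50% justice can be put in a few lines of Python (attitude scores and rates invented for illustration): on the aggregate regression line, a coin-flip justice registers as a perfect fit, while at the case level the model predicts that justice's votes no better than chance.

```python
# Invented illustration: five justices whose liberal-vote rates lie
# exactly on the line rate = 0.2 + 0.6 * attitude.
attitudes = [0.0, 0.25, 0.5, 0.75, 1.0]
rates     = [0.20, 0.35, 0.50, 0.65, 0.80]

# Aggregate view: every residual is zero, so the 50% justice (attitude 0.5)
# looks exactly as "well explained" as the 80% and 20% justices.
residuals = [r - (0.2 + 0.6 * a) for a, r in zip(attitudes, rates)]
print(residuals)   # all ≈ 0.0

# Case-level view: predicting each justice's modal vote, accuracy is
# max(rate, 1 - rate). For the 50% justice, that is a coin flip.
accuracy = [max(r, 1 - r) for r in rates]
print(accuracy)    # [0.8, 0.65, 0.5, 0.65, 0.8]
```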


And also here:


Finally, your R-squared even in an ecological model is only 41%. The only way you can get it higher is by cherry-picking justices. You just don't have a model that produces the level of explanation that you want.

Sean Wilson

Jeff -- Bert Kritzer's post did not deal with the issue appropriately. Just so ELS readers can see the problem, I've posted my reply to Bert below:

... just so there is no confusion, the R-squared between "aggregates" from 1946-2004 is 41%, but it is driven by only 12.5% of the votes. Here is the analysis on that:


Stated another way, on its own terms the aggregate model reports that only 12.5% of votes are necessary to change the liberal index by 41%. This is what was not understood by many political science scholars. The R-squared in an ecological model explains only the variance in the numbers that comprise an index, not the variance in the votes. Hence, statements such as these are factually incorrect:

“When it turned out that [Segal and Spaeth] could explain more than 60 percent of the variation in civil liberties votes based solely on the justices’ policy preferences, the researchers concluded that justices come to the bench with a set of policy preferences, which they pursue through their votes, at least in civil liberties cases.”

The truth is that the amount of variation in the votes that political science can attribute to a propensity for political direction -- assuming one wants to rely upon a dichotomous model -- is about 24%. (For today's Court, the number is about 9%). That's not to say that this level is or is not what a better model might show; it is only to state what is empirically accurate. So I would say that there is plenty that is "news," at least so far as the publications of political science reveal.

I also hope that "turning down the temperature" is not equated with "do not show people this error." If there is a more kind way to show the mistake, I'm all for it. But I think we have an obligation not to tell graduate students and other academics that our models explain 60% or more of what the Court does in civil liberties or any area of voting. At least not until we have a model that does that.


Dr. Sean Wilson, Esq.
Penn State University

Jeffrey Segal

As noted, here are Bert Kritzer's comments from the lawcourts listserv, reprinted with Bert's permission.

The issue being discussed on the listserv is: how well does some measure of judicial attitudes explain judicial behavior? That question can be asked at two levels: in the aggregate for a given judge, and at the level of the individual case for individual judges. These are different questions, complicated by the fact that for one we typically do an OLS regression and for the other we do logistic regression.
Comparing fit statistics across these two methods is at best an approximation. Moreover, aggregate patterns are always better explained than individual-level patterns, because in the aggregation process randomness cancels out. There are really two different questions here: how do we explain a judge's specific votes, and how do we explain a judge's pattern of voting? Most people are smart enough to understand the difference.

So, if you take Segal and Spaeth's search and seizure data (from SCAM) and run a simple regression (OLS) at the level of justice-vote (n=1550), you get an R2 of .205 (which goes up to .275 if you add in the usual control variables). If you run logistic regression and look at the R2 analogues that you can compute from logistic regression, the values are in the same general range. For research on individual-level behavior, this is quite good.
If you aggregate the data and compute the percentage of the time each justice upholds the search, and then run a regression predicting this vote pattern using the justice's attitude as the sole predictor, you get an R2 of .741 (n=15 for fifteen justices included in the data set). For aggregate level analyses, this is the level of relationship I would expect to see.
None of this is news or surprising.
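Bert's aggregate-versus-individual gap is easy to reproduce with simulated data. The probabilities and sample sizes below are invented, so the exact values won't match the .205/.741 figures from the search-and-seizure data, but the pattern -- aggregate R2 far above individual-level R2 on the very same votes -- comes through:

```python
import random
random.seed(0)

def ols_r2(x, y):
    """R-squared of a simple unweighted OLS regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Fifteen hypothetical justices; attitude shifts the vote probability linearly.
attitudes = [i / 14 for i in range(15)]
vote_x, vote_y = [], []
for a in attitudes:
    p = 0.2 + 0.6 * a              # invented probability of a liberal vote
    for _ in range(100):
        vote_x.append(a)
        vote_y.append(1 if random.random() < p else 0)

# Individual level: one row per vote (n = 1500).
r2_votes = ols_r2(vote_x, vote_y)

# Aggregate level: one row per justice (n = 15).
rates = [sum(vote_y[i * 100:(i + 1) * 100]) / 100 for i in range(15)]
r2_agg = ols_r2(attitudes, rates)

print(round(r2_votes, 2), round(r2_agg, 2))   # aggregate fit is far higher
```

The same attitudes generate both numbers; aggregation alone, by averaging away the binary noise, multiplies the apparent fit several times over.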
