A recent thread on the Statistical Modeling, Causal Inference, and Social Science Blog (SMCISS Blog?) offers an interesting discussion of some issues associated with tests of statistical significance. The discussion was motivated by a hypothetical situation involving a direct mail marketing campaign, the text of which I have slightly edited:
Group 1: 50,000 customers were mailed a catalog. Of these, 100 made a purchase, and the mean of their spending was $50 with a standard deviation of $10.
Group 2: 50,000 customers were mailed a catalog. Of these, 120 made a purchase, and the mean of their spending was $55 with a standard deviation of $11.
Some questions: Is there a statistically significant difference in the mean purchases associated with these two groups of customers? What is the appropriate N to use in testing for a difference? (If it’s 50,000, is there any reason to even bother with the test?) More importantly, for a decision-maker in this hypothetical company, what turns on whether the difference is statistically significant? These questions are discussed here and then here.
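To make the "which N?" question concrete, here is a rough sketch in Python of two tests one might run on these numbers. It is my own illustration, not something taken from the linked discussion, and it rests on assumptions: the function names are made up for this example, the spending comparison uses only the purchasers (N = 100 and 120) with Welch's unequal-variance statistic, the purchase-rate comparison uses everyone mailed (N = 50,000 per group) with a pooled two-proportion z statistic, and the p-values use a normal approximation.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for a difference in means (unequal variances)."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean2 - mean1) / standard_error


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for a difference in two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x2 / n2 - x1 / n1) / standard_error


def two_sided_p_normal(stat):
    """Two-sided p-value, assuming the statistic is roughly standard normal."""
    return math.erfc(abs(stat) / math.sqrt(2))


# Spending among purchasers only: N is 100 and 120, not 50,000.
t_spend = welch_t(50, 10, 100, 55, 11, 120)

# Purchase rate among everyone mailed: here N really is 50,000 per group.
z_rate = two_proportion_z(100, 50_000, 120, 50_000)

print(f"spending per purchaser: t = {t_spend:.2f}, p = {two_sided_p_normal(t_spend):.4f}")
print(f"purchase rate:          z = {z_rate:.2f}, p = {two_sided_p_normal(z_rate):.4f}")
```

Under these assumptions, spending per purchaser differs by about 3.5 standard errors, while the purchase rates differ by about 1.35 standard errors, short of the conventional 1.96 cutoff. The sketch does not settle which comparison the decision-maker should care about; it only shows that the answer depends heavily on which N, and which quantity, is being tested.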
It’s reassuring to find people who are much wiser in the ways of quantitative methods struggling over “basic” questions—primarily, the important question of the value of hypothesis testing in this situation.