I think the way hypothesis testing is presented to students and justified makes no sense. Suppose we know that x has been drawn from a normal distribution with mean mu and unit variance, and we want to test H_0: mu = 0. Suppose x = 2. Then they say: “given H_0, the chance of seeing |x| > 1.96 is less than 5%, so since x = 2 > 1.96 we reject H_0.” Why does this make sense? Given H_0, the chance of seeing |x| < 0.01 is less than 5% too. Would you reject the null if our sample were x = 0? No!

So then they talk about x being “extreme,” i.e. far from the mean. What exactly does distance from the mean have to do with it? Suppose we knew that x was drawn from a uniform distribution on some interval [mu - 1/2, mu + 1/2], and again we wanted to test H_0: mu = 0. If x = 0.4999, would you reject? No, that makes no sense, because given H_0, x = 0.4999 is no less likely than x = 0. You could reject if x = 0.5001, but not if x is anywhere in the interval (-1/2, 1/2).

You could easily construct a bimodal distribution whose pdf is zero at the mean. Then you should reject if the sample is near the mean! Distance from the mean is not in general relevant.
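To make the first point concrete, here is a quick check, using only the standard normal CDF via `math.erf`, that under H_0 the “extreme” event |x| > 1.96 and the “unremarkable” event |x| < 0.01 both have probability below 5%:

```python
import math

def phi(z):
    # Standard normal CDF, computed from the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(|x| > 1.96) under H_0: mu = 0 -- the textbook rejection region.
p_tail = 2.0 * (1.0 - phi(1.96))

# P(|x| < 0.01) under H_0 -- a tight band around the mean.
p_center = 2.0 * phi(0.01) - 1.0

print(f"P(|x| > 1.96) = {p_tail:.4f}")    # just under 0.05
print(f"P(|x| < 0.01) = {p_center:.4f}")  # about 0.008, also well under 0.05
```

Both events are “less than 5% likely given H_0,” yet only one of them is treated as grounds for rejection.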

I am being pedantic, but hypothesis testing works for the standard normal density f only because f is unimodal and symmetric: if |x| > 1.96, then f(x) is much smaller than the values of f near x = 0. The criterion that actually generalizes is low likelihood under H_0, not tail areas or distance from the mean.
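A minimal sketch of that claim: for the standard normal pdf, the tail-area cutoff |x| > 1.96 is equivalent to a density cutoff f(x) < f(1.96), which is why the two criteria happen to agree in this case.

```python
import math

def f(x):
    # Standard normal pdf.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Because f is unimodal and symmetric, |x| > 1.96 holds exactly when
# f(x) falls below the threshold density f(1.96).
print(f"f(0)    = {f(0.0):.4f}")   # the mode, about 0.3989
print(f"f(1.96) = {f(1.96):.4f}")  # threshold density, about 0.0584
print(f"f(2)    = {f(2.0):.4f}")   # about 0.0540, below the threshold
```

For the uniform or a bimodal density, no such equivalence between “far from the mean” and “low density” holds, which is the whole point.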
