Hypothetical Statistics Question

Let \(\phi(x,\mu,\sigma)\) denote the normal density with mean \(\mu\) and standard deviation \(\sigma,\) evaluated at \(x.\) Consider the model distribution $$p(x,\mu)=(1-t)\cdot \phi(x,\mu-3t,1)+t\cdot \phi(x,\mu+3\cdot(1-t),t^2),$$ for some known, fixed, but very small \(t>0.\) Note that the second component has standard deviation \(t^2,\) hence variance \(t^4.\) Since \(t\) is small, \(p\) looks very much like a normal density with variance \(1\) and mean \(\mu,\) except for a tall, narrow spike near \(\mu+3.\) See the picture below. The mean of \(p\) is \(\mu,\) since \((1-t)(\mu-3t)+t(\mu+3(1-t))=\mu.\) The total mass of the spike is \(t,\) which is small, so the cumulative distribution of \(p\) looks very much like the normal cumulative distribution with mean \(\mu\) and variance \(1.\)
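To make the shape of \(p\) concrete, here is a minimal Python sketch of the density. The values `t = 0.01` and `mu = 0.0` are illustrative assumptions of mine, not part of the question, and the function names are only for this example.

```python
import math

def phi(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def p(x, mu, t):
    """Mixture density from the text: a broad component plus a narrow spike."""
    broad = (1.0 - t) * phi(x, mu - 3.0 * t, 1.0)
    spike = t * phi(x, mu + 3.0 * (1.0 - t), t ** 2)
    return broad + spike

def mixture_mean(mu, t):
    """Mean of p: the weighted average of the two component means."""
    return (1.0 - t) * (mu - 3.0 * t) + t * (mu + 3.0 * (1.0 - t))

t = 0.01    # illustrative value (assumption)
mu = 0.0    # illustrative value (assumption)
spike_location = mu + 3.0 * (1.0 - t)

print(mixture_mean(mu, t))        # equals mu up to floating-point rounding
print(p(mu, mu, t))               # height near the centre of the broad component
print(p(spike_location, mu, t))   # height at the top of the spike
```

With these values the spike is roughly a hundred times taller than the broad bump, even though it carries only mass \(t.\)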

Suppose we have a single observation \(x_0,\) assumed to be drawn from a distribution \(p\) of the form above with unknown mean \(\mu.\) To repeat, \(t\) is known. Consider the hypothesis $$H_0: \mu=x_0-3\cdot(1-t)$$ Do we reject \(H_0\)?

If \(H_0\) is true, then the spike is centered exactly at \(x_0,\) so \( p(x_0,\mu)\approx t \cdot \phi(x_0,x_0,t^2) = 1/(t \sqrt{2\pi}),\) and this is very large, about \(1/t\) times larger than \(p(\mu,\mu)\approx 1/\sqrt{2\pi}.\) Indeed, \(x_0-3\cdot(1-t)\) is very close to the maximum likelihood estimate for \(\mu.\) These seem to be reasons not to reject \(H_0.\)
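The likelihood comparison can be checked numerically. The sketch below uses illustrative values `t = 0.01` and `x0 = 5.0` (my assumptions) and compares the likelihood at the \(H_0\) value of \(\mu\) with the likelihood at the value of \(\mu\) that would instead put the broad component's peak at \(x_0.\)

```python
import math

def phi(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def p(x, mu, t):
    """Mixture density from the text."""
    return ((1.0 - t) * phi(x, mu - 3.0 * t, 1.0)
            + t * phi(x, mu + 3.0 * (1.0 - t), t ** 2))

t = 0.01    # illustrative value (assumption)
x0 = 5.0    # illustrative observation (assumption)

mu_h0 = x0 - 3.0 * (1.0 - t)   # mu under H0: the spike is centred at x0
mu_broad = x0 + 3.0 * t        # mu that centres the broad component at x0

lik_h0 = p(x0, mu_h0, t)                         # likelihood of mu_h0 given x0
lik_broad = p(x0, mu_broad, t)                   # likelihood of mu_broad given x0
approx = 1.0 / (t * math.sqrt(2.0 * math.pi))    # the approximation in the text
centre_height = p(mu_h0, mu_h0, t)               # p(mu, mu) under H0

print(lik_h0, approx)            # the two agree closely
print(lik_h0 / centre_height)    # on the order of 1/t
print(lik_h0 / lik_broad)        # also on the order of 1/t
```

So, for these values, the \(H_0\) value of \(\mu\) makes \(x_0\) about \(1/t\) times more likely than the value of \(\mu\) a plain normal model would pick.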

On the other hand, the overall mass of the spike is small, and the entire spike is well out on the tail of \(p.\) Thus it is in some sense unlikely to get an \(x_0\) out there, assuming \(H_0\) is true. Under \(H_0,\) \(x_0\) lies about \(3\) standard deviations above \(\mu.\) Since the cumulative distribution looks very much like the normal cumulative distribution with mean \(\mu\) and variance \(1,\) it makes sense to apply the usual tail-area test and reject \(H_0.\)
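The tail-area argument can also be made concrete. The sketch below assumes that "the usual test" means comparing an upper-tail probability with a conventional cutoff such as \(0.025\) (my reading, not something fixed by the question), and again uses the illustrative values `t = 0.01` and `x0 = 5.0`.

```python
import math

def upper_tail_normal(x, mu, sigma):
    """P(X >= x) for a normal with mean mu and standard deviation sigma."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def upper_tail_p(x, mu, t):
    """P(X >= x) under the mixture p: weighted sum of the component tails."""
    return ((1.0 - t) * upper_tail_normal(x, mu - 3.0 * t, 1.0)
            + t * upper_tail_normal(x, mu + 3.0 * (1.0 - t), t ** 2))

t = 0.01    # illustrative value (assumption)
x0 = 5.0    # illustrative observation (assumption)
mu_h0 = x0 - 3.0 * (1.0 - t)   # mu under H0

tail_exact = upper_tail_p(x0, mu_h0, t)             # tail area under p itself
tail_normal = upper_tail_normal(x0, mu_h0, 1.0)     # tail area if p were phi(x, mu, 1)

print(tail_exact)
print(tail_normal)
```

For these values both tail areas fall below \(0.025,\) so a tail-area test of this kind rejects \(H_0\) either way. Half of the spike's mass lies above \(x_0,\) so the spike contributes \(t/2\) to the exact tail area.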

I think that not rejecting is the right thing to do in this case, but I am not sure what others might think. I also wonder how often such a question is relevant in practice.

 
