We eat a lot of Indian food at home and often look up recipes online. Common ingredients include semolina and farina. Indian words for farina include sooji and rava; it is also commonly called Cream of Wheat, though that is a brand name and not exactly the same thing. Strange, then, that many expert Indian cooks don't seem to recognize the difference between these ingredients! I see them referred to incorrectly everywhere on the web.

What most American cooks call semolina is not sooji. Semolina is yellow, is made from hard durum wheat, and is the primary ingredient of Italian pasta. The American word for sooji is farina. Farina is white, is made from soft wheat, and is the primary ingredient of Cream of Wheat.
If you make a semolina cake with farina, the result will be mushy. And what you want for upma is not semolina, it's farina, i.e. sooji or rava. It is a difference that matters.
Indian stores tend to sell both products. That's another reason why it is strange that so many experts don't seem to know the difference!

## Hypothetical Statistics Question

Saturday, October 19, 2013
Posted by
SteveBrooklineMA

Let \(\phi(x,\mu,\sigma)\) denote the normal density with mean \(\mu\) and standard deviation \(\sigma,\) evaluated at \(x.\) Consider the model distribution
$$p(x,\mu)=(1-t)\cdot \phi(x,\mu-3t,1)+t\cdot \phi(x,\mu+3\cdot(1-t),t^2),$$
for some known, fixed, but very small \(t>0.\) Since \(t\) is small, \(p\) looks very much like a normal distribution with variance \(1\) and mean \(\mu,\) except for a tall spike around \(\mu+3.\) See the picture below. The mean of \(p\) is \(\mu.\) Note in the last term, the variance is \(t^4.\) The total mass of the spike is \(t,\) thus small, and so the cumulative distribution for \(p\) will look very much like the standard cumulative distribution for \(\phi(x,\mu,1).\)
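
As a quick numerical sanity check (a sketch in Python; the values \(\mu=0,\) \(t=0.01\) and the integration grid are illustrative choices of mine, not from the text), one can integrate the mixture density directly and confirm that its total mass is \(1\) and its mean is \(\mu\):

```python
import math

def phi(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def p(x, mu, t):
    """The mixture: a broad bump near mu-3t plus a narrow spike at mu+3(1-t)."""
    return (1 - t) * phi(x, mu - 3 * t, 1) + t * phi(x, mu + 3 * (1 - t), t ** 2)

mu, t = 0.0, 0.01
dx = 1e-4  # no coarser than the spike's sd t**2 = 1e-4, or its mass gets missed

mass = mean = 0.0
for i in range(int(20 / dx)):
    x = -10 + i * dx
    w = p(x, mu, t) * dx
    mass += w
    mean += x * w

print(round(mass, 6), round(mean, 6))  # total mass ~ 1, mean ~ mu = 0
```

The spike's contribution \(t\cdot(\mu+3(1-t))\) exactly cancels the broad component's \((1-t)(\mu-3t),\) which is why the mean comes out to \(\mu.\)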

Suppose we have a sample \(x_0,\) assumed to be from a distribution \(p\) of the form above, but of unknown mean. To repeat, \(t\) is known. Consider the hypothesis: $$H_0: \mu=x_0-3\cdot(1-t)$$ Do we reject \(H_0\)?

If \(H_0\) is true, then \( p(x_0,\mu)\approx t \cdot \phi(x_0,x_0,t^2) = 1/(t \sqrt{2\pi}),\) and this is very large, about \(1/t\) times larger than \(p(\mu,\mu).\) Indeed, \(x_0-3\cdot(1-t)\) is very close to the maximum likelihood estimate for \(\mu.\) These seem to be reasons not to reject \(H_0.\)

On the other hand, the overall mass of the spike is small, and the entire spike is well out on the tail of \(p.\) Thus it is in some sense unlikely to get an \(x_0\) out there, assuming \(H_0\) is true. Since the cumulative distribution looks very much like the normal cumulative distribution for \(\phi(x,\mu,1),\) it makes sense to apply the usual test and reject \(H_0.\)
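
Both horns of the dilemma can be checked numerically (a sketch; \(x_0=5\) and \(t=0.01\) are arbitrary illustrative choices, not values from the text):

```python
import math

def phi(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def p(x, mu, t):
    return (1 - t) * phi(x, mu - 3 * t, 1) + t * phi(x, mu + 3 * (1 - t), t ** 2)

t = 0.01
x0 = 5.0                   # the observed sample (arbitrary)
mu0 = x0 - 3 * (1 - t)     # H0: this mean puts the spike exactly at x0

# Horn 1: the likelihood of x0 under H0 is huge, about 1/(t*sqrt(2*pi)).
like_h0 = p(x0, mu0, t)
print(like_h0)                         # ~ 39.9

# Horn 2: x0 sits 3(1-t) sd out in the broad component, so the usual
# two-sided z-test would emphatically reject.
z = x0 - mu0                           # broad component has sd 1
p_value = math.erfc(z / math.sqrt(2))  # two-sided normal p-value
print(z, p_value)                      # z = 2.97, p ~ 0.003
```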

I find that not rejecting in this case is the right thing to do, but I am not sure what others might think. I also wonder how often such a question is relevant.


## Clueless

Saturday, October 12, 2013
Posted by
Auntie Ann

Our kid's school has a "Smart Lab", which is a place where kids, generally in 6th-8th grade, can play with technology.

Here is a transcript-by-memory of a conversation I had with the instructor:


Me: Do any of the kids do any actual programming?

Instructor: Oh, yeah. Have you heard of Scratch?

Me: Yes. That's basically like MindStorms; you tell it to move the figure three steps this way, two steps that way. I mean actual coding.

Ins: Sure! Some kids even make webpages.

Me (trying very hard not to laugh in his face): No, I mean actual programming, like goto subroutine, if this then that.

Ins: Oh, no, not really.

Me: (Beating my head against the wall)...

What's amazing is that the **technology** teacher *didn't even know what I meant when I said either "programming" or "coding"*.

Meanwhile, I asked the other kid's 8th grade computer *programming* teacher whether they taught any languages other than Java, and whether students were taught to document their code. Answers: other languages are offered later, in high school (the AP test is only Java, so that's the emphasis), and documenting is not stressed in the level 1 class, but it is in the second.

## What is an Unusual Event?

Friday, October 11, 2013
Posted by
SteveBrooklineMA

My random thoughts on random events posted as a blog comment:

Suppose we have disjoint events \( (E_1,E_2,\dots,E_N),\) and corresponding probabilities for these events \( (p_1,p_2,\dots,p_N),\) where \( p_n=\mbox{Prob}(E_n)\) and \(\sum_{n=1}^N p_n=1.\) If a particular event \(E_k\) occurs, what would make us think this was in some sense "unusual" or perhaps "suspicious"? It's not enough that \(p_k\) be small, since for large \(N\), even a uniform distribution on the \(E_n\) will have \(p_k=1/N\) small. Nor is it enough that \(p_k\) be much less than \( \max_n p_n,\) since it is possible that all \(p_n\) are equal except for one event having many times larger yet still tiny probability. It's not enough if \(p_k\) is less than nearly all the other \(p_n\), because all the \(p_n\) could be very nearly equal.

What does seem to work in the cases I can think of is to choose some factor \(R \gt 1\), calculate \(\sum\{p_n: p_n \gt R p_k\}\), and see if this is close to \(1.\) To work this into a hypothesis test, we could reject the null hypothesis \(H_0\) if $$\sum\{p_n: p_n\gt R p_k\} \gt (1-1/R),$$ though the expression on the right-hand side is rather arbitrary. With this setup, what value should \(R\) be? Let \(x_0\) be a sample we have collected, and consider the standard normal and the \(p=.05\) rule, where \(\mbox{Prob}(|x_0|\gt 1.96)=0.05.\) Then \(R=3.71,\) since \(3.71\cdot \phi(1.96) = \phi(1.1),\) and \(\mbox{Prob}(|x_0|\gt 1.1)=1/3.71.\) If we wanted \(R=20,\) we would need to use a cutoff \( |x_0|\gt 3.135749,\) which corresponds to a very small standard \(p\)-value of \(0.001714.\)
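
For the standard-normal case, the test can be sketched in code (a hypothetical implementation; the names `mass_likelier` and `reject` are mine). The set where \(\phi(x) \gt R\,\phi(x_0)\) is an interval \(|x| \lt c\) with \(\phi(c)=R\,\phi(x_0),\) which inverts in closed form:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def mass_likelier(x0, R):
    """Prob{ x : phi(x) > R * phi(x0) } -- the continuous analog of
    sum{ p_n : p_n > R * p_k } for a standard normal."""
    target = R * phi(abs(x0))
    if target >= phi(0):
        return 0.0  # no point is R times likelier than x0
    # Invert the density: phi(c) = target  =>  c = sqrt(-2*log(target*sqrt(2*pi)))
    c = math.sqrt(-2 * math.log(target * math.sqrt(2 * math.pi)))
    return norm_cdf(c) - norm_cdf(-c)

def reject(x0, R):
    return mass_likelier(x0, R) > 1 - 1 / R

print(mass_likelier(1.96, 3.71), 1 - 1 / 3.71)  # ~0.7305 vs ~0.7305: at the boundary
print(reject(3.14, 20), reject(3.13, 20))       # True False: cutoff ~ 3.1357
```

It reproduces both calibrations above: at \(|x_0|=1.96\) with \(R=3.71\) the test sits right at its boundary, and with \(R=20\) the rejection cutoff falls between \(3.13\) and \(3.14.\)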

Clearly, given any \(p\) cutoff, a.k.a. \(\alpha,\) we can find a corresponding factor \(R,\) and vice versa. Since the \(p=.05\) rule is arbitrary, I don't see what difference it makes for the most common cases. Thus, \(p\)-value analysis seems generally OK to me in practice. My concern here is with its justification.
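
The \(\alpha \leftrightarrow R\) correspondence can be made concrete with a little root-finding (again just a sketch; `r_for_alpha` and `bisect` are hypothetical names of mine). At the rejection boundary, \(\phi(c')=R\,\phi(c)\) and \(\mbox{Prob}(|x_0|\gt c')=1/R,\) so eliminating \(R\) gives \(\phi(c')\cdot\mbox{Prob}(|x_0|\gt c')=\phi(c)\):

```python
import math

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def sf2(x):
    """Two-sided tail Prob(|X| > x) for a standard normal."""
    return math.erfc(x / math.sqrt(2))

def bisect(f, lo=0.0, hi=10.0):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    flo = f(lo)
    for _ in range(200):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == (flo > 0):
            lo, flo = mid, f(mid)
        else:
            hi = mid
    return (lo + hi) / 2

def r_for_alpha(alpha):
    """Factor R whose rejection boundary matches the two-sided cutoff at level alpha."""
    c = bisect(lambda x: alpha - sf2(x))             # Prob(|X| > c) = alpha
    cp = bisect(lambda x: phi(c) - phi(x) * sf2(x))  # phi(cp)*sf2(cp) = phi(c)
    return phi(cp) / phi(c)

print(round(r_for_alpha(0.05), 2))       # ~ 3.71
print(round(r_for_alpha(0.001714), 1))   # ~ 20.0
```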


## Misandry

Monday, October 7, 2013
Posted by
Auntie Ann

This must be one of the most sexist paragraphs I've read in a while (FYI, it was written by a woman):


> As for education, it won’t do much good for people who aren’t motivated or disciplined enough to acquire it. These people are mainly men. We all know that low-skilled men will be our world’s biggest losers, but it’s often not lack of skills that holds them back. It’s lack of the aptitudes and attitudes required for success. These are the men who can’t stay in school, can’t apply themselves, can’t take direction or defer rewards, can’t be reliable and can’t function well in teams. “Young male hotheads who just can’t follow orders are pretty well doomed,” economist Tyler Cowen says in *Average is Over*, a sharp and sobering book on who will get ahead, and why.

Let's flip the wording, shall we?

> As for education, it won’t do much good for people who lack the mental capacity to acquire it, or will leave the workforce to raise children, or who will take jobs away from hard-working men. These people are, of course, women. We all know that women will be our world’s biggest losers, sucking on the teat of others--whether husbands or public assistance, but it’s often not lack of skills that holds them back. It’s lack of the aptitudes and attitudes required for success. These are the women who get pregnant early, are more irrational than men, can’t apply themselves, can’t take direction or defer rewards, can’t be reliable and can’t function well in teams except when they go *en masse* to the bathroom. (Why did we ever give them the vote?!) “Young women who just can’t follow orders or lack a man's intelligence are pretty well doomed,” economist Wylma Bullen says in *Average is XY*, a sharp and sobering book on who will get ahead, and why.

When girls were falling behind in schools and weren't choosing difficult educational or career paths, at least some of the blame was placed on the society which sent them messages, discouraged them, and treated them like second class people.

Here, society is blameless. The constant denigrating of men, the constant focus on pushing girls ahead without ever telling boys that they can succeed too, the assumption that men are violent, shiftless slackers, has no culpability in this woman's eyes for the diminishing of men's prospects.

If women are behind, it's not their fault, but society's. If men are behind, it's because they're slackers.