There is one rule of thumb that I find quite useful and happen to use fairly often. It is probably neither widely known nor described in textbooks (I stumbled upon it on my own), so I regularly have to explain it. Next time I'll just point to this post.

The rule is the following: a proportion estimate obtained on a sample of $n$ points should only be trusted up to an error of $\frac{1}{\sqrt{n}}$.

For example, suppose that you read in the newspaper that "25% of students like statistics". Now, if this result has been obtained from a survey of 64 participants, you should actually interpret the answer as $25\% \pm \frac{1}{\sqrt{64}}$, that is, $25\% \pm 12.5\%$, which means that the actual percentage lies somewhere between 12.5% and 37.5%.
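The survey example can be checked in a couple of lines. This is only a minimal sketch: the function name `rule_of_thumb_interval` is made up for illustration, and clipping the interval to the range from 0 to 1 is my own assumption, not part of the rule.

```python
import math

def rule_of_thumb_interval(p_hat, n):
    """Return (low, high) for p_hat +/- 1/sqrt(n), clipped to [0, 1]."""
    margin = 1.0 / math.sqrt(n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# The newspaper example: 25% measured on 64 participants.
low, high = rule_of_thumb_interval(0.25, 64)
print(f"{low:.1%} to {high:.1%}")  # 12.5% to 37.5%
```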

As another example, in machine learning, you often get to see cases where someone evaluates two classification algorithms on a test set of, say, 400 instances, measures that the first algorithm has a precision of 90%, the second a precision of, say, 92%, and boldly claims the dominance of the second algorithm. At this point, without going deeply into statistics, it is easy to figure that the error, $\frac{1}{\sqrt{400}}$, is about 5%, hence the difference between 90% and 92% is not significant enough to celebrate.
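The same back-of-the-envelope check for the two classifiers might look as follows. It is a rough sketch with hypothetical variable names, and it simply compares the gap with the margin rather than running a proper significance test.

```python
import math

def margin(n):
    """Rule-of-thumb error for a proportion measured on n points."""
    return 1.0 / math.sqrt(n)

n_test = 400                  # size of the test set
score_a, score_b = 0.90, 0.92 # measured scores of the two algorithms

m = margin(n_test)
gap = score_b - score_a
print(f"margin = {m:.0%}, gap = {gap:.0%}")
print("gap exceeds the margin" if gap > m else "gap is within the margin")
```

Here the 2% gap is well inside the 5% margin, so the test set is too small to tell the two algorithms apart.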

### The Explanation

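The derivation of the rule is fairly straightforward. Consider a Bernoulli-distributed random variable $X$ with parameter $p$. We then take an i.i.d. sample $X_1, \dots, X_n$ of size $n$, and use it to estimate $p$:

$$\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The 95% confidence interval for this estimate, computed using the normal approximation, is then:

$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

What remains is to note that $1.96 \approx 2$ and that $\sqrt{\hat{p}(1-\hat{p})} \le 0.5$. By substituting those two approximations we immediately get that the interval is at most

$$\hat{p} \pm \frac{1}{\sqrt{n}}$$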
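As a sanity check on the derivation, one can compare the half-width of the normal-approximation interval with the rule-of-thumb bound over a grid of estimates. The helper names below are hypothetical, and the sample size of 100 is an arbitrary choice for illustration.

```python
import math

def normal_half_width(p_hat, n, z=1.96):
    """Half-width of the normal-approximation 95% interval."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

def rule_half_width(n):
    """Half-width given by the one-over-square-root-of-n rule."""
    return 1.0 / math.sqrt(n)

n = 100
for p_hat in (0.1, 0.3, 0.5, 0.7, 0.9):
    exact = normal_half_width(p_hat, n)
    bound = rule_half_width(n)
    # The bound should never be smaller than the normal-approximation value.
    print(f"p_hat = {p_hat}: {exact:.3f} <= {bound:.3f} is {exact <= bound}")
```

The two values are closest at an estimate of 0.5, where the normal-approximation half-width is 0.98 times the bound.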

### Limitations

It is important to understand the limitations of the rule. In the cases where the true proportion $p$ is close to 0.5 and $n$ is large enough for the normal approximation to make sense (20 is already good), the one-over-square-root-of-n rule is very close to a true 95% confidence interval.
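A quick simulation gives a feel for how often the estimate actually lands within the rule's margin when the true proportion is 0.5. This is just a sketch: the function name `coverage`, the number of trials and the seed are all arbitrary choices of mine, and the printed fractions are Monte Carlo estimates that will wobble a little with different seeds.

```python
import math
import random

def coverage(p, n, trials=2000, seed=0):
    """Fraction of simulated samples whose estimate lands within 1/sqrt(n) of p."""
    rng = random.Random(seed)
    margin = 1.0 / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw n Bernoulli(p) outcomes and compute the sample proportion.
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        if abs(p_hat - p) <= margin:
            hits += 1
    return hits / trials

for n in (20, 100, 400):
    print(n, coverage(0.5, n))
```

For these sample sizes the fraction should come out at roughly 95% or slightly above.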

When the true proportion is closer to 0 or 1, however, $\sqrt{p(1-p)}$ is not close to 0.5 anymore, and the rule of thumb results in a conservatively large interval.

In particular, the true 95% confidence interval for $p = 0.9$ will be nearly two times smaller ($\approx \frac{0.6}{\sqrt{n}}$). For $p = 0.99$ the actual interval is five times smaller ($\approx \frac{0.2}{\sqrt{n}}$). However, given the simplicity of the rule, the fact that the true $p$ is rarely so close to 1, and the idea that it never hurts to be slightly conservative in statistical estimates, I'd say *the one-over-a-square-root-of-n rule* is a practically useful tool in most situations.
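To see how conservative the rule gets, one can compute the coefficient $c$ for which the normal-approximation half-width equals $c/\sqrt{n}$, and compare it with the rule's coefficient of 1. The function name `half_width_coefficient` is hypothetical.

```python
import math

def half_width_coefficient(p, z=1.96):
    """c such that the normal-approximation half-width is c / sqrt(n)."""
    return z * math.sqrt(p * (1.0 - p))

for p in (0.5, 0.9, 0.99):
    c = half_width_coefficient(p)
    print(f"p = {p}: half-width ~ {c:.2f}/sqrt(n), rule is {1 / c:.1f}x wider")
```

The coefficients come out at about 0.98, 0.59 and 0.20, so the rule is nearly exact at 0.5 and increasingly generous towards the extremes.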