• Posted by Konstantin 10.12.2011

    It seems to be popular among some of my (very scientifically-minded) friends to somewhat provocatively defend the claim that "fundamentally, there is no difference between a scientific worldview and a superstitious or religious one". The typical justification is that the desire of one person to believe in the scientific methodology and only trust the "material world" has the same roots as the desire of another person to believe in the supernatural. Consequently, it would be unfair to claim that one set of beliefs must have priority over the other. In the words of a well-versed narrator, this line of argument can be made very convincing and leave one wondering: if the choice of basic beliefs is indeed just a matter of taste and cultural tradition, why does the "scientific worldview" still seem somewhat more "practical"? Is it just an illusion? The following is an attempt to justify that it is not, and that there are some simple reasons for the objectivist worldview to be superior.

    In order to avoid misunderstanding, so common in philosophical discourses, we must start by defining some very basic terms.

    • First, an obvious yet necessary definition of us. By "us" I mean just myself and you, the reader. My existence requires no proof for me, and I presume the same holds for you. Your existence I can observe personally when we have a chance to meet. Thus, we might agree there is this thing called "us".
    • By a subjective reality I mean everything perceivable by me. This, again, requires no proof for me. I presume that there is an equivalent subjective reality for you. Most importantly, although our two realities may not necessarily be equal, they have a lot of things in common. For example, the text of this post obviously belongs to both. Thus, the set of things shared by our subjective realities shall be referred to as (our) objective reality.
    • By an individual we shall mean some entity, capable of communicating with us. That is, an individual is something capable of perceiving our signals and reacting to them. We do not assume any other properties for an individual - it may be a person, a fairy, or an alien. As long as we can communicate in some way, it fits the definition.
    • By an individual's worldview we shall mean the set of all possible reactions of an individual to all possible stimuli. That is, I declare the individual's worldview to be exactly defined by its observable behaviour. In order for two individuals to have differing worldviews, they must behave (or, to be more precise, communicate) differently in at least one situation.

    Communication can take multiple forms. Let us limit ourselves mainly to verbal communication. Consideration of nonverbal worldviews would complicate the discussion for no good reason (those would be things such as musical and culinary tastes).

    So, suppose we have an individual. We can talk to it, and it can answer something. Its worldview is then just the set of answers to the questions in a given context (-Do fairies exist? -Yes/No/I don't know; -Is this liquid poisonous? -Yes/No/I don't know; etc).

    Obviously, according to such a definition, every person in this world is an individual and has its own unique worldview. The worldviews of some people are perhaps "more scientific" and those of others are perhaps "more superstitious". In the following I am going to show that there is one worldview which is "most informative" for the purposes of our communication with that individual, and this will be precisely the "scientific" one.

    • By a basic stimulus we shall mean a stimulus to which an individual responds in a way that is reproducible to some extent. Any other stimuli (those that elicit a purely random response) are essentially uninformative, and we would have no use for them in communication with the individual. The simplest example of a basic stimulus is one with a constant response. For example, most normal people, when pointed to a glass of milk and asked "Is this milk?", will provide the reproducible answer "Yes". Thus, "Is this milk?" is a basic stimulus.
    • Next, we shall say that a basic stimulus is objective if there is at least some observable property of (our) objective reality such that changing it would (again, reproducibly) influence the reaction to that stimulus. Any non-objective stimuli are of no interest to us, because they bear no relation to observable objective reality, and thus we have no way of interpreting the answer.
      For example, an individual might constantly respond "Boo!" to the query-phrase "Baa?", which means that "Baa?" is a basic stimulus. However, we have no way of associating the answer "Boo!" with anything material, and thus have no way of understanding this response. Hence, from the point of view of communication, the query "Baa?" makes no sense. Similarly, if a person answers the question "Do fairies exist?" with "No", no matter what, we shall call this question non-objective.

    Now that we are done with the preliminary definitions, let me state the main claims.

    • We say that a worldview is ideally scientific, if it has the maximum possible set of basic objective stimuli (and any reactions to them).

    Most probably no human has an ideally scientific worldview, simply because no human has explored all of the quirks of our objective reality. But each individual's worldview can be regarded as more or less scientific depending on the number of its basic objective stimuli. Conversely,

    • We say that a worldview is ideally dogmatic, if it has the maximum possible set of basic non-objective stimuli.

    An ideally dogmatic worldview is, of course, meaningless, as it basically means that the individual has a fixed response to any question. However, we can say that each person's worldview has its own degree of dogmaticity.

    Here lies the answer to the difference between a "scientific" and a "superstitious" (i.e. dogmatic or religious) worldview. The former is the only worldview which makes sense for establishing communication between individuals (irrespective of what "realities" each of them might live in or believe in). This is the reason which sets such a worldview forth as the "practical" one.

    It is important to understand that, in principle, there is nothing in this exposition that excludes the possibility for some people to actually have "their own" part of reality. For example, if person X tells me that he sees little green men all around, it may very well be something truly objective in his part of reality which is, for some reason, not shared with me. Consequently, I will have to regard his statement as non-objective, and as long as I do so, such a statement will be useless for communication and thus non-scientific (for the reality which includes me and person X). I will thus say to him: "I do not refuse to believe that you see those little green men, but as long as I myself cannot observe their existence, discussing them with me is non-scientific, i.e. uninformative".


  • Posted by Konstantin 04.12.2011

    There is one rule of thumb that I find quite useful and happen to use fairly often. It is probably not widely known nor described in textbooks (I stumbled upon it on my own), so I regularly have to explain it. Next time I'll just point to this post.

    The rule is the following: a proportion estimate obtained on a sample of $n$ points should only be trusted up to an error of $\frac{1}{\sqrt{n}}$.

    For example, suppose that you read in the newspaper that "25% of students like statistics". Now, if this result has been obtained from a survey of 64 participants, you should actually interpret the answer as $0.25 \pm \frac{1}{\sqrt{64}}$, that is, $0.25 \pm 0.125$, which means that the actual percentage lies somewhere between 12.5% and 37.5%.

    As another example, in machine learning you often see cases where someone evaluates two classification algorithms on a test set of, say, 400 instances, measures that the first algorithm has an accuracy of 90% and the second an accuracy of, say, 92%, and boldly claims the dominance of the second algorithm. At this point, without going deeply into statistics, it is easy to figure that $1/\sqrt{400}$ is 5%, hence the difference between 90% and 92% is not significant enough to celebrate.
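
    To make the rule concrete, here is a minimal Python sketch (just the rule of thumb itself, applied to the two examples above):

        from math import sqrt

        def rule_of_thumb_interval(p_hat, n):
            """Interval p_hat +/- 1/sqrt(n) given by the rule of thumb."""
            margin = 1 / sqrt(n)
            return p_hat - margin, p_hat + margin

        # Newspaper example: 25% of 64 surveyed students
        print(rule_of_thumb_interval(0.25, 64))    # (0.125, 0.375)

        # Classifier example: 90% accuracy on a test set of 400 instances
        print(rule_of_thumb_interval(0.90, 400))   # (0.85, 0.95) -- 92% falls inside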

    The Explanation

    The derivation of the rule is fairly straightforward. Consider a Bernoulli-distributed random variable with parameter $p$. We then take an i.i.d. sample of size $n$ and use it to compute the estimate $\hat p$:

        \[\hat p = \frac{1}{n}\sum_i X_i\]

    The 95% confidence interval for this estimate, computed using the normal approximation, is then:

        \[\hat p \pm 1.96\sqrt{\frac{p(1-p)}{n}}\]

    What remains is to note that $1.96 \approx 2$ and that $\sqrt{p(1-p)} \leq 0.5$. By substituting these two approximations we immediately get that the interval is at most

        \[\hat p \pm \frac{1}{\sqrt{n}}\]
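
    The derivation can also be checked empirically. The following Python sketch draws Bernoulli samples and counts how often the interval $\hat p \pm \frac{1}{\sqrt{n}}$ covers the true $p$ (the particular values of $p$ and $n$ below are arbitrary):

        import random
        from math import sqrt

        random.seed(0)
        p, n, trials = 0.3, 100, 10_000
        covered = 0
        for _ in range(trials):
            # Estimate p from n Bernoulli(p) draws
            p_hat = sum(random.random() < p for _ in range(n)) / n
            # Does the rule-of-thumb interval contain the true p?
            if abs(p_hat - p) <= 1 / sqrt(n):
                covered += 1
        print(covered / trials)  # at least 0.95, typically noticeably higher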

    Limitations

    It is important to understand the limitations of the rule. In cases where the true proportion is $p = 0.5$ and $n$ is large enough for the normal approximation to make sense ($n = 20$ is already good), the one-over-square-root-of-n rule is very close to a true 95% confidence interval.

    When the true proportion is closer to 0 or 1, however, $\sqrt{p(1-p)}$ is not close to 0.5 anymore, and the rule of thumb results in a conservatively large interval.

    In particular, the true 95% confidence interval for $p = 0.9$ will be nearly two times narrower ($\approx 0.6/\sqrt{n}$), and for $p = 0.99$ the actual interval is about five times narrower ($\approx 0.2/\sqrt{n}$). However, given the simplicity of the rule, the fact that the true $p$ is rarely so close to 1, and the idea that it never hurts to be slightly conservative in statistical estimates, I'd say the one-over-square-root-of-n rule is a practically useful tool in most situations.
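
    To see just how conservative the rule gets, one can compare the factor in front of $1/\sqrt{n}$ in the exact normal-approximation interval with the factor of 1 used by the rule (a small Python sketch using the formula above):

        from math import sqrt

        # Exact half-width is 1.96*sqrt(p*(1-p)/n); the rule always uses 1/sqrt(n).
        for p in (0.5, 0.9, 0.99):
            exact_factor = 1.96 * sqrt(p * (1 - p))  # multiplies 1/sqrt(n)
            print(f"p={p}: exact ~ {exact_factor:.2f}/sqrt(n), "
                  f"rule is {1 / exact_factor:.1f}x wider")
        # p=0.5:  exact ~ 0.98/sqrt(n), rule is 1.0x wider
        # p=0.9:  exact ~ 0.59/sqrt(n), rule is 1.7x wider
        # p=0.99: exact ~ 0.20/sqrt(n), rule is 5.1x wider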

    Use in Machine Learning

    The rule is quite useful for quickly interpreting performance indicators of machine learning models, such as precision, accuracy or recall; however, you should make sure you understand what proportion is actually being computed for each metric. Suppose we are evaluating a machine learning model on a test set of 10000 elements, 400 of which were classified as "positive" by the model. We measure the accuracy of the model as the proportion of correct predictions on the whole set of 10000 elements. Thus, $n$ here is 10000 and we should expect the error of the resulting value to be under 1 percentage point. The precision of the model, however, is measured as the proportion of correct predictions among the 400 positives. Here $n$ is actually 400 and the error will be around 0.05, i.e. 5 percentage points.
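
    In code, the distinction amounts to nothing more than which count you plug into the rule (a sketch using the same example numbers):

        from math import sqrt

        test_size = 10_000            # size of the whole test set
        predicted_positives = 400     # instances the model classified as positive

        accuracy_margin = 1 / sqrt(test_size)             # 0.01, about 1 percentage point
        precision_margin = 1 / sqrt(predicted_positives)  # 0.05, about 5 percentage points
        print(accuracy_margin, precision_margin)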

    The rule can also be used to choose the size of the test set for your model. Despite the popular tradition, using a fixed fraction of your full dataset for testing (e.g. a "75%/25% split") is arbitrary and misguided; it is the absolute size of the test sample that you should care about most. For example, if you are happy with your precision estimates being accurate to within 1%, you only need to make sure your test set includes around 10000 positives. Spending an extra million examples on this estimate will increase its quality somewhat, but you might be better off leaving those examples for model training instead.
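
    Inverting the rule gives a quick way to pick the test set size for a desired error margin (again, a minimal sketch):

        from math import ceil

        def required_sample_size(margin):
            """Smallest n for which the 1/sqrt(n) error is at most `margin`."""
            return ceil(1 / margin ** 2)

        print(required_sample_size(0.01))  # 10000 (e.g. positives, for a 1% precision margin)
        print(required_sample_size(0.05))  # 400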
