• ## Skype “removed” messages

Posted by Konstantin 22.01.2015 1 Comment

Yesterday I happened to attend a discussion about the security and privacy of information stored locally in Skype and Thunderbird profiles. It turns out that if you obtain a person's Skype profile directory, you will be able to log in as that person without needing to know the password. In addition, Dominique made a remark that Skype does not really delete the messages that are marked as "removed" in the chat window. I found that curious and decided to take a closer look.

Indeed, there is a bunch of *.dat files in the chatsync subdirectory of the Skype profile, which preserve all messages along with all their edits and deletions. Unfortunately, the *.dat files are in an undocumented binary format, and the only tool I found for reading them lacks features. However, hacking up a small Python parser according to what is known about the format, along with a minimalistic GUI, is a single evening's exercise, and I happened to be in the mood for some random coding.
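In fact, even without understanding the format, you can spot the message texts (including the edited and "removed" versions) in those files with a crude strings-style scan. Here is a minimal sketch of the idea in Python (this is not the parser from the package below, and the chatsync path is an assumption that depends on where your Skype profile lives):

import glob
import re

# Scan all *.dat files under the (assumed) chatsync directory and print
# every printable ASCII run of at least five characters. Message texts
# show up among those runs, including versions later edited or "removed".
for fname in glob.glob('chatsync/*/*.dat'):
    data = open(fname, 'rb').read()
    for match in re.finditer(rb'[\x20-\x7e]{5,}', data):
        print(fname, match.group().decode('ascii'))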

Skype Chatsync Viewer

Now, if you want to check out what you or your conversation partner wrote in a message before it was edited or deleted, this package will help. If you are not keen on installing Python packages, here is a standalone Windows executable.

• ## Float Precision

Posted by Konstantin 13.01.2015 No Comments

I haven't updated this blog for quite some time, which is a shame. Before resuming, I want to reproduce here a couple of my old posts from other places. This particular piece stems from a post on our research group's webpage from more than 8 years ago, but it concerns an issue that never stops popping up in practice.

Precision of floating point numbers is a very subtle issue. It creeps up so rarely that many people (me included) would get it out of their heads completely before stumbling upon it in some unexpected place again.

Indeed, most of the time it is not a problem at all that floating-point computations are not perfectly precise, and no one cares about the small additive noise they produce, as long as you remember to avoid exact comparisons between floats. Sometimes, however, the noise can severely spoil your day by violating core assumptions, such as "distance is always greater than zero" or "the cosine of an angle never exceeds 1".

The following is, I think, a marvelous example, discovered by Alex while debugging an obscure problem in a Python program. The choice of language is absolutely irrelevant, however, so I took the liberty of presenting it here in Javascript (because this lets you reproduce it in your browser's console, if you wish). For Python fans, there is an interactive version available here as well.

A cosine distance metric is a measure of dissimilarity of two vectors, often used in information retrieval and clustering, that is defined as follows:

$\mathrm{cdist}(\mathbf{x},\mathbf{y}) = 1 - \frac{\mathbf{x}^T\mathbf{y}}{|\mathbf{x}| \; |\mathbf{y}|}$

A straightforward way to put this definition into code is, for example, the following:

function length(x) {
    return Math.sqrt(x[0]*x[0] + x[1]*x[1]);
}

function cosine_similarity(x, y) {
    return (x[0]*y[0] + x[1]*y[1])/length(x)/length(y);
}

function cosine_distance(x, y) {
    return 1 - cosine_similarity(x, y);
}

Now, mathematically, the cosine distance is a valid distance function and is thus always non-negative. Unfortunately, the floating-point implementation presented above is not. Check this out:

> Math.sign(cosine_distance([6.0, 6.0], [9.0, 9.0]))
< -1
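The same effect can be reproduced in Python, as both languages use IEEE 754 double-precision arithmetic (this is a direct transcription of the code above, not the interactive version linked earlier):

import math

def length(x):
    return math.sqrt(x[0]*x[0] + x[1]*x[1])

def cosine_distance(x, y):
    return 1 - (x[0]*y[0] + x[1]*y[1])/length(x)/length(y)

# Prints a tiny negative number rather than the mathematically correct 0:
print(cosine_distance([6.0, 6.0], [9.0, 9.0]))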

Please beware of float comparisons. In particular, think twice next time you use the sign() function.


• ## Structure of a Python Project

Posted by Konstantin 19.03.2014 No Comments

Despite the popularity of Python, I find that many of its best practices are not extremely well-documented. In particular, whenever it comes to starting a new Python project, quite a lot of people would follow whatever the very first Python tutorial taught them (or whatever their IDE creates), and start churning out .py files, perhaps organizing them into subdirectories along the way. This is not a good idea. Eventually they would stumble upon problems like "how do I distribute my code", "how do I manage dependencies", "where do I put documentation", "how (and when) should I start writing tests for my code", etc. Dealing with those issues "later" is always much more annoying than starting with a proper project layout in the first place.

Although there is no unique standard for a project layout, there are some established best practices. In particular (and this seems to be not very widely known), one of the easiest ways to create a new Python package (i.e., develop anything in Python that will have to be distributed later), is to make use of the paster or cookiecutter tools. Simply running

$ paster create <package_name>

or

$ cookiecutter https://github.com/audreyr/cookiecutter-pypackage.git
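Either way, the freshly initialized package will look roughly like the following (a sketch only; the exact set of files depends on the template and its version):

<package_name>/
    setup.py           # package metadata, dependencies, entry points
    README.rst
    <package_name>/    # the actual importable package
        __init__.py
    tests/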

will ask you some questions and initialize a well-formed setuptools-based package layout for you. A slightly more involved yet still minimalistic starter code is provided by additional paster/cookiecutter templates, such as modern-package-template:

$ pip install modern-package-template
$ paster create -t modern_package <package_name>

Naturally, every developer will tend to customize the setup by adding the necessary tools. Moreover, the preferred setup evolves with time, as various tools and services come in and out of existence. Ten years ago, buildout and git were not yet around. Five years ago, there was no tox, and nose was better than py.test. Services like Travis-CI and Github are even younger yet.

Although I tend to experiment a lot with my setup, over the past couple of years I seem to have converged to a fairly stable Python environment, which I decided to share as a reusable template and would recommend to anyone.

Thus, next time you plan to start a new Python project, try beginning with:

$ pip install python_boilerplate_template
$ paster create -t python_boilerplate <project_name>

or, alternatively,

$ pip install cookiecutter
$ cookiecutter https://github.com/konstantint/cookiecutter-python-boilerplate

More information here (for paster) or here (for cookiecutter). Contributions and constructive criticism welcome via Github.

• ## Strong Typing

Posted by Konstantin 12.03.2014 No Comments

Whenever you write a program, you want it to behave correctly and do what you want it to do. Thus, programming always goes together with the mental act of proving to yourself (and sometimes to other people as well) that the code you write is correct. Most often this "proof" is implicit, dissolved in the way you write your code and comment it. In fact, in my personal opinion, "good code" is exactly the kind of code whose correctness a human reviewer can verify without too much effort.

It is natural to use computers to help us verify correctness. Everyone who has ever programmed in a strictly-typed programming language, such as Java or Haskell, is familiar with the need to specify types of variables and functions and follow strict rules of type-safety. But of course, ensuring type-safety is just the compiler's way to help you ensure some basic claims about the program, such as "this variable will always contain an integer" or "this function will always be invoked with exactly three parameters".

This is very convenient, yet can be somewhat limiting or annoying at times. After all, type-safety requires you to write code that "can be type-checked". Although very often this is expected of "good code" anyway, there are situations where you would like some more flexibility. For this reason some languages impose no type-safety rules at all (e.g. Python or Javascript), and some languages let you disable type-checking for parts of code.

Rather than disabling the type checker, another principled way to allow more flexibility is to make the type checker smarter. This is the promise of dependent types. In principle, a language which supports dependent types would let you make much more detailed statements about your program and have the program automatically checked for correctness with respect to those statements. Rather than being limited to primitive claims like "this variable is an integer", dependent types enable you to assert things like "this is a sorted list" or "this is an odd integer", and so on, up to a nearly arbitrary level of detail, in the form of a type annotation. At least that much I found out during a course at the recent winter school.

The course was based on the Agda programming language, and the first thing I decided to try implementing in Agda is a well-typed version of the following simple function:

f t = if t then true else 0

It might look trivial, yet most traditional strictly typed languages would not let you write this. After all, a function must have a return type, and it has to be either a Boolean or an Integer, but not both. In this case, however, we expect our function to have a parameter-dependent type:

f : (t : Bool) → if t then Bool else Integer
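As an aside, here is a rough sketch of how such a function might be written in Lean 4, another dependently typed language (this is only an illustration for comparison, not the Agda development discussed below; Lean's built-in cond performs the type-level branching):

-- The return type is computed from the value of the argument t:
-- cond true Bool Int reduces to Bool, cond false Bool Int reduces to Int.
def f : (t : Bool) → cond t Bool Int
  | true  => true
  | false => 0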

Given that Agda is designed to support dependent types, how complicated could it be to implement such a simple function? It turns out that it takes a beginner more than an hour of thinking and a couple of consultations with specialists in the field. The resulting code includes at least three different definitions of "if-then-else" and, I must admit, some aspects of it are still not completely clear to me.

IF-THEN-ELSE in Agda, including all the boilerplate code

This is the longest code I've ever had to write to specify a simple if-then-else statement. The point of this blog post is to share the amusement and to suggest you check out Agda if you are in the mood for some puzzle-solving.

As for dependent types, I guess those are not becoming mainstream any time soon.

• ## Bitcoin-based Timestamping

Posted by Konstantin 20.01.2014 No Comments

Bitcoin is a cryptographic currency that has gained a lot of hype over the last year. From a technical perspective, it is simply a distributed timestamping scheme, fully dedicated to establishing the order of monetary transactions by creating a long block chain.

Adding a new block to the block chain requires extremely expensive distributed computations. Thus, in terms of the amount of energy invested by users worldwide into its creation, the block chain is, at the moment, probably the most expensive computer-generated file in human history. A monument to raw "computation for the sake of computation". The Bitcoin network by now includes hundreds of thousands of users, most of whom keep a full copy of the block chain and contribute to its further growth.

Timestamping hash, published in a paper

All that means that including any information in the block chain can act as a solid timestamp, proving the existence of this information at a particular point in time. It is nearly impossible to fake or revoke. Even if the Bitcoin network were to cease working, the block chain would probably be kept around, at least as a curious artifact (as well as an object of interest for data miners). The idea is equivalent to the popular practice of timestamping information by publishing it in a widely distributed newspaper. However, publishing in a popular newspaper may be costly, while getting transactions into the block chain is nearly free and accessible to anyone.

Consequently, it seems obvious that sooner or later the Bitcoin block chain must begin to be used for timestamping things other than transactions. Because trusted timestamping is a big deal. Everyone in Estonia knows that.

Unfortunately, although the idea has been mentioned before, there do not seem to be any convenient services developed for it, apart from BTProof, which is somewhat too simplistic given the potential importance of the task at hand. In an attempt to perhaps inspire someone to consider implementing a more serious service of this kind, let me give a brief overview of the ways to get your data into the block chain.

### Smuggling your data into the block chain

If only Bitcoin transactions were allowed to have textual descriptions assigned to them, the task would be trivial: any piece of information you want to timestamp could be simply mentioned in the description of a transaction. However, this functionality is not part of the Bitcoin protocol, so we have to use tricks. At least three different techniques are possible here.

1. Specifying your data as a destination address.

Each Bitcoin transaction includes a "destination address", which is a 34-character string in Base58 encoding. This address may be specified arbitrarily. Thus, by transferring any amount to an address which is itself a hash of the information you need timestamped, you will have the fact of the information's existence mentioned in the block chain for future generations to behold. This is the idea behind BTProof. There are several problems with this method. Most importantly, anything you transfer to a nonexistent address will be lost forever, with no one able to claim it. This makes the process non-free, because you cannot have transactions of zero amount. Moreover, very small transactions are unfavoured by the Bitcoin network and take a long time to get verified. Finally, leaving "unclaimed" transactions forever hanging in the block chain is somewhat indecent in the first place, isn't it?

The abovementioned drawbacks may be addressed using multi-signature transactions. Those are a special type of transaction which allows the funds to be claimed by any one of several addresses. In this case one address can be used to encode the hash, and the other to reclaim the funds spent in the transaction. This concept has been suggested as a way to carry arbitrary data on top of Bitcoin in the MasterCoin project.
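For illustration, here is a minimal Python sketch of the hashing scheme behind this idea (my own sketch, not BTProof's actual code; note that hashlib's ripemd160 support depends on the local OpenSSL build):

import hashlib

# Bitcoin's Base58 alphabet (it omits the easily confused 0, O, I and l).
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def base58check(payload):
    # Append a 4-byte double-SHA256 checksum, then encode in base 58.
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    full = payload + checksum
    n = int.from_bytes(full, 'big')
    encoded = ''
    while n > 0:
        n, r = divmod(n, 58)
        encoded = ALPHABET[r] + encoded
    # Each leading zero byte is conventionally encoded as '1'.
    pad = len(full) - len(full.lstrip(b'\x00'))
    return '1' * pad + encoded

def data_to_address(data):
    # Hash arbitrary data down to 20 bytes and dress it up as a mainnet
    # address (version byte 0x00). Coins sent there are burned: nobody
    # holds the corresponding private key.
    h = hashlib.new('ripemd160', hashlib.sha256(data).digest()).digest()
    return base58check(b'\x00' + h)

print(data_to_address(b'Hello, future generations!'))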

2. Specifying your data as a destination private key.

Rather than converting the data you need timestamped into a (non-existent) address, you can turn it into a private key. You can then perform two transactions. The first one transfers funds to the address corresponding to this private key, and the next one uses the private key to withdraw the funds back to you. As there is a fixed mapping from your data to the private key to the address that the funds went through, you have just included a trace of your data in the block chain. This method is mentioned here. The drawbacks are the need for two transactions and the overall complexity of the scheme.
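A sketch of this path in Python might look as follows (it assumes the third-party ecdsa package; turning the public key into an address would then proceed as in the previous sketch):

import hashlib
from ecdsa import SigningKey, SECP256k1  # pip install ecdsa

data = b'Hello, future generations!'

# Any 32 bytes (below the curve order) are valid private key material,
# so hashing the data yields a key deterministically derived from it.
secret = hashlib.sha256(data).digest()
private_key = SigningKey.from_string(secret, curve=SECP256k1)
public_key = private_key.get_verifying_key().to_string()
print(public_key.hex())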

3. Specifying your data in the script.

Finally, the last two places in a Bitcoin transaction that allow custom data are the two "script" fields. Namely, the act of depositing funds in a Bitcoin transaction is not as simple as providing the target address. Instead, the deposit is described by a script that, when executed, is supposed to check the right of the receiver to obtain the funds. Similarly, the act of withdrawing funds is expressed by a script that proves the rights of the owner.

For example, a typical "deposition script" looks as follows:

OP_DUP
OP_HASH160
OP_PUSHDATAx <target_address>
OP_EQUALVERIFY
OP_CHECKSIG

This script means that in order to withdraw the funds, the receiver must push onto the stack a signature of the transaction, followed by his public key. The script then starts executing by first duplicating (OP_DUP) the top value on the stack, the public key. It then applies a hash function (OP_HASH160) to the top value on the stack (this converts the public key to an address). Then another value, <target_address>, is pushed onto the stack (OP_PUSHDATAx). Next, the two top values are popped and checked for equality (OP_EQUALVERIFY). This verifies that the receiver's address matches <target_address>. Finally, the OP_CHECKSIG command pops another two values from the stack (those are the signature and the public key now, remember) and verifies the correctness of the signature.

The beauty of the system is that it lets you create various rules for claiming funds apart from simply owning a private key to an address. For example, it is possible to create transactions which require multiple parties to collude to withdraw them. Or you may require the receiver of the funds to solve a puzzle. Or you may even put the funds up for anyone to take freely, etc.

What is important for our purposes, however, is that the scripting language is rather flexible. In particular, it lets you add useless commands, such as "push this data onto the stack, then drop the top value from the stack, then continue as normal":

OP_PUSHDATAx <any_data>
OP_DROP
OP_DUP
OP_HASH160
OP_PUSHDATAx <target_address>
OP_EQUALVERIFY
OP_CHECKSIG

This logic could be included in either the "depositing" or the "receiving" script, letting you essentially attach arbitrary "notes" to transactions and thus timestamp data in the most reasonable way. It lets you timestamp using a single transaction, which simply transfers any amount from an address back to itself.
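To make the mechanics concrete, here is a toy Python model of the stack evaluation described above (grossly simplified: real Bitcoin Script has many more opcodes, and the signature check is faked here):

def run(script, stack):
    for op in script:
        if op == 'OP_DUP':
            stack.append(stack[-1])
        elif op == 'OP_HASH160':
            stack.append(('hash160', stack.pop()))
        elif op == 'OP_EQUALVERIFY':
            assert stack.pop() == stack.pop(), 'address mismatch'
        elif op == 'OP_CHECKSIG':
            stack.pop(); stack.pop()  # pop the public key and the signature...
            stack.append(True)        # ...and pretend the signature verifies
        elif op == 'OP_DROP':
            stack.pop()
        else:
            stack.append(op)          # any other token is data pushed onto the stack
    return stack

address = ('hash160', 'pubkey')       # stands for <target_address>
standard = ['OP_DUP', 'OP_HASH160', address, 'OP_EQUALVERIFY', 'OP_CHECKSIG']

# The extra push/drop pair does not affect the outcome of the script:
print(run(standard, ['sig', 'pubkey']))                           # [True]
print(run(['my note', 'OP_DROP'] + standard, ['sig', 'pubkey']))  # [True]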

Unfortunately, it seems that the freedom of scripting has been severely limited in recent versions of the Bitcoin software. Namely, transactions with nonstandard scripts are simply refused inclusion in a block (at least, none of my attempts to try this out succeeded). Even the fate of the multi-signature transactions mentioned in point 1 (which are just a particular kind of script) is not completely clear. In any case, the Bitcoin specification will most probably evolve to eventually allow storing dedicated data packets in the block chain without the need to resort to hacks. And if not Bitcoin, perhaps such functionality will become part of one of the competing cryptocurrencies.

It seems unreasonable to run a huge distributed timestamping algorithm, and not let people use it for general-purpose timestamping, doesn't it?

Update: Given the recent problems related to the transaction malleability aspect of the protocol, it is easy to predict that the freedom of scripting will probably be limited even further in the future. However, support must eventually be added for storing a custom nonce in the signed transaction (as this seems to be the only reasonable way to make transactions uniquely identifiable despite the malleability of their hash). That nonce would be a perfect candidate for general-purpose timestamping.

• ## Atmospheric Semantics of Scientific Presentations

Posted by Konstantin 01.04.2013 1 Comment

CrapCon is a fun evening session, which traditionally takes place at the annual Estonian Winter School on Computer Science (EWSCS), where participants of EWSCS are welcome to make short random nonsensical talks, prepared earlier on the same day during the lectures.

The following is my presentation from this year's CrapCon. The first part of it, as is customary for most CrapCon talks, makes fun of the topics being taught during the school, hence it might not make much sense outside the context of the event. The second part is fairly general (and totally serious!), though.

If you find that amusing, certainly check out the talk by Taivo Lints, which is, I think, among the best examples of computer-science-related abstract humor.


• ## The Curse of Genomic Coordinates

Posted by Konstantin 25.02.2013 No Comments

Most of bioinformatics revolves, in one way or another, around the genome. Even in the largely "non-genomic" areas, such as systems biology, proteomics, or metabolomics, the key players are still genes, proteins, and their regulatory regions, which can be associated with particular points on the genome. Consequently, no matter what kind of data you work with, if you do bioinformatics, you will sooner or later have to deal with genomic coordinates.

To interpret genomic coordinates you need to know the reference genome version and coordinate conventions used. Does the data file mention those?

Surprisingly, despite being of such central importance to bioinformatics, the whole genomic coordinate business seems to be in a rather unfriendly state nowadays. Firstly, there are several ways to refer to genomic positions (e.g. 0-based vs 1-based indexing), and for every possible combination of conventions there is at least one file format that uses it. Then, of course, you have to deal with several versions of the reference genome, and, to make your life harder yet, most data files will not tell you which genome version should be used to interpret the coordinates stored there. Finally, if you decide that you need to unify the coordinates among your different data files by converting them to the same reference genome version, you will find out that your only tools here are a couple of heavyweight web apps and the command-line UCSC liftOver utility. Should you look for R or Python libraries to script your task, you will discover that those do nothing smarter than forward all the conversion tasks to that same liftOver.

I found all of this to be extremely wrong and recently hacked up a tiny Python tool, which will happily convert your coordinates without the need to invoke an external liftOver process. It does not fully replace liftOver (yet?), as it does not convert regions, a task just a bit more tricky than lifting over single points. However, it lets you do single-point conversion in the simplest way possible:

from pyliftover import LiftOver
lo = LiftOver('hg17', 'hg18')
lo.convert_coordinate('chr1', 1000000, '-') # 0-based indexing
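# convert_coordinate returns a list of (chromosome, position, strand, score)
# tuples; an empty list means the position has no equivalent in the target
# genome, and None means the chromosome name is unknown.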

As usual, install via: easy_install pyliftover (or pip install pyliftover)

For more information consult the PyPI page.

• ## The Substring Paradox

Posted by Konstantin 25.02.2013 No Comments

If anyone tells you they understand probability theory, do not believe them. Such a person simply does not know enough of it to admit that probability theory is riddled with paradoxes, where common sense must step aside and wait in silence, or your brain will hurt. Substring statistics is probably among the lesser-known, yet magically unintuitive, examples.

Consider a sequence of random coin flips. Each coin flip is either "heads" or "tails", hence the sequence might be written down as a sequence of H's and T's: HTHTHHTT...

It is easy to show that the probability of the sequence beginning with, say, HHH is equal to P(HHH) = 1/8, as is the case for any other three-letter combination: P(HHT) = P(THH) = P(THT) = 1/8, etc. Moreover, by symmetry, the probability of seeing a particular three-letter combination at any fixed position in the sequence is always 1/8. All three-letter substrings seem to be equivalent here.

But let us now play a game where we throw a coin until we see a particular three-letter combination. To be more specific, let us wait until either HHT or HHH comes up. Suppose I win in the first case and you win in the second one. Obviously, the game first proceeds until two heads are flipped. Then, whichever coin flip comes up next determines the winner. Sounds pretty fair, doesn't it? Well, it turns out that, surprisingly, if you carefully count the expected number of coin flips needed to obtain HHT, it happens to be 8, whereas for HHH it is 14! Ha! Does that mean I have an advantage? Surprisingly again, no. The probability of HHT occurring before HHH in any given sequence is still precisely 0.5, and, as we reasoned initially, the game is still fair.

We can, however, construct even more curious situations with four-letter combinations. Suppose I bet on HTHT and you bet on THTT. The expected number of coin flips needed to obtain my combination can be computed to be 20. The expected number of flips needed to get your combination is smaller: 18 flips. However, it is still more probable (64%) that my combination will occur before yours!
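Both effects are easy to verify empirically. Here is a quick Monte Carlo sketch in Python (my own sanity check, not the code linked at the end of this post):

import random

def flips_until(pattern):
    # Number of coin flips until `pattern` first appears.
    seq = ''
    while not seq.endswith(pattern):
        seq += random.choice('HT')
    return len(seq)

def race(mine, yours):
    # Which of the two patterns appears first in a random sequence?
    seq = ''
    while True:
        seq += random.choice('HT')
        if seq.endswith(mine):
            return 'mine'
        if seq.endswith(yours):
            return 'yours'

n = 100000
for p in ('HHT', 'HHH', 'HTHT', 'THTT'):
    print(p, sum(flips_until(p) for _ in range(n)) / n)
# Prints roughly 8, 14, 20 and 18, respectively.

print(sum(race('HTHT', 'THTT') == 'mine' for _ in range(n)) / n)
# Roughly 0.64: HTHT tends to come up first despite taking longer on average.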

If this is not amusing enough, suppose that four of us are playing such a game. Player A bets on the string THH, Player B bets on HHT, player C on HTT and player D on TTH. It turns out that A's combination will occur earlier than B's with probability 75%. B's combination, however, wins over C's with probability 66.7%. C's combination, though, wins over D's with probability 75%. And, to close the loop, D wins over A with probability 66.7%! This is just like the nontransitive dice.

Hopefully, you are amazed enough at this point to require an explanation for how this all might happen. Let me leave it to you as a small puzzle:

• Explain in simple terms how it can happen that the expected times to the first occurrence of otherwise equiprobable substrings differ.
• Explain in simple terms how it can be that one substring has a higher than 50% chance of occurring earlier than some other substring.
• Finally, explain why the two effects above are not strictly related to each other.

PS: The theory used to compute actual probabilities and expected times to occurrence of a substring is elegant yet somewhat involved. For the practically-minded, here is the code to check the calculations.

• ## On Gödel’s Theorems and the Universe

Posted by Konstantin 27.01.2013 6 Comments

Most results that are published in mathematics and the natural sciences are highly technical and obscure to anyone outside the particular area of research. Consequently, most scientific publications and discoveries do not get any media attention at all. This is a problem for the public (which would benefit from knowing a bit more about our world), for science in general (which would benefit from wider dissemination), and for the scientists themselves (who would certainly appreciate it if their work reached more than a couple of other people). The issue is usually addressed by researchers trying to popularize their findings, providing simple interpretations which can be grasped by the layman without the need to understand the foundations. Unfortunately, if the researcher was lucky enough to grab media attention, chances are high that the process of "popularization" will get completely out of hand, and the message received by the public will have nothing to do with the original finding.

There are some theorems in pure mathematics which happen to suffer especially seriously from this phenomenon. My two favourite examples are Heisenberg's uncertainty principle and Gödel's incompleteness theorems. Every once in a while I stumble upon pieces written by people who have zero knowledge of basic physics (not even mentioning quantum mechanics or Fourier theory), but who claim to have clearly understood the meaning and the deep implications of the uncertainty principle for our daily lives. Even more popular are Gödel's theorems. "Science proves that some things in this world can never be understood or predicted by us, mere humans!!!" is a typical take on Gödel by the press. Oh, and it is loved by religion enthusiasts!

This post is my desperate attempt to bring some sanity back into this world by providing a (yet another, but maybe slightly different) popular explanation for Gödel's ideas.

### Gödel's theorems explained (with enough context)

Kurt Gödel

In our daily lives we primarily use symbols to communicate and describe the world. Those may be formulas in mathematics or simply sentences in the English language. Symbol-based systems can be used in different ways. From the perspective of arts and humanities, one is welcome to construct arbitrary sequences of words and sentences, as long as the result is "beautiful", "inspiring" or "seems smart and convincing". From a more technically-minded point of view, however, one is only welcome to operate with "true and logical" statements. However, it is not at all obvious which statements should be universally considered "true and logical". There are two ways the problem can be resolved.

Firstly, we can determine the truth by consulting "reality". For example, the phrase "All people are mortal, hence Socrates is mortal" is true because, indeed (avoiding a lecture's worth of formalities), in all possible universes where there are people and they are mortal, Socrates will always also be mortal.

Secondly, we can simply postulate a set of statements, which we shall universally regard as "the true ones". To distinguish those postulated truths from the "actual truths", mentioned in the paragraph above, let us call them "logically correct statements". That is, we can hypothetically write down an (infinite) list of all the "logically correct" sentences in some hypothetical "Book of All Logically Correct Statements". From that time on, every claim not in the book will be deemed "illogical" and hence "false" (the question of whether it is false in reality would, of course, depend on the goodness and relevance of the book we created).

This is really convenient, because now in order to check the correctness of the claim about Socrates we do not have to actually visit all the possible universes. It suffices to look it up in our book! If the book is any good, it will provide good answers, so the concept of "logical correctness" may be close or even equivalent to "actual truth". And in principle, there must exist "The Actual Book of Truth", which would list precisely the "actually true" statements.

Unfortunately, infinite books are somewhat inconvenient to print and carry around. Luckily, we can try to represent an infinite book in a finite way using algorithms. We have all seen how a short algorithm is capable of producing an infinite amount of nonsense, haven't we? Thus, instead of looking for an (infinitely-sized) "book of truth", we should look for a (finitely-sized) "algorithm of truth", which will be either capable of generating the whole book, or checking for any given statement, whether it belongs to the book.

Frege's propositional calculus

For clarity and historical reasons, the algorithms for describing "books of truths" are usually developed and written down using a particular "programming language", which consists of "axioms" and "inference rules". Here is one of the simplest examples of what such a "program" might look like.

This is where a natural question arises: even if we assume that we have access to "the actual book of truth", will it really be possible to compress it down to a finite algorithm? Isn't it too much to ask? It turns out it is. There is really no way to create a compact representation for any but the most trivial "books of truths".

Besides the "asking too much" intuition, there are at least two other simple reasons why this is impossible. Firstly, most readers here know that there exist problems that are in principle uncomputable. Those usually concern statements whose correctness cannot be checked in finite time by a finite algorithm, with the most famous example being the halting problem, as well as everything related to uncomputable numbers.

Naturally, if we cannot even use a finite algorithm to list "all truly halting programs", our hopes of getting a finite description for any more complicated "book of truths" are gone. But there is another funny reason. Once we have decided to define the book of all true statements and devise an algorithm for generating it, we must not forget to include in the book some statements about the book and the algorithm themselves. In particular, we must consider the following phrase: "This sentence will not be included in the book generated by our algorithm."

If our algorithm does include this sentence in the generated book, the claim turns out to be false. As a result, our precious book would have false statements in it: it would be inconsistent. On the other hand, if our algorithm produces a book which does not include the above sentence, the sentence turns out to be true and the book is incomplete. It is just the liar's paradox.

And so it is — we have to put up with the situation, where it is impossible to produce a finite representation for all true statements without stumbling into logical paradoxes. Or, alternatively, that there are mathematical statements, which cannot be checked or proven within a finite time.

This describes what is known as Gödel's first incompleteness theorem (namely, the claim that in most cases our algorithm is only capable of producing either an incomplete or an inconsistent "book"). A curious corollary may be derived from it, known as Gödel's second incompleteness theorem. The theorem says that if our "book-generating algorithm" happens to include in its book the exact phrase "This book is consistent", that claim must necessarily be false, and the algorithm must actually be inconsistent. This is often interpreted as saying that no "honest" theory is capable of claiming its own correctness (there are caveats, though).

Gödel's theorems have several interesting implications for the foundations of mathematics. Nonetheless, they are still theorems about the fundamental impossibility of using a finite algorithm to generate a book of logically correct statements without stumbling into logical paradoxes and infinite loops. Nothing more and nothing less. They have nothing to do with the experimental sciences: note how we had to cut away all connections to reality in the very first paragraphs of the explanation. And thus, Gödel's theorems are not related to the causes of "human inability to grasp the universe", "failure or success of startup business plans", or anything else of that kind.

• ## Frequentists vs Bayesians (xkcd strip 9.11.2012)

Posted by Konstantin 12.11.2012 No Comments

This relates nicely to several previous posts here.

Frequentists vs Bayesians