• Posted by Konstantin 01.01.2009 2 Comments
    PhD comic: New Year's Resolutions
    1. Submit at least 2 papers.
    2. Attend at least 2 significant conferences (ICML and ISMB, hopefully).
    3. Make at least 2 important acquaintances.
    4. Participate as a reviewer at least twice.
    5. Write at least 2 pieces of software (useful to at least 2 people each).
    6. Read at least 2 non-technical books and at least 2 technical books.
    7. Complete at least 2 of these multiple side-projects that eagerly wait for completion.
    8. Supervise at least 22 students to successful defence.
    9. Keep this blog running (26 blog entries or so).
    10. Keep in touch with friends and relatives, have a fair dose of fun, sports, music and all the rest that belongs here (you go windsurfing, you call me, OK?).

    Did I miss something?


  • Posted by Konstantin 25.12.2008 No Comments

    A long time ago, information was stored and transmitted by people who passed it on in the form of poems, from mouth to mouth and generation to generation, until, at some moment, writing was invented. Some poems were lucky enough to be carefully written down on scrolls of papyrus or parchment, yet a considerable number of them were left unwritten and thus lost. Because in the age of writing, an unwritten poem is a non-existent poem. Later on came the printing press and brought a similar revolution: some books from the past were diligently reprinted in thousands of copies and thus preserved for the future. The remaining ones were effectively lost, because in the age of the printing press, an unpublished book is a nonexistent book. And then came the Internet. Once again, although a lot of past knowledge has migrated here, a large amount hasn't, which means that it has been lost for most practical purposes. Because in the age of the Internet, if it is not on the Internet, it does not exist. The tendency is especially notable in science, because science is essentially about accumulating knowledge.

    The effect of such regular "cleanups" (and I am sure these will continue for as long as humankind exists) is twofold. On the one hand, the existing knowledge is reviewed and only the worthy pieces get a chance to be converted into the new media format. As a result, a lot of useless crap is thrown away in an act of natural selection. On the other hand, a considerable amount of valuable information gets lost too, simply because it seemed useless at that specific moment. Of course, it will be reinvented sooner or later anyway, but the fact that it was right here and we just lost it seems disturbing, doesn't it?

    I'm still browsing through that old textbook from the previous post and enjoying the somewhat unfamiliar way the material is presented. Bayesian learning, boolean logic and context-free grammars are collected together and related to decision theory. If I did not know the publication date of the book, I could easily mistake this "old" way of presenting the topic for something new. Moreover, I guess that, with the addition of a minor twist, some ideas from the book could probably be republished in a low-impact journal and thus recognized as "novel". It would be close to impossible to detect the copying, because a pre-Internet-era non-English text simply does not exist.


  • Posted by Konstantin 19.12.2008 3 Comments

    The other day I accidentally stumbled upon an old textbook on pattern analysis written in Russian (the second edition of a book originally published in 1977, which is more-or-less the time of the classics). A brief review of its contents was enormously enlightening.

    It was both fun and sad to see how small and incremental the progress in pattern analysis has been over the last 30 years. If I hadn't been told that the book was first published in 1977, I wouldn't have been able to tell it apart from any contemporary textbook. I'm not complaining that the general approaches and techniques haven't changed much; these shouldn't have. What disturbs me is that the vision of the future 30 years ago was not significantly different from what we have today.

    Pattern recognition systems nowadays are getting more and more widespread and it is difficult to name a scientific field or an area of industry where these are not used or won't be used in the near future...

    Further on, the text briefly describes the application areas of pattern analysis, ranging from medicine to agriculture to "intellectual fifth-generation computing machines" and robots that were supposed to be here somewhere around the nineties already. And although machines did get somewhat more intelligent, we have clearly fallen short of our past expectations. Our current vision of the future is not significantly different from the one we had 30 years ago. It has probably become somewhat more modest, in fact.

    Interesting, is this situation specific to pattern analysis, or is it like that in most areas of computer science?


  • Posted by Konstantin 13.12.2008 No Comments

    It is somewhat sad to see that the Scalable Vector Graphics (SVG) format, despite its considerable age and maturity, has not gained much popularity on the internet, where Adobe Flash is used all over instead. Here are some points you should know about it, so that maybe you will consider taking a couple of hours to get acquainted with it one day.

    1. SVG is an open vector graphics standard (a W3C recommendation).
    2. SVG supports most of what you'd expect from a 2D graphics language: cubic and quadratic Bézier curves, gradients, nested matrix transformations, reusable symbols, etc.
    3. SVG is XML-based and rather straightforward. If you need a picture with a line and two circles, you write a <line> tag and two <circle> tags:
      <svg xmlns="http://www.w3.org/2000/svg">
          <line x1="0" y1="0" x2="100" y2="100" 
                stroke-width="2" stroke="black"/>
          <circle cx="0" cy="0" r="50"/>
          <circle cx="100" cy="100" r="20" 
                  fill="red" stroke="black"/>
      </svg>
    4. Most vector graphics editors can write SVG. Inkscape, for example, is a rather usable piece of open-source software.
    5. SVG supports scripting with JavaScript. Basically, if you know HTML and JavaScript, you are ready to write SVG by hand, because SVG is also just XML + JavaScript. This provides considerable freedom of expression.
    6. SVG can be conveniently embedded into HTML webpages and is supported out-of-the-box by most modern browsers.

    My personal interest in SVG stems from the observation that it seems very well suited for creating interactive data visualizations (charts, plots, graphs) right in the browser. And although the existing codebase devoted to these tasks is not exactly enormous, I'm sure it will grow and gain wider adoption. Don't miss it!


  • Posted by Konstantin 07.12.2008 7 Comments

    Logic versus Statistics

    Consider the two algorithms presented below.

    Algorithm 1:

       If, for a given brick B,
          B.width(cm) * B.height(cm) * B.length(cm) > 1000
       Then the brick is heavy

    Algorithm 2:

       If, for a given male person P,
          P.age(years) + P.weight(kg) * 4 - P.height(cm) * 2 > 100
       Then the person might have health problems

    Note that the two algorithms are quite similar, at least from the point of view of the machine executing them: in both cases a decision is produced by performing some simple mathematical operations on a given object. The algorithms are also similar in their behaviour: both work well on average, but can make mistakes from time to time when given an unusual person or a rare hollow brick. However, there is one crucial difference between them from the point of view of a human: it is much easier to explain how the algorithm "works" in the first case than in the second one. And this is what, in general, distinguishes traditional "logical" algorithms from machine learning-based approaches.

    Of course, explanation is a subjective notion: something that looks like a reasonable explanation to one person might seem incomprehensible or insufficient to another. In general, however, any proper explanation is always a logical reduction of a complex statement to a set of "axioms". An "axiom" here means any "obvious" fact that requires no further explanation. Depending on the subjective simplicity of the axioms and the obviousness of the logical steps, the explanation can be judged as being good or bad, easy or difficult, true or false.

    Here is, for example, an explanation of Algorithm 1 that would hopefully satisfy most readers:

    • The volume of a rectangular object can be computed as its width*height*length. (axiom, i.e. no further explanation needed)
    • A brick is a rectangular object. (axiom)
    • Thus, the volume of a brick can be computed as its width*height*length. (logical step)
    • The mass of a brick is its volume times the density. (axiom)
    • We consider the density of a brick to be at least 1 g/cm³, and we consider a brick heavy if it weighs at least 1 kg. (axiom)
    • Thus, the mass of a brick (in grams) is at least width*height*length, so whenever width*height*length > 1000, the brick weighs more than 1000 g and is therefore heavy. (logical step, end of explanation)

    If you try to deduce a similar explanation for Algorithm 2, you will probably run into problems: there are no nice and easy "axioms" to start with, unless, at least, you are really deep into modeling body fat and can assign a meaning to the sum of a person's age and his weight. Things become even murkier if you consider a typical linear classification algorithm used in OCR systems for deciding whether a given picture contains the handwritten letter A or not. The algorithm in its simplest form might look as follows:

       If \sum_{i,j} a_{ij} \mathrm{pixel}_{ij} > 0
       Then there is a letter A on the picture,

    where a_{ij} are some real numbers that were obtained using an obscure statistical procedure from an obscure dataset of pre-labeled pictures. There is really no good way to explain why the values of a_{ij} are what they are and how this algorithm gets the result, other than to present the dataset of pictures it was trained upon and state that "well, these are all pictures of the letter A, therefore our algorithm detects the letter A on pictures".
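
    To make the contrast concrete, here is a minimal sketch of one such "obscure statistical procedure": the classical perceptron rule, run on a made-up dataset of tiny 5x5 "pictures". Everything in it (the data, the labelling rule, the number of passes) is hypothetical and chosen purely for illustration.

      # A toy perceptron: learn the weights a_ij from randomly generated
      # 5x5 binary "pictures" with made-up labels.
      import numpy as np

      np.random.seed(1)
      n_pixels = 5 * 5
      pictures = np.random.randint(0, 2, size=(200, n_pixels))     # fake binary images
      labels = np.where(pictures[:, :5].sum(axis=1) > 2, 1, -1)    # fake "is it an A" labels

      a = np.zeros(n_pixels)               # the a_ij, flattened into a vector
      for _ in range(20):                  # a few passes over the training data
          for x, y in zip(pictures, labels):
              if y * np.dot(a, x) <= 0:    # this example is misclassified ...
                  a += y * x               # ... so nudge the weights towards it

      # The result is a classifier of exactly the form "sum_ij a_ij * pixel_ij > 0",
      # yet the individual values in a carry no obvious human-readable meaning.
      print(np.mean(np.sign(np.dot(pictures, a)) == labels))       # training accuracy

    The point here is not the particular training rule (any statistical fitting procedure would do), but that the resulting numbers are justified only by the training data taken as a whole.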

    Note that, in a sense, such an "explanation" uses each picture of the letter A from the training set as an axiom. However, these axioms are not the kind of statements we used to justify Algorithm 1. The evidence they provide is way too weak for traditional logical inferences. Indeed, the fact that one known image has a letter A on it does not help much in proving that some other given image has an A too. Yet, as there are many of these "weak axioms", one statistical inference step can combine them into a well-performing algorithm. Notice how different this step is from the traditional logical steps, which typically derive each "strong" fact from a small number of other "strong" facts.

    So, to summarize: there are two kinds of algorithms, logical and statistical. The former are derived from a few strong facts and can be logically explained. Very often you can find the exact specifications of such algorithms on the internet. The latter are based on a large number of "weak facts" and rely on induction rather than logical (i.e. deductive) explanation. Their exact specification (e.g. the actual values of the parameters a_{ij} used in that OCR classifier) does not make as much general sense as the description of a classical algorithm. Instead, you would typically find only general principles for constructing such algorithms.

    The Human Aspect

    What I find interesting is that this dichotomy stems more from human psychology than from mathematics. After all, the "small" logical steps as well as the "big" statistical inference steps are all just "steps" from the point of view of maths and computation. The crucial difference is mainly due to a human aspect. The logical algorithms, as well as all of the logical decisions we make in our lives, are what we often call "reason" or "intelligence". We make decisions based on reasoning many times a day, and we could easily explain the small logical steps behind each of them. But even more often we make the kind of reason-free decisions that we call "intuitive". Take, for example, visual perception and body control. We do these things by analogy with our previous experiences and cannot really explain the exact algorithm. Professional intuition is another nice example. Suppose a skilled project manager says "I have doubts about this project because I've seen a lot of similar projects and all of them failed". Can he justify his claim? No: no matter how many examples of "similar projects" he presents, none of them will be considered reasonable evidence from the logical point of view. Is his decision valid? Most probably yes.

    Thus, the aforementioned classes of logical (deductive) and statistical (inductive) algorithms seem to correspond directly to reason and intuition in the human mind. But why do we, as humans, tend to view intuition as inexplicable and as making "less sense" than reason? Note that the formal difference between the two classes of algorithms is that in the former case the number of axioms is small and the logical steps are "easy". We are therefore capable of somehow representing the separate axioms and the small logical steps in our minds. However, when the number of axioms is virtually unlimited and the statistical step for combining them is far more complicated, we seem to have no convenient way of tracking them consciously, due to our limited brain capacity. This is somewhat analogous to how we can "really understand" why 1+1=2, but have difficulties trying to grasp the meaning of 121*121=14641. Instead, the corresponding inductive computations can be "wired into" the lower, unconscious levels of our neural tissue by learning patterns from experience.

    The Consequences

    There was a time, at the dawn of computer science, when much hope was put into the area of Artificial Intelligence. There, people attempted to devise "intelligent" algorithms based on formal logic and proofs. The promise was that in a number of years the methods of formal logic would develop to such heights that computer algorithms would attain a "human" level of intelligence. That is, they would be able to walk like humans, talk like humans and do a lot of the other cool things that we humans do. Half a century has passed and this still has not happened. Computer science has seen enormous progress, but we have not found an algorithm based on formal logic that could imitate intuitive human actions. I believe that we never shall, because devising an algorithm based on formal logic actually means understanding and explaining an action in terms of a fixed number of axioms.

    Firstly, it is unreasonable to expect that we can precisely explain much of the real world, because, strictly speaking, there exist mathematical statements that can't in principle be explained. Secondly, and most importantly, this expectation contradicts the assumption that most "truly human" actions are intuitive, i.e. we are simply incapable of understanding them.

    Now what follows is a strange conclusion. There is no doubt that sooner or later computers will get really good at performing "truly human" actions; the trend is clear already. But, contrary to our expectations, the fact that we shall create a machine that acts like a human will not really bring us closer to understanding how "a human" really "works". In other words, we shall never create Artificial Intelligence. What we are creating now, whether we want it or not, is Artificial Intuition.


  • Posted by Swen 27.11.2008 4 Comments

    Note: This is again a post sharing personal computer-taming experiences, which aims to help others who might stumble upon similar problems. Also note that this is a guest post kindly contributed by Swen Laur.

    Python

    We as dumb users often hurt ourselves through ignorance. One of many such traps is installing Python on Mac OS X Leopard. First, for most users a new install is completely unnecessary, since Leopard comes with a pre-installed Python. However, as the top Google hits kindly suggest installing MacPython, we as ignorant users follow the advice. As a result, there will be two versions of Python on the computer. Now, if we install setuptools and easy_install, spooky things start to happen: packages that are installed through easy_install are suddenly inaccessible and nothing seems to work properly.

    The explanation of this phenomenon is simple, though a bit technical. Obviously, after having installed MacPython, we have two distributions of Python on our computer:

    • /System/Library/Frameworks/Python.framework/Versions/Current/bin/
    • /Library/Frameworks/Python.framework/Versions/Current/bin/

    which are accessible from the command line through links in the directories /usr/bin and /usr/local/bin, respectively. On installation, MacPython modifies the .bash_login file so that the new PATH is

    /Library/Frameworks/Python.framework/Versions/Current/bin/:${PATH}
    and thus the command line invocation of python is translated to

    /Library/Frameworks/Python.framework/Versions/Current/bin/python

    which, naturally, corresponds to MacPython.

    As a second clue, note that the setuptools and easy_install packages come with Leopard's Python itself. More precisely, the corresponding command line tool is /usr/bin/easy_install. However, if you install setuptools for your new MacPython, the corresponding binary will be /usr/local/bin/easy_install. Now note that in the default configuration, /usr/bin comes before /usr/local/bin in the PATH variable. As a direct consequence, each invocation of easy_install puts packages into Apple's Python (into the directory /Library/Python/<version>/site-packages, to be precise), and these packages are naturally not accessible to MacPython, which you normally execute from the shell. MacPython searches for packages in the directory

    /Library/Frameworks/Python.framework/Versions/<version>/lib/python2.5/site-packages.
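
    (As a quick way to see the mix-up for yourself, here is a small diagnostic snippet; it is merely an illustration, not part of the original fix. Run it with each interpreter, e.g. python and /usr/bin/python, and compare the output.)

      import sys

      print(sys.executable)        # which interpreter binary is actually running
      print(sys.prefix)            # the installation it belongs to
      for p in sys.path:           # the site-packages directories it will search
          if 'site-packages' in p:
              print(p)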

    As a simple solution, we must change the configuration of easy_install. For that we have to edit the configuration file .pydistutils.cfg located in your home directory. It should contain the following lines:

    [install]
    install_lib=/Library/Frameworks/Python.framework/Versions/Current/lib/python$py_version_short/site-packages/
    install_scripts =/Library/Frameworks/Python.framework/Versions/Current/bin

    The first line specifies where all Python modules should be installed and the second line specifies where all executables should be located (this directory comes first in the PATH, and thus MacPython executables are the first to be executed). After that one should reinstall setuptools and easy_install. During the installation of setuptools, the program might complain; in this case you should silence it by modifying PYTHONPATH. As a result, easy_install is located in

    /Library/Frameworks/Python.framework/Versions/Current/bin 

    and thus the packages are put into the correct directory. This solves the problem with easy_install, but we are still vulnerable to other easter eggs stemming from the fact that we do not use the Apple distribution of Python. In particular, you cannot do any user-interface-related programming, since the corresponding PyObjC module is missing from non-Apple Python distributions.

    Another, more complex solution is to remove MacPython cleanly. The latter is not an easy thing to do, since the package management of open-source tools on Mac OS X is more than unsatisfactory (in fact, open-source tools are often difficult to install and uninstall under many non-Linux operating systems). The situation is even worse if you have installed some other Apple packages (.mpkg files) to extend the basic configuration of MacPython. In the following, we describe how to do a successful uninstall in such cases. This procedure is not for the fainthearted: if you have a Time Machine copy of the system state from before installing MacPython, it might be easier to revert to that state.

    The key to a successful uninstallation is a basic understanding of what a package is and how Installer.app works. First, note that a package is nothing more than a directory (one can inspect the contents by right-clicking the package) with the following structure:

    Contents/Archive.bom
    Contents/Info.plist
    Contents/Packages/..
    Contents/PkgInfo
    Contents/Resources

    The file Info.plist provides an XML description of how to install the package. The file Archive.bom contains a semi-binary description of all files that will be copied to the various install locations. The directories Packages and Resources contain sub-packages and files to be copied to the system. Installer.app reads Info.plist and Archive.bom to create the necessary files and then copies the package, without the Resources directory, to /Library/Receipts. The receipts are in principle enough to do a clean uninstall, provided the files are not used by other programs. Normal practice for Apple software requires that all files be stored in the directory
    /Applications/Installed-Application.app, or somewhere in /Library if the corresponding piece of software is shared by several applications. Mostly, the corresponding directories are stored in /Library/Frameworks. The integrity of files installed under /System is not guaranteed across system updates, and therefore sane persons do not store files there. The latter is not true of open-source software, which often tries to write into the directories /usr/local/bin and /usr/bin (extremely dangerous).

    To uninstall a package, we first have to find out where it was written. The key IFPkgFlagDefaultLocation or IFPkgRelocatedPath in Info.plist contains the default or actual installation directory. With the command line tool lsbom we can also see the list of all files and directories that were installed. To be precise, lsbom outputs paths relative to the install directory.
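
    (For the lazy, here is a small illustrative snippet, not part of the original recipe, that reads these keys with the plistlib module shipped with Leopard's Python. The receipt path is hypothetical, and the Info.plist is assumed to be in XML form.)

      import plistlib

      # read the receipt's property list and print the installation-location keys, if present
      info = plistlib.readPlist('/Library/Receipts/SomePackage.pkg/Contents/Info.plist')
      for key in ('IFPkgFlagDefaultLocation', 'IFPkgRelocatedPath'):
          if key in info:
              print(key + ': ' + str(info[key]))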

    Now, applying this knowledge, we can deduce that MacPython consists of five packages, which create the directory /Library/Frameworks/Python.framework and write the following files

    idle
    idle2.5
    pydoc
    pydoc2.5
    python
    python-config
    python2.5
    python2.5-config
    pythonw
    pythonw2.5
    smtpd.py
    smtpd2.5.py

    to the directory /usr/local/bin, in addition to /Applications/MacPython <>. Hence, if we remove these files, we restore the initial state, unless we have installed some other packages, in which case the procedure can be more tedious.

    Finally, we should restore the old ~/.bash_login from the file ~/.bash_login.pysave.

    To summarize, uninstalling packages is more difficult than uninstalling application bundles (yet it is not hopeless). But still, I think applications without installers are the way to go.

    Important update: How to install scipy from source code

    Scipy is a nice extension of Python that provides the power of Matlab-style functions. For some odd reason, the maintainers of the scipy package suggest installing a non-Apple distribution of Python; in other words, they force you to abandon the PyObjC library, which is the core graphical library on Mac OS X. Fortunately, it is still possible to have both by installing the scipy package from source. There are some hiccups in the procedure, but in general it is doable.

    1. Install UMFPACK together with the UFconfig and AMD libraries from source
      1. Download source files from the url http://www.cise.ufl.edu/research/sparse/umfpack/
      2. Change the UFconfig configuration file according to your computer. The flag

        UMFPACK_CONFIG = -DNBLAS

        is a safe, though a bit slow, choice.

      3. Run make and pray
      4. Copy libumfpack.a and libamd.a to /usr/local/lib

      5. Copy all files from the Include directories of UMFPACK, UFconfig and AMD:
        umfpack*.h
        UFconfig.h
        amd.h
        amd_internal.h
    2. Download and install the numpy and scipy packages. You might change the compilation target by typing the command

        export MACOSX_DEPLOYMENT_TARGET=10.4

      in the shell before compiling and installing. The target value 10.5 is not a good choice, since it automatically creates compile problems (joys of using freeware). Note that the current tarball scipy-0.7.0b1 is incomplete (joys of using freeware) and thus you have to use the svn trunk instead. But otherwise, the installation guide at http://www.scipy.org/Installing_SciPy/Mac_OS_X is correct.

    Important update: An update to the update

    Seems that they have now fixed the 10.5 target and you can use the command

    export MACOSX_DEPLOYMENT_TARGET=10.5

    before compiling if you have Leopard.


  • Posted by Konstantin 13.11.2008 1 Comment
    Lightbulb

    It is good to attend lectures from time to time, even if they are on topics that you "already know". Because, once in a while, you get to find out something new about things that are so "simple" and "familiar" that you thought there was nothing new you could possibly discover about them. So, today I had an accidental revelation regarding a simple yet amusing property of the maximum likelihood estimation method. I think I should have been told about it years ago, right in my first statistics course.

    Maximum Likelihood

    Maximum likelihood estimation is one of the simplest and most popular methods for fitting mathematical models to given data. It is most easily illustrated by an example. Suppose you are given a list of numbers (x_1, x_2, \dots, x_n), which, you believe, are all instances of a normally distributed random variable with mean \mu and variance \sigma^2. The problem is that the value of \mu is unknown to you and you would like to estimate it from the data. How do you do it? The maximum likelihood method suggests that you search for a value of \mu for which the probability (or probability density) of obtaining exactly this set of numbers would be maximal. That is:

    \hat\mu = \mathrm{argmax}_{\mu} P[x_1, x_2, \dots, x_n\,|\,\mu]

    In most cases this turns out to be quite a reasonable suggestion, which leads to good estimates (although, there are some rare exceptions). Here, for example, it turns out that to estimate the mean you should simply compute the average of the given numbers:

    \hat\mu = \frac{1}{n}\sum_i x_i

    This is more-or-less all that a typical textbook has to say about maximum likelihood, but there's more to it.

    Counting Lightbulbs

    Consider the following example. Suppose you wish to measure the average lifetime of a common lightbulb (that is, the amount of time it will glow before burning out). For that you buy 1000 lightbulbs in a general store, switch all of them on and wait for a year, keeping track of the timepoints at which individual lightbulbs burn out. Suppose the year has passed and you managed to record the lifetimes of just 50 "dead" lightbulbs (the remaining 950 keep on glowing even after a whole year of uninterrupted work). What do you do now? Do you wait for the other lightbulbs to burn out? But this can take ages! Do you just discard the 950 lightbulbs and only consider the 50 for which you have "available data"? This is also not good, because, clearly, the knowledge that 95% of lightbulbs live longer than a year is important. Indeed, from this fact alone you could infer that the average lifetime must be between 18 and 21 years with 95% confidence (an interested reader is invited to verify this claim). So what do you do?

    Assume that the lifetime of a lightbulb can be modeled using an exponential distribution with mean \lambda (this is a natural assumption because the exponential distribution is well suited for describing the lifetime of electrical equipment). Then the probability (probability density, in fact) that lightbulb i burns out after exactly t_i years can be expressed as:

    P[X=t_i\,|\,\lambda] = \frac{1}{\lambda}e^{-\frac{t_i}{\lambda}}.

    Naturally, the likelihood of lightbulbs 1, 2, 3, ..., 50 burning out after t_1, t_2, \dots, t_{50} years respectively is just the product of the corresponding expressions:

    \prod_{i=1}^{50}\frac{1}{\lambda}e^{-\frac{t_i}{\lambda}}.

    Now what about the lightbulbs that are still alive? The likelihood that a lightbulb will glow longer than a year can be easily expressed as:

    P[X > 1\,|\,\lambda] = e^{-\frac{1}{\lambda}}.

    Therefore, the total likelihood of having 50 lightbulbs burn out at time moments t_1,t_2,\dots,t_{50} and 950 lightbulbs keep working longer than a year is the following product:

    P[X_1 = t_1, X_2 = t_2, \dots, X_{50} = t_{50}, X_{51} > 1, \dots, X_{1000} > 1|\,\lambda] =
    = \left(\prod_{i=1}^{50}P[X_i = t_i\,|\,\lambda]\right) \cdot \left(P[X > 1 \,|\,\lambda]\right)^{950} =
    = \left(\prod_{i=1}^{50}\frac{1}{\lambda}e^{-\frac{t_i}{\lambda}}\right)\cdot e^{-\frac{950}{\lambda}}.

    Finding the estimate of \lambda that maximizes this expression is now a simple technical detail of minor importance. What is important is the ease with which we managed to incorporate the "uncertain measurements" for the 950 surviving lightbulbs.
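
    For the curious, here is a minimal sketch of the whole computation on simulated data (the "true" lifetime, the random seed and the grid search are all my own arbitrary choices, not part of the argument above); the closed-form answer in the last lines follows from setting the derivative of the log-likelihood to zero.

      import numpy as np

      np.random.seed(0)
      true_mean = 20.0                          # hypothetical "true" average lifetime, in years
      lifetimes = np.random.exponential(true_mean, 1000)
      t = lifetimes[lifetimes <= 1.0]           # burn-out times recorded during the year
      n_alive = int(np.sum(lifetimes > 1.0))    # bulbs still glowing after one year

      def log_likelihood(lam):
          # each recorded burn-out contributes a log-density,
          # each survivor contributes log P[X > 1] = -1/lam
          return np.sum(-np.log(lam) - t / lam) - n_alive / lam

      # crude maximization over a grid (quite enough for a one-dimensional problem)
      grid = np.linspace(0.5, 100.0, 10000)
      lam_grid = grid[np.argmax([log_likelihood(l) for l in grid])]

      # setting the derivative to zero gives the same answer in closed form:
      # lambda_hat = (sum of recorded lifetimes + number of survivors * 1 year) / number of burn-outs
      lam_exact = (t.sum() + n_alive * 1.0) / len(t)
      print(lam_grid)
      print(lam_exact)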

    Final Remarks

    Note that the idea can be extended to nearly arbitrary kinds of uncertainty. For example, if for some lightbulb we knew only that its lifetime was a number between a and b, we would include this piece of knowledge by using the term P[a < X < b] in the likelihood function. This inherent flexibility of the maximum likelihood method looks like a feature quite useful for the analysis of noisy data.
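
    Continuing the sketch above, such an interval-censored observation adds just one more term to the log-likelihood; for the exponential distribution, P[a < X < b\,|\,\lambda] = e^{-a/\lambda} - e^{-b/\lambda}. The interval endpoints below are, once again, made up.

      a, b = 0.3, 0.8    # hypothetical interval for one additional bulb

      def log_likelihood_with_interval(lam):
          # the same likelihood as before, plus the log-probability of the interval observation
          return log_likelihood(lam) + np.log(np.exp(-a / lam) - np.exp(-b / lam))

      print(grid[np.argmax([log_likelihood_with_interval(l) for l in grid])])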

    Surprisingly, however, it does not seem to be applied much in practice. The reason for that could lie in the fact that many nontrivial constraints on the data will make the likelihood function highly nonconvex and hard to optimize. Consider, for example, the probability density

    P[X = x_1 \vee X = x_2 \vee X = x_3 \,|\,\lambda] = P[x_1\,|\,\lambda] + P[x_2\,|\,\lambda] + P[x_3\,|\,\lambda]

    for a normally distributed variable X. Depending on the choice of x_1, x_2 and x_3, the likelihood function will have either three different peaks (at \lambda = x_1, x_2, x_3), two peaks, or just one. Thus, finding an exact optimum is much harder than in the typical textbook case. On the other hand, there are various cases where the likelihood function is nicely convex (consider the interval constraint mentioned above), and besides, there are numerous general-purpose, if less efficient, optimization algorithms around anyway. So maybe the trick is not used much simply because it's not in the textbooks?


  • Posted by Konstantin 06.11.2008 No Comments
    Persistence of Memory

    Effective time management is an important and interesting issue. It is important because, theoretically, it should provide a way to be more efficient and thus spend less time doing work and more time having fun. It is interesting because of the wide range of methodologies, techniques and advice lying around: there is clearly no universal method that fits everyone. Some people say that the key trick to efficiency is being goal-oriented and managing priorities, others believe it is all about proper todo-lists, calendars and reminders. Some say you must constantly motivate yourself to work hard, others believe you are better off learning to go with the flow and follow your internal desires. There are also those who say that being organized is an inherent character trait and not a skill that one can acquire.

    I strongly disagree with the latter statement. Being in general a lazy, careless, unmotivated and not at all responsible person, I still manage to be somewhat productive from time to time, and I think a significant part of that is due to simple tricks. I realized it most clearly amidst my master's studies, when my lecture schedule was suddenly over and I had to figure out for myself what to do with my time.

    Now, the whole "time management theory" is actually quite trivial, just a bunch of obvious pieces of advice. One can read through a book or two of those in a pair of days. It takes much longer, however, to figure out which of those pieces of advice would work for you personally. Take for example the popular suggestion to keep a notebook with a todo-list and spend 5 minutes every day to plan the day. I tried that several times, and although it does seem to help temporarily, I just can't make myself follow the routine. First of all, the requirement to write trivialities into the notebook is demanding. The mind resists it and tries to optimize the notebook away as quickly as possible. Secondly, the need to follow some routine "5 minutes every day" is a disciplinary burden too heavy for my lazy personality.

    Here's another popular suggestion that I found to be somewhat useless: plan your projects, set yourself deadlines and stick to them. This doesn't work for me for several reasons. Firstly, there is not much added value in explicit planning for most day-to-day tasks; intuition does a pretty good job here anyway. Secondly, no matter how detailed a plan you have for each project, you'll still have the problem of scheduling your time among them, as well as handling all those tiny routine events and unexpected tasks. Finally, the requirement of sticking to deadlines sounds more like a problem than a solution to me.

    Fortunately, there are tricks that work much better, and those are what I was planning to write about. The best time management ideas that I know of have been nicely summarized by D. Allen in his GTD ideology. His advice is based on three obvious (as usual), but nonetheless insightful, observations:

    1. A time management system won't work unless you put everything into it. This includes every smallest idea that you ever had to think about, including "I want to go see that stupid movie some day" and all the less important things. Although it might seem like overkill, it's not. These small ideas, unless materialized somehow, can use up quite a lot of your brainpower just by "being on your mind". As soon as you write them down in a safe place, your mind gets much clearer. Also, you can only manage your schedule if you really know that everything is there.
    2. If there's an idea, it is not immediately clear what action it implies. For example, "I want to see a movie" is an idea. The first action that needs to be done to make it happen is actually "Call a friend". It's not a big deal to figure that out, but nonetheless there's some amount of thought involved. Surprisingly often, this small amount of thought is a terrible hurdle for as long as it is left undone. Indeed, isn't it much more pleasant to know that you need to "call a friend" than to have a vague "go see a movie" entry on the todo-list? What GTD essentially suggests is to make this "first action" decision as early as possible and keep the todo-lists in terms of actions, not ideas.
    3. A proper file management/archiving system is crucial. There has to be a place where you can store all these ideas, "first actions", reference materials, schedule, reminders, etc. Ideally, this storage has to be organized in an associative manner, i.e. you should be able to retrieve things in context easily. "I'm in a mood to write an email now, do I have a task that fits this context?", "I'm working on this project now, what are the related materials?". Once again, if you can trust your system to store things for you, your mind is relieved.

    The GTD ideology describes a simple workflow process based on these observations. The whole process makes heavy use of the notion of "inboxes" and goes as follows:

    GTD Mailbox
    1. All the ideas or tasks you might have arrive at your "IN" inbox.
    2. You review this inbox regularly, and decide for each item its first action. Upon decision, you must move items out of the "IN" inbox, and you have the following limited set of options:
      1. There is nothing to do about it. Delete and forget.
      2. This is some useful information. Archive for future reference.
      3. This requires you to do something small (it'll take 2 minutes or so). Do it now. Delete message.
      4. This requires you to do something that will take longer than 2 minutes. Then:
        1. You can schedule doing it to a fixed date and time (add an entry to the calendar). Delete message or archive for reference.
        2. You can move the message into the "ASAP" inbox. You'll review that inbox next time you have some free time. Note that the choice between (1) and (2) allows you to balance conveniently between "100% planning" and "no scheduling at all" while still keeping track of all your stuff.
        3. You can move it into the "Someday" inbox. You'll review that inbox someday later.
        4. This is something you have to delegate. Then you move the item to the "Pending" inbox, so that you won't forget about it.
    3. Besides the inboxes "IN", "ASAP", "Someday" and "Pending" you also need to keep a list of ongoing projects as folders with relevant information. You can review this list from time to time to make sure you haven't forgotten anything.

    The best part of this whole process (and this is what actually makes it suitable for me personally) is the fact that it can be implemented using e-mail. This way it does not force much additional discipline: I'm reading email several times a day anyway as part of my compulsory procrastination activities. Most of my tasks, even the smallest ones, are in one way or another reflected in e-mail. It is then only a matter of keeping a proper folder structure and moving emails out of the INBOX according to the above protocol, remembering to think about the "next action" in the process. That is quite easy, it forces me to manage and schedule more or less all of my activities, and it keeps the mind reasonably free, too.


  • Posted by Konstantin 28.10.2008 2 Comments

    The amusing fact that Minesweeper has been proven to be NP-complete is quite well known in certain circles. However, it seems that no one has yet cared to publicly demonstrate the (probably obvious, but not a single bit less amusing) fact that constructing LEGO toys is also an NP-complete task. Here I fix this disturbing flaw and present a reduction of SAT to LEGO by a simple example.

    Suppose you are given the following set of LEGO blocks

    LEGO Blocks

    and you wish to construct the following structure from them (using only this picture as a reference):

    It turns out that devising such a configuration is equivalent to finding a satisfying assignment for the boolean formula (A\vee\neg B)\wedge(\neg A \vee B), and it is possible to devise an analogous configuration for any other logical formula in CNF. Thus, building LEGO must be at least as hard as satisfying boolean formulas, that is, NP-hard.

    Indeed, let the red brick correspond to the variable A, the blue brick to B, the pink brick to \neg A and the light blue brick to \neg B.
    Consider part I of the target configuration (the one in the upper left). It contains one red brick and one light blue brick, but because of the black block in front of them we don't know whether the red and the light blue bricks are of height 1 or 2. There are three possible options:

    We say that this configuration corresponds to the term A\vee \neg B, where having a higher brick for a given variable corresponds to setting the value of that variable to True. Clearly, for the configuration to be possible either the red brick (A) or the light blue brick (\neg B) must be high, and therefore a proper choice of bricks here corresponds to a choice of values in the expression A\vee \neg B. Analogously, part II corresponds to the term \neg A\vee B and the fact that both parts are possible translates to the full equation (A\vee\neg B)\wedge(\neg A \vee B).
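
    (As a tiny aside, not part of the construction itself: brute-force enumeration confirms that the example formula is satisfied exactly by A = B = False and A = B = True, which, per the construction, correspond to the consistent ways of distributing the tall and short bricks among the parts.)

      # enumerate all assignments of the example formula (A or not B) and (not A or B)
      from itertools import product

      def formula(A, B):
          return (A or not B) and (not A or B)

      for A, B in product([False, True], repeat=2):
          if formula(A, B):
              print((A, B))    # prints (False, False) and (True, True)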

    Now what about parts III and IV? These guarantee that if we choose the high red brick when constructing parts I and II, we shall necessarily have to use the shorter pink brick there, and similarly for the blue and light blue bricks. Indeed, due to the choice of white bricks there are two ways of constructing part III:

    In the first case we use up the high red brick and the short pink brick for constructing part III, and therefore end up with the short red brick and high pink brick for building parts I and II. This corresponds to picking the value False for A. Similarly, the second way of constructing part III would enforce the value True for the variable A.

    A short reflection should convince you now that the same scheme can be used to represent any other CNF boolean expression. As it is well known that finding a satisfying assignment for a boolean formula is NP-complete, it follows that LEGO is NP-complete too.

    Q.E.D.


  • Posted by Konstantin 24.10.2008 No Comments

    This week I had a chance to participate in a project meeting related to soft computing. Part of the meeting was a brainstorming-like session dedicated to the generation of relevant topics for future research in the field. I was listening very carefully, writing everything down, and trying to generalize somewhat. By the end of the meeting my generalization attempts had succeeded, and now I can easily generate relevant topics in arbitrary numbers without any external help. In an act of openness unprecedented elsewhere in scientific circles, I'm sharing this secret technology with you here.

    Fresh Topics in Soft Computing
    Generate more topics: Easy, Difficult

    For the curious: here is the source code.
