• Posted by Konstantin 09.09.2008 3 Comments

    Every time you visit this page, a piece of Javascript code will run within your browser, render a small part of the picture below (which is, for the sake of beauty and simplicity, a fragment of the Mandelbrot fractal) and submit the resulting pixels to the server. After 100 visits the whole picture will be complete (and the rendering restarts). If I hadn't told you that, you wouldn't have the slightest chance of noticing how this page steals your CPU cycles, and that is why one might refer to such practice as parasitic or leech computing.

    Mandelbrot fractal

    The Mandelbrot fractal

    In this simple example, I am probably not winning much by outsourcing the rendering procedure. The computation of each pixel requires about 800 arithmetic operations on average, and this is comparable to the overhead imposed by the need to communicate the results back to the server via HTTP. However, if I chose to render somewhat larger chunks of the image at higher precision, the gains would be much more significant. Additionally, the script could be written so that it would keep running continuously for as long as you are staying at the page, thus sacrificing the user experience somewhat, yet blatantly robbing you of CPU power.

    It seems that this approach to distributed computing has not reached the masses yet. I believe, however, that we are going to see the spread of such parasitic code someday, because it is the second easiest way to monetize website traffic. Indeed, we are already used to watching ads in return for free service. Moreover, quite a lot of the ads are rather heavy Flash applications that spend your CPU cycles with the sole purpose of annoying you. Now, if someone replaced that annoying Flashing banner with a script, that computed something useful behind the scenes, you wouldn't be too disappointed, would you? And that someone could then sell his website traffic not in terms of "banner displays", but in terms of "CPU seconds". Or, well, he could sell both.

    Of course, not every distributed computation can be easily implemented within such an environment. Firstly, it should be possible to divide the problem into a large number of independent parts: this is precisely the case when you need to compute the values of a certain function f for a large number of parameters. The Mandelbrot example above fits this description. Here is one other similar problem. Less obviously, various other tasks could be fit within the framework with the help of the Map-Reduce trick.

    Secondly, the computation of each value f(x) should be reasonably complex, preferably superlinear, i.e. Ω(n^2) or worse. Otherwise, the overhead of sending the inputs (which is O(n)) would offset the benefits too much.

    Thirdly, the description of the function f should be reasonably compact, otherwise the overhead of transferring it to each visitor would be too costly. Note, however, that this issue slightly depends on the kind of traffic being leeched upon: if a website has a small number of dedicated users, each user would only need to download the function definition once and refer to the cached version on his subsequent visits to the site.

    Finally, the function f, as well as its inputs and outputs must be public. This restriction severely limits the use of the approach. For example, although numerous data analysis tasks could satisfy the above conditions, in many practical contexts the data is private and it is thus not possible to openly distribute it to arbitrary visitors of an arbitrary website.

    Besides the theoretical difficulties, there are some technical issues that need to be solved before the whole thing can work, such as the security aspects (you can't trust the results!), implementation (Linear Algebra libraries for Javascript or Flash, please?), ethical concerns and some more.

    Nonetheless, the whole thing still looks rather promising to me, and is at least as worthy of academic and industrial attention, as are all of these overhyped Grid, P2P and SOA technologies around.

    PS: By the way, I find the topic well-suitable for a proper student project/thesis.


    Tags: , , ,