• Posted by Konstantin 25.07.2016 No Comments

    The web started off with the simple idea of adding hyperlinks to words within text documents. The hyperlinks would let the reader easily "jump" from one document or page to another, eliminating the need to read the text sequentially, page by page, like a book. Suddenly, the information available to a computer user expanded from single documents into a whole network of disparate, interconnected pages. The old-school process of reading the pages in a fixed order became the process of browsing this network.

    Hyperlinks could be added to the corresponding words by annotating these words with appropriate mark-up tags. For example, "this sentence" would become "this <a href="other_document">sentence</a>". This looked ugly on a text-only screen, hence browsers were soon born - applications that could render such mark-up in a nicer way. And once you view your text through a special application that knows how to interpret mark-up (a serious advance back in the 90s), you do not have to limit yourself to tagging your text with hyperlinks only - text appearance (such as whether it should be bold, italic, large or small) can just as well be specified using the same kind of mark-up.

    Fast-forward 25 years, and most documents on the web consist almost entirely of mark-up - some have nearly no text at all. Browsers are now extremely complex systems whose job goes way beyond turning hyperlinks into underlined words. They run code, display graphics and show videos. The web experience becomes more graphical and interactive with each year.

    There is one area, however, that the web has not yet fully embraced - 3D graphics. The first attempts to enrich the web experience with 3D go as far back as 1994, when VRML was born. It gained some traction in the scientific community - projects appeared which used VRML to, say, visualize molecules. Unfortunately, the common web developers of the time mostly regarded VRML as an arcane technology irrelevant to their tasks, and the layman user would not care to install a heavyweight VRML plug-in just to view a molecule in the browser. Now, if it had been possible to make, say, an addictive 3D shooter game with VRML, things would probably have been different - a critical mass of users would have been tempted to install the plug-in to play the game, and developers would have been tempted to learn the arcane tech to build a selling product for that critical mass. Apparently, no selling games were ever created with VRML.

    It took about fifteen years for browsers to develop native support for 3D rendering technology. It is now indeed possible to create addictive 3D games that run in the browser. And although a whole market for in-browser 3D applications has been born, the common web developer of our time still regards 3D as an arcane technology irrelevant to their tasks. It requires writing code against an unfamiliar API, and it does not seem to interoperate well with the rest of your webpage. The successors of VRML still look like rather niche products for a specialized audience.

    I have recently discovered the A-Frame project, and I have a feeling that this might finally bring 3D to the web as a truly common primitive. It interoperates smoothly with HTML and Javascript, it works out of the box in most browsers, it supports 3D virtual reality headsets, and it relies on an extremely intuitive Entity-Component approach to modeling (just like the Unity game engine, if you know what that means). Let me show you by example what I mean:

    <a-scene>
        <a-cube color="#d22" rotation="0 13 0">
            <a-animation attribute="position"
                         dur="1000"
                         easing="ease-in-out-quad"
                         direction="alternate"
                         to="0 2 0"
                         repeat="indefinite">
            </a-animation>
        </a-cube>
    </a-scene>
    

    This piece of markup can be included anywhere within your HTML page, and you can use your favourite Javascript framework to add interactivity or even create the scene dynamically. You are not limited to any fixed set of entities or behaviours - adding new components is quite straightforward. The result seems to work in most browsers, even the mobile ones.
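    Since A-Frame entities are ordinary DOM elements, plain Javascript suffices to manipulate the scene. Here is a minimal sketch (the attribute values are made up for illustration):

        <script>
            // Grab the cube from the markup above and repaint it.
            var cube = document.querySelector('a-cube');
            cube.setAttribute('color', '#2d2');

            // Entities can also be created dynamically:
            var sphere = document.createElement('a-sphere');
            sphere.setAttribute('position', '2 1 -3');
            document.querySelector('a-scene').appendChild(sphere);
        </script>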

    Of course, things are not perfect: A-Frame's version number is 0.2.0 at the moment, and there are some rough edges to be polished and components to be developed. Nonetheless, the next time you need to include a visualization on your webpage, try using D3 with A-Frame, for example. It's quite enjoyable and feels way more natural than any of the 3D-web technologies I've tried so far.
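    For a taste of what that combination might look like, here is a hedged sketch (the data and attribute choices are invented for illustration) that uses D3's standard data join to spawn one cube per data point:

        <script>
            // One cube per data point; the cube's height encodes the value.
            var data = [1, 3, 2, 5];
            d3.select('a-scene').selectAll('a-cube')
                .data(data)
              .enter().append('a-cube')
                .attr('position', function(d, i) { return (i * 1.5) + ' ' + (d / 2) + ' -4'; })
                .attr('scale', function(d) { return '1 ' + d + ' 1'; })
                .attr('color', '#d22');
        </script>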


  • Posted by Konstantin 13.12.2008 No Comments

    It is somewhat sad to see that the Scalable Vector Graphics (SVG) format, despite its considerable age and maturity, has not yet gained much popularity on the web, where Adobe Flash is all over instead. Here are some points you should know about it, so that you might consider taking a couple of hours to get acquainted with it one day.

    1. SVG is an open vector graphics standard.
    2. SVG supports most of what you'd expect from a 2D graphics language: cubic splines, Bézier curves, gradients, nested matrix transformations, reusable symbols, etc.
    3. SVG is XML-based and rather straightforward. If you need a picture with a line and two circles, you write a <line> tag and two <circle> tags:
      <svg xmlns="http://www.w3.org/2000/svg">
          <line x1="0" y1="0" x2="100" y2="100" 
                stroke-width="2" stroke="black"/>
          <circle cx="0" cy="0" r="50"/>
          <circle cx="100" cy="100" r="20" 
                  fill="red" stroke="black"/>
      </svg>
    4. Most vector graphics editors can write SVG. For example, Inkscape is one rather usable piece of open-source software.
    5. SVG supports Javascript. Basically, if you know HTML and Javascript, you are ready to write SVG by hand, because SVG is essentially just XML + Javascript (see the sketch after this list). This provides considerable freedom of expression.
    6. SVG can be conveniently embedded into HTML webpages and is supported out-of-the-box by most modern browsers.
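
    To illustrate point 5, here is a minimal hand-written sketch (sizes and colours are arbitrary): a circle that changes colour when clicked.

      <svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
          <!-- a circle that turns red when clicked -->
          <circle cx="60" cy="60" r="40" fill="blue"
                  onclick="this.setAttribute('fill', 'red');"/>
      </svg>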

    My personal interest in SVG is related to the observation that it seems very suitable for creating interactive data visualizations (charts, plots, graphs) right in the browser. And although the existing codebase devoted to these tasks can't yet be called enormous, I'm sure it will grow and gain wider adoption. Don't miss it!


  • Posted by Konstantin 08.10.2008 No Comments

    I received a number of "why" and "how" questions regarding the pri.ee domain name of this site, and I thought the answers were worth a post. The technically savvy audience can safely skip it, though.

    The pri.ee subdomain is reserved by EENet for private individuals who have an Estonian ID code. The registration is free of charge and very simple: you just fill in a short form and wait a day or two until your application is processed. As a result, you end up with a simple, affiliation-free way of designating your site. Of course, it does not have the bling of a www.your-name.com, but I find it quite appropriate for an aspiring blog (and besides, I'm just too greedy and lazy to bother paying for the privilege of a flashy name for my homepage).

    Now on to the "how" part. The only potentially tricky issue of the registration process is the need to fill in the "Name servers" field. Why do you need that, and why can't you just directly provide the IP address of the server where you host your site? Well, if you registered the specific IP of your server with EENet, you would have to contact EENet every time your hosting provider changed, right? In addition, you would need to bother EENet about any subdomain (e.g. <whatever>.yourname.pri.ee) you might wish to add in the future. Certainly not the most convenient option. Therefore, instead of providing an IP address directly, you specify a reference to an intermediate server, which will perform the mapping of your domain name (and any subdomains) to IP addresses. That's how the internet domain name system actually works.

    So which name server should you choose? Most reasonable hosting providers (that is, the ones that allow you to host arbitrary domains) let you use their name servers for mapping your domain name. The exact server names depend on the provider, and you should consult the documentation. For example, if you were hosting your site at 110mb.com (which is here just an arbitrarily chosen example of a reasonable free web host I'm aware of), the corresponding name servers would be ns1.110mb.com and ns2.110mb.com.

    However, using the name server of your provider is, to my mind, not the best option. In most cases the provider will not allow you to add subdomains, and if you change your hosting you'll probably lose access to the name server, too. Thus, a smarter choice would be to manage your domain names yourself using an independent name server. Luckily enough, there are several name servers out there that you can use completely free of charge (or for a symbolic donation): EveryDNS and EditDNS are two examples of such services that I know of.

    After you register an account with, say, EveryDNS, you can specify the EveryDNS nameservers (ns1.everydns.net, ..., ns4.everydns.net) in the pri.ee domain registration form. You are now free to configure arbitrary address records for yourname.pri.ee or <whatever>.yourname.pri.ee to your liking.
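    In traditional zone-file notation, the records you might end up configuring could look like this (the IP address below is made up for illustration):

      ; hypothetical records for yourname.pri.ee
      yourname.pri.ee.        IN  A   192.0.2.17       ; point the domain at your host's IP
      blog.yourname.pri.ee.   IN  A   192.0.2.17       ; a subdomain, added at no extra cost
      www.yourname.pri.ee.    IN  NS  ns1.110mb.com.   ; or delegate a name to your hoster's name server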

    To summarize, here is how one can get a reasonable website with a pri.ee domain name for free:

    1. Register with a reasonable web hosting provider
      • 110mb.com is one simple free option (with the exception that they charge a one-time $10 fee if you need MySQL)
      • other options
    2. Register a DNS account
    3. Fill out this form.
      • If you chose EveryDNS in step 2, state ns1.everydns.net, ns2.everydns.net, ns3.everydns.net, ns4.everydns.net as your name servers.
      • Wait for a day or two.
    4. Suppose you applied for yourname.pri.ee (the domain is still free, by the way!), then:
      • Add this domain in your hosting's control panel and upload your website.
      • Add this domain to your DNS account.
        • You can add an A ("address") record mapping yourname.pri.ee to an IP address.
        • Alternatively, you can add an NS ("name server") record delegating yourname.pri.ee further to ns1.110mb.com (or whatever name server your hoster provides).
    5. Profit!


  • Posted by Konstantin 09.09.2008 3 Comments

    Every time you visit this page, a piece of Javascript code will run within your browser, render a small part of the picture below (which is, for the sake of beauty and simplicity, a fragment of the Mandelbrot fractal) and submit the resulting pixels to the server. After 100 visits the whole picture will be complete (and the rendering restarts). If I hadn't told you that, you wouldn't have the slightest chance of noticing how this page steals your CPU cycles, and that is why one might refer to such a practice as parasitic or leech computing.

    [Image: The Mandelbrot fractal]

    In this simple example, I am probably not gaining much by outsourcing the rendering procedure. The computation of each pixel requires about 800 arithmetic operations on average, which is comparable to the overhead imposed by the need to communicate the results back to the server via HTTP. However, if I chose to render somewhat larger chunks of the image at higher precision, the gains would be much more significant. Additionally, the script could be written so that it kept running for as long as you stayed on the page, sacrificing the user experience somewhat, yet blatantly robbing you of CPU power.
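    For the curious, here is a minimal sketch of the core computation (the chunk coordinates and the submission endpoint are made up for illustration):

        <script>
        // Escape-time iteration for one pixel of the Mandelbrot set:
        // count the steps z = z^2 + c needs to leave the radius-2 disc.
        function mandelbrot(cx, cy, maxIter) {
            var x = 0, y = 0, i = 0;
            while (x * x + y * y <= 4 && i < maxIter) {
                var xNew = x * x - y * y + cx;
                y = 2 * x * y + cy;
                x = xNew;
                i++;
            }
            return i; // the iteration count determines the pixel's colour
        }

        // Render a small chunk of pixels and post the result back.
        var pixels = [];
        for (var col = 0; col < 100; col++) {
            pixels.push(mandelbrot(-2 + col * 0.03, 0.5, 255));
        }
        var xhr = new XMLHttpRequest();
        xhr.open("POST", "/submit_chunk", true); // hypothetical endpoint
        xhr.send(pixels.join(","));
        </script>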

    It seems that this approach to distributed computing has not reached the masses yet. I believe, however, that we are going to see the spread of such parasitic code someday, because it is the second easiest way to monetize website traffic. Indeed, we are already used to watching ads in return for free services. Moreover, quite a lot of ads are rather heavy Flash applications that spend your CPU cycles with the sole purpose of annoying you. Now, if someone replaced that annoying flashing banner with a script that computed something useful behind the scenes, you wouldn't be too disappointed, would you? And that someone could then sell his website traffic not in terms of "banner displays", but in terms of "CPU seconds". Or, well, he could sell both.

    Of course, not every distributed computation can be easily implemented within such an environment. Firstly, it should be possible to divide the problem into a large number of independent parts: this is precisely the case when you need to compute the values of a certain function f for a large number of parameters. The Mandelbrot example above fits this description. Here is one other similar problem. Less obviously, various other tasks could fit within the framework with the help of the Map-Reduce trick.

    Secondly, the computation of each value f(x) should be reasonably complex, preferably superlinear, e.g. Ω(n^2) or worse. Otherwise, the overhead of sending the inputs (which is O(n)) would offset the benefits too much.

    Thirdly, the description of the function f should be reasonably compact, otherwise the overhead of transferring it to each visitor would be too costly. Note, however, that this issue depends somewhat on the kind of traffic being leeched upon: if a website has a small number of dedicated users, each user would only need to download the function definition once and could refer to the cached version on subsequent visits to the site.

    Finally, the function f, as well as its inputs and outputs, must be public. This restriction severely limits the applicability of the approach. For example, although numerous data analysis tasks could satisfy the above conditions, in many practical contexts the data is private and thus cannot be openly distributed to arbitrary visitors of an arbitrary website.

    Besides the theoretical difficulties, there are some technical issues that need to be solved before the whole thing can work, such as security (you can't trust the results!), implementation (linear algebra libraries for Javascript or Flash, please?), ethical concerns, and more.

    Nonetheless, the whole thing still looks rather promising to me, and it is at least as worthy of academic and industrial attention as all of the overhyped Grid, P2P and SOA technologies around.

    PS: By the way, I find the topic well suited for a proper student project or thesis.



  • Posted by Konstantin 03.09.2008 4 Comments
    [Image: Google Chrome logo]

    Geeks all over the world have just gained a new hot topic to flame or panic about. Web designers now have to verify their applications and websites against yet another browser. Developers have learned about a new open-source embeddable Javascript engine. All the normal people will get a choice of yet another, hopefully well-made, browser to work with. Thus groweth the Church of Google.

    What is in it for Google? Apart from the obvious increase in the user base and the potential to advertise (pardon, suggest) websites right in the address bar, the wide distribution of Chrome should somewhat increase the amount of user-generated traffic flowing into Google's servers. Indeed, the default configuration of Chrome, equipped with that marvelous auto-suggestion feature, seems to start sending stuff out as soon as you type the first character into the address line.

    Although the term "privacy violation" is the first one to pop up, let's set that aside for now. The really interesting question concerns the nature of this constant influx of half-typed URLs and "search terms", annotated with timestamps and host IPs. Firstly, it most certainly contains additional value over whatever is already indexed on the web: global events, current social trends, new websites, ideas and random creative thoughts all leave a mark on your address line. It therefore makes sense to look into this data and search for patterns. Secondly, the volume of this data stream is probably quite large, and the noise is significant, hence it does not pay off to store it all. And that's where we reach a nice observation: for many purposes you don't need to store this data.

    If the bandwidth of the stream is constantly high, you can afford to throw it away. If at any moment you need the data, just "put your bucket in" by turning on a sniffer, and you'll collect megabytes of interesting stuff in a matter of seconds, if not less. A simplistic version of this kind of "stream analysis" looks as follows: you ask Google "what do people read right now", it listens to the stream for a second and responds with something meaningful; and I believe much cooler things can be thought of. Anyway, the important point is that no "global" indexing or data collection is needed to perform this service, just a thumb on the pulse of the web.
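    A minimal sketch of the idea (the stream interface here is entirely hypothetical): listen for a fixed time window, count what flows past, and answer from the sample.

        <script>
        // Sample a high-volume term stream for a short window
        // instead of storing it, then report the observed counts.
        function sampleStream(stream, durationMs, callback) {
            var counts = {};
            var onTerm = function (term) {
                counts[term] = (counts[term] || 0) + 1;
            };
            stream.subscribe(onTerm);       // assumed stream API
            setTimeout(function () {
                stream.unsubscribe(onTerm); // stop after the window
                callback(counts);           // e.g. report the top terms
            }, durationMs);
        }
        </script>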
