• Posted by Konstantin 02.01.2024 No Comments

    This article has been cross-posted to Hackster.

    What is the best way to learn (or teach) introductory robotics? What could be the "hello world" project in this field, that would be simple for a beginner to get started with, complex enough to be exciting and qualify as a "robot", and deep enough to allow further creative extensions and experiments? Let us try solving this problem logically, step by step!

    First things first: what is a "robot"?

    Axiom number one: for anything to feel like a robot, it must have some moving parts. Ideally, it could have the freedom move around, and probably the easiest way to achieve that would be to equip it with two wheels. Could concepts other than a two-wheeler qualify as the simplest robot? Perhaps, but let us proceed with this one as a reasonably justified assumption.

    The wheels

    To drive the two upcoming wheels we will need two motors along with all that driver circuitry, ugh! But wait, what about those small continuous rotation servos! They are simple, cheap, and you just need to wire them to a single microcontroller pin. Some of them are even conveniently sold along with with a LEGO-style wheel!

    The microcontroller

    Rule number two: it is not a real robot, unless it has a programmable brain, like a microcontroller. But which microcontroller is the best for a beginner in year 2024? Some fifteen years ago the answer would certainly have the "Arduino" keyword in it. Nowadays, I'd say, the magic keyword is "CircuitPython". Programming just cannot get any simpler than plugging it into your computer and editing code.py on the newly appeared external drive. No need for "compiling", "flashing" or any of the "install these serial port drivers to fix obscure error messages" user experiences (especially valuable if you need to give a workshop at a school computer class). Moreover, the code to control a wheel looks as straightforward as motor.throttle = 0.5.

    OK, CircuitPython it is, but what specific CircuitPython-compatible microcontroller should we choose? We need something cheap and well-supported by the manufacturer and the community. The Raspberry Pi Pico would be a great match due to its popularity, price and feature set, except it is a bit too bulky (as will become clear below). Check out the Seeedstudio Xiao Series instead. The Xiao RP2040 is built upon the same chip as the Raspberry Pi Pico, is just as cheap, but has a more compact size. Interestingly, as there is a whole series of pin-compatible boards with the same shape, we can later try other models if we do not like this one. Moreover, we are not even locked into a single vendor, as the QT Py series from Adafruit is also sharing the same form factor. How cool is that? (I secretly hope this particular footprint would get more adoption by other manufacturers).

    The body

    OK, cool. We have our wheels and a microcontroller to steer them. We now need to build the body for our robot and wire things together somehow. What would be the simplest way to achieve that?

    Let us start with the wiring, as here we know an obviously correct answer - a breadboard. No simpler manual wiring technology has been invented so far, right? Which breadboard model and shape should we pick? This will depend on how we decide to build the frame for our robot, so let us put this question aside for a moment.

    How do we build the frame? The options seem infinite: we can design it in a CAD tool and laser cut it from acrylic sheets, print on a 3D printer, use some profile rails, wooden blocks, LEGO bricks,... just ask Google for some inspiration:

    Ugh, this starts to feel too complicated and possibly inaccessible to a normal schoolkid.... But wait, what is it there on the back of our breadboard? Is it sticky tape? Could we just... stick... our motors to it directly and use the breadboard itself as the frame?

    If we now try out a few breadboard models, we will quickly discover that the widespread, "mini-sized", 170-pin breadboard has the perfect size to host our two servos:

    By an extra lucky coincindence, the breadboard even has screw holes in the right places to properly attach the motors! If we now hot glue the servo connectors to the side as shown above, we can wire them comfortably, just as any other components on the breadboard itself. Finally, there is just enough space in the back to pack all the servo wires and stick a 3xAAA battery box, which happens to be enough to power both our servos and the Xiao RP2040:

    This picture shows the FS90R servos. These work just as well as the GeekServo ones shown in in the picture above.

    ... and thus, the BreadboardBot is born! Could a breadboard with two motors and a battery stuck to its back qualify as the simplest, lowest-tech ever robot platform? I think it could. Note that the total cost of this whole build, including the microcontroller is under $20 (under $15 if you order at the right discounts from the right places).

    But what can it do?

    Line following

    Rule number 3: it is not a real robot if it is not able to sense the environment and react to it somehow. Hence, we need some sensors, preferably ones that would help us steer the robot (wheels are our only "actuators" so far). What are the simplest steering techniques in introductory robotics? I'd say line following and obstacle avoidance!

    Let us do line following first then. Oh, but how do we attach the line tracking sensors, we should have built some complicated frame after all, right? Nope, by a yet another happy design coincidence, we can just plug them right here into our breadboard:

    Thanks to the magic of CircuitPython, all it takes now to teach our robot to (even if crudely) follow a line is just three lines of code (boilerplate definitions excluded):

    while True:
      servo_left.throttle = 0.1 * line_left.value
      servo_right.throttle = -0.1 * line_right.value

    Obstacle avoidance

    OK, a line follower implemented with a three-line algorithm is cool, can we do anything else? Sure, look how conveniently the ultrasonic distance sensor fits onto our breadboard:

    Add a few more lines of uncomplicated Python code and you get a shy robot that can follow a line but will also happily go roaming around your room, steering away from any walls or obstacles:

    Note how you can hear the sonar working in the video (not normally audible).

    ... and more

    There is still enough space on the breadboard to add a button and a buzzer:

    .. or splash some color with a Xiao RGB matrix:

    .. or plug in an OLED screen to give your BreadboardBot a funny animated face:
    The one on the image above also had the DHT11 sensor plugged into it, so it shows humidity and temperature while it is doing its line following. Why? Because it can!

    Tired of programming and just want a remote-controllable toy? Sure, plug in a HC06 bluetooth serial module, program one last time, and go steer the robot yourself with a smartphone:

    Other microcontroller boards, BLE, Wifi and FPV

    Xiao RP2040 is great, but it has no built-in wireless connectivity! Plug in its ESP32-based cousin Xiao ESP32S3 and you can steer your robot with a Bluetooth gamepad, or via Wifi, by opening a webpage served by the robot itself!
    If you have the Xiao ESP32S3 Sense (the version with a microphone and camera), that webpage can even show you the live first-person view feed of the robot's camera (conveniently positioned to look forward):

    In case you were wondering, the resulting FPV bot is not a very practical solution for spying on your friends, as the microcontroller heats up a lot and consumes the battery way too fast, but it is still a fun toy and an educational project.

    Finally, although the Xiao boards fit quite well with all other gadgets on the mini breadboard you are not limited by that choice either and can try plugging in any other appropriately-sized microcontroller board. Here is, for example, a BreadboardBot with an M5Stack Atom Lite running a line-follower program implemented as an ESPHome config (which you can flash over the air!).


    Conclusion

    The examples above are just the tip of the iceberg. Have any other sensors lying around from those "getting started with Arduino" kits that you did not find sufficiently exciting uses for so far? Maybe plugging them into the BreadboardBot is the chance to shine they were waiting for all this time! Teaching the robot to recognize familiar faces and follow them? Solving a maze? Following voice commands? Self-balancing? Programming coordinated movement of multiple robots? Interfacing the robot with your smart home or one of these fancy AI chatbots? Organizing competitions among multiple BreadboardBots? Vacuuming your desk? The sky (and your free time) is the limit here!

    Consider making one and spreading the joy of breadboard robotics!

    Detailed build instructions along with wiring diagrams and example code are available on the project's Github page. Contributions are warmly welcome!

    Tags: , , , , , , ,

  • Posted by Konstantin 28.06.2018 No Comments

    I had to replace my SIM card today. The old one was mini-sized and I needed a micro-sized one. My previous SIM card stayed with me for about 20 years or so. I have gone through high school, university, and traveled around to countless places with it. I changed my phone several times, starting from an older, bulkier Nokia through several candy-bar Nokias and a couple of smartphones. My personal computer evolved from a 200MHz Pentium with 16MB RAM to a 2.8GHz quad-core with 32GB RAM. But my SIM card always stayed the same. Let me dedicate this post to the memory of this glorious tiny computer - the most long-lasting and reliable piece of computing equipment I have ever used.

    My retiring SIM card. The mobile operator changed its name from Radiolinja to Elisa in 2000, but the SIM stayed.

    My retiring SIM card. The mobile operator changed its name from Radiolinja to Elisa in 2000, but the SIM card did not care.

    What is a SIM card?

    Your SIM card is just another smart card, technologically similar to any of the "chip-and-pin" bank cards in your wallet, your ID card and even your contactless public transport tickets (which communicate using the same set of protocols, just wirelessly). Note, however, that banks started to switch from magnetic to "chip-and-pin" credit cards and countries began issuing smart-card-based identity documents only about 15 years ago or so. SIM cards, on the other hand, had been in wide use much earlier, long before the term "smart card" became a popularized buzzword.

    All smart cards are built to serve one primary purpose - identification. The card stores a secret key (which is impossible to retrieve without disassembling the card using ultra-high-tech lab equipment), and provides a way to execute the following challenge-response protocol: you send a random string (a challenge) to the card. The card encrypts the string using the stored key and returns the response. The correctness of this response can now be verified by a second party (e.g. the mobile operator, the bank or a website using ID-card authentication). The actual details of the computation differ among the various smart cards. For example, some use symmetric while other - asymmetric cryptography. Some cards provide additional services, such as creating digital signatures or storing information on the card. None the less, identification is always the core function. This explains the name of the SIM card: Subscriber Identity Module. It is a module, which identifies you (the subscriber) to the network provider.

    What is inside a SIM card?

    The best way to understand what is inside a SIM (or any other type of smart-) card is to connect it to your computer and "talk" to it. Many laptops have integrated smart-card readers nowadays, so if you find a suitable frame for your nano/micro/mini-SIM, you may simply put it into the reader just as you would do with an ID or a bank card.

    Mini-SIM in a frame

    Old mini-SIM in a frame from a newer card

    Note, though, that if your frame is flimsy and not fitting perfectly (as is mine on the photo above) you run the risk of losing the chip somewhere in the depths of the card reader slot before you even manage to slide it in completely. Hence, in my case, a tiny cross-style USB card reader was a more convenient and reliable option:

    SIM in a frame in a reader

    SIM in a frame in a reader

    Plug it in, wait until the system recognizes the device, and we are ready to talk. There are many tools for "talking" to the card on a low level, but one of the best for the purposes of educational exploration is, in my opinion, the program called CardPeek. Let us fire it up. At start up it asks for the card reader to use (note that you can use it to explore both contact and contactless cards, if you have the necessary reader):

    Select reader screen

    Select reader screen

    We can now click "Analyzer → GSM SIM", provide the PIN, wait a bit, and have the program extract a wealth of information stored on the card:

    SIM card analyzed

    SIM card analyzed

    Fun, right? Let us now see where all this data came from and what it actually means.

    How does it work?

    At the hardware level, a smart card has a very simple connector with six active pins, which are designed for sending and receiving data to and from the card:

    Smart card connectors

    Smart card connectors (source)

    Four pins (VCC, GND, CLK, I/O) are used for the basic half-duplex synchronous serial communication. RST is the reset pin, and VPP is used for supplying the higher voltage when (re)programming the card. When you first connect to the card, the protocol requires to zero the "reset" pin, to which the card will reply by sending a fixed sequence of bytes, identifying the card and its capabilities. It is known as the card's ATR ("Answer To Reset") string. You can see this string listed as the "cold ATR" entry in the screenshot named "SIM card analyzed" above.

    Besides providing pre-packaged analysis scripts for various kinds of smart cards, CardPeek allows to send custom commands to the card by using its Lua scripting interface directly. This is what the "Command" text input at the bottom of the screen is for. Let us now switch to the "logs" tab and try sending some commands. First of all, we need to establish the connection to the card. This is done via the card.connect() call:

    card.connect()

    card.connect() example

    The ATR string is received by the reader when the connection is first established. We can obtain it via card.last_atr() and print out to the log window in hex-encoded form using log.print and bytes:format (the documentation for all these APIs is available here):

    ATR

    ATR example

    As we see, the ATR for my card happens to be 3BBA9400401447473352533731365320 (in hex).  If you search the web, you will find that this particular ATR is known to be  a signature of the Elisa SIM cards. It is not a random string, though, and every byte has a meaning. In particular:

    • 3B is the "initial byte", and this particular value identifies the smart card as a SIM card.
    • BA is the "format" byte. Its first four bits (1011) tell us that we have to expect fields TA1, TB1 and TD1 to follow. The last four bits denote the number 10 - the number of "historical bytes" at the end of the ATR.
    • 94 is the field TA1, specifying the clock rate of the serial protocol
    • 00 is the field TB1, specifying the programming voltage (apparently, the card is not re-programmable)
    • 40 tells us that we have to read out another byte field TC2 (this is in the left-side part of the byte, 4 ) and that the card uses T=0 protocol (this is in the right-side part, 0).
    • 14 is the promised TC2 field (not sure what it is meant for),
    • the last 10 bytes are the "historical bytes", providing card manufacturer-specific information.

    Greeting your Card

    Now that we are connected, we can send various commands to the card. Let us proceed by example. The first command we might want to send is "VERIFY CHV", which is essentially greeting the card by providing our PIN1 code ("CHV" stands for "Card Holder Verification").

    VERIFY CHV

    VERIFY CHV (source)

    Every smart card command starts with a two-byte identifier (for example, A0 20 is the (hex) identifier of the VERIFY CHV command). It is followed by two parameter bytes P1 and P2. For example, the parameter P1 for VERIFY CHV is always 0, and P2 must indicate the number of the PIN we are submitting (i.e. 1 for PIN1, 2 for PIN2). Next comes P3 - a byte, specifying the length of the data which follows. For VERIFY CHV the data is the provided PIN itself, and it is always 8 bytes long. If the PIN is shorter than 8 bytes, it must be padded by bytes FF. The PIN itself is encoded in plain ASCII (i.e. 1234 would be 31 32 33 34).

    Now, supposing my PIN1 is, in fact "1234", I can authenticate myself to the card via CardPeek as follows:

    sw, resp = card.send(bytes.new(8, "A0 20 00 01 08 31 32 33 34 FF FF FF FF"))

    Here, card.send is the sending command and bytes.new(8, ...) constructs an array of 8-bit bytes from a hex string (see CardPeek reference).

    The sw and resp are the two components of the T=0 protocol response. For VERIFY CHV we only care that sw is equal to 9000, which means "OK". Note how this is printed in the log.

    VERIFY CHV

    VERIFY CHV example

    Beware that if you do not receive a 9000 response, it means that the card denied your PIN for some reason. Trying to submit a wrong PIN three times in a row will block the card.

    Reading the Data

    Now that we have identified ourselves to the card, let us try to read the data stored on it. The data on the card is organized in a hierarchy of files. It is this exact hierarchy that you can observe as the output of the "Analyzer" script. The root file is called "MF", it has the ICCID, TELECOM and GSM sub-files, which, in turn, have a number of predefined sub-files themselves, etc. The names are just conventions, the card itself uses two-byte identifiers for each file. For example, "MF" is actually 3F 00, "TELECOM" is 7F 10, etc. While the card is connected you can navigate around the file structure just like you would do in a normal operating system using the cd command, except that in smart-card lingo the corresponding command is called SELECT.

    The binary form of the SELECT command is A0 A4 00 00 02 {x} {y}, where {x} {y} is the file identifier. Just like before, A0 A4 is the command code, 00 00 are the ignored P1 and P2 parameters, and 02 tells us that exactly two bytes must follow.

    Consequently, if we wanted to select the file "MF (3f 00)/TELECOM (7F 10)/ADN (6F 3A)", which contains the address book records, we could achieve it by sending the following sequence of commands via CardPeek:

    card.send(bytes.new(8, "A0 A4 00 00 02 3F 00"))
    card.send(bytes.new(8, "A0 A4 00 00 02 7F 10"))
    card.send(bytes.new(8, "A0 A4 00 00 02 6F 3A"))

    Selecting files is a common task, and CardPeek provides a convenient shorthand: card.select("#7f10"). For some cards (not mine, unfortunately) it should also be possible to do things like card.select("/7f10/6f3a").

    Once we have selected the ADN ("Abbreviated Dialing Numbers") file, we may read out the individual phone numbers from it using the READ RECORD command. The procedure is complicated by the fact that READ RECORD needs to be provided the "record size" as one of its parameters, which, in turn, must be taken from the response data of the last SELECT command, and this must be obtained via the GET RESPONSE command. The complete example would therefore be:

    -- Select /TELECOM/ADN
    card.select("#7F10")
    sw, resp = card.select("#6F3A")
    
    -- Read file metadata
    GET_RESPONSE = "A0 C0 00 00"
    sw, resp = card.send(bytes.new(8, GET_RESPONSE, bit.AND(sw, 0xff)))
    
    -- Read out first record in the file
    sw, resp = card.read_record('.', 1, resp:get(14))
    
    -- Print the record to the log
    log.print(log.INFO, resp:format("%P"))
    Reading out a phone record

    Reading out a phone record

    Note that instead of printing the output to the log via log.print you could also show a message box:

    ui.readline(resp:format("%P"))

    or append a new node to the tree in the "card view" tab:

    nodes.append(nodes.root(), {classname="block", 
                                label="Name", 
                                val=resp:format("%P"), 
                                size=#resp})
    

    In fact, at this point you should go and read the script $HOME/.cardpeek/scripts/gsm (beta).lua. It contains the code we ran in the beginning of this post to analyze the card. The script simply sends the relevant commands to the card and appends all responses as nodes to the tree.

    Authentication

    While data storage is a useful capability of a SIM card, its main purpose is subscriber authentication. Thus, our acquaintance with the SIM card would be incomplete without checking the corresponding function out as well. It is quite simple:

    RUN GSM ALGORITHM

    RUN GSM ALGORITHM (source)

    That is, the process is the following: we send the byte sequence A0 88 00 00 10, followed by a 16 byte-long challenge string (which is normally given by the mobile operator when the phone joins the network). The SIM card responds with 12 bytes, of which the first 4 we should send back to the mobile operator for verification, and use the remaining 8 as a cipher key.

    Before we can use the RUN GSM ALGORITHM command we need to verify our PIN1 (already done) and SELECT the GSM (7F 20) file:

    RUN_GSM_ALGO = "A0 88 00 00 10"
    GET_RESPONSE = "A0 C0 00 00"
    DF_GSM = "#7F20"
    
    CHALLENGE_STRING = "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
    
    -- Select GSM file
    card.select(DF_GSM)
    
    -- Send the challenge data
    sw, resp = card.send(bytes.new(8, RUN_GSM_ALGO, CHALLENGE_STRING))
    
    -- Read 12-byte-long response
    sw, RESPONSE_STRING = card.send(bytes.new(8, GET_RESPONSE, 12))
    log.print(log.INFO, RESPONSE_STRING:format("%D"))
    RUN GSM ALGORITHM example

    RUN GSM ALGORITHM example

    And that is it. I hope you learned to understand and appreciate your SIM card a bit more today.

    Tags: , , , , , ,

  • Posted by Konstantin 17.03.2018 3 Comments

    I have randomly stumbled upon a Quora question "Can you write a program for adding 10 numbers" yesterday. The existing answers competed in geeky humor and code golf, so I could not help adding another take on the problem.

    Can you write a program for adding 10 numbers?

    The question offers a great chance to illustrate how to properly develop software solutions to real-life problems such as this one.

    First things first - let us analyze the requirements posed by the customer. They are rather vague, as usual. It is not clear what “numbers” we need to add, where and how should these “numbers” come from, what is really meant under “adding”, what should we do with the result, what platform the software is supposed to be running on, what are the service guarantees, how many users are expected, etc.

    Of course, we do not want to discover that we misunderstood some of the requirements late in the development cycle, as this could potentially require us to re-do all of the work. To avoid such unpleasant surprises we should be planning for a general, solid, enterprise-grade solution to the problem. After a short meeting of the technical committee we decided to pick C# as the implementation platform. It is OS-independent and has many powerful features which should cover any possible future needs. For example, if the customer would decide to switch to a cluster-based, parallel implementation later along the way, we’d quickly have this base covered. Java could also be a nice alternative, but, according to the recent developer surveys, C# development pays more.

    The Architecture

    Let us start by modeling the problem on a higher level. The customer obviously needs to process (“add”) some data (“10 numbers”). Without getting into too much detail, this task can be modeled as follows:

    interface IInputProvider {}
    interface IOutput {}
    interface ISolution {
        IOutput add10(IInputProvider input);    
    }

    Note how we avoid specifying the actual sources of input and output yet. Indeed, we really don’t know where the “10 numbers” may be coming from in the future - these could be read from standard input, sent from the Internet, delivered by homing pigeons, or teleported via holographic technology of the future - all these options are easily supported by simply implementing IInputProvider appropriately.

    Of course, we need to do something about the output once we obtain it, even though the customer forgot to mention this part of the problem. This means we will also have to implement the following interface:

    interface IOutputConsumer {
        void consumeOutput(IOutput output);
    }

    And that is it - our general solution architecture! Let us start implementing it now.

    The Configuration

    The architecture we work with is completely abstract. An actual solution would need to provide implementations for the IInputProviderIOutputConsumer and ISolution interfaces. How do we specify which classes are implementing these interfaces? There are many possibilities - we could load this information from a database, for example, and create a dedicated administrative interface for managing the settings. For reasons of brevity, we’ll illustrate a simplistic XML-based factory method pattern.

    Namely, we shall describe the necessary implementations in the XML file config.xml as follows:

    <Config>
        <InputProvider class="Enterprise.NumberSequenceProvider"/>
        <OutputConsumer class="Enterprise.PeanoNumberPrinter"/>
        <Solution class="Enterprise.TenNumbersAddingSolution"/>
    </Config>

    A special SolutionFactory class can now load this configuration and create the necessary object instances. Here’s a prototype implementation:

    class SolutionFactory {
        private XDocument cfg;
        public SolutionFactory(string configFile) {
            cfg = XDocument.Load(configFile);
        }
        public IInputProvider GetInputProvider() {
            return Instantiate<IInputProvider>("InputProvider");
        }
        public IOutputConsumer GetOutputConsumer() {
            return Instantiate<IOutputConsumer>("OutputConsumer");
        }
        public ISolution GetSolution() {
            return Instantiate<ISolution>("Solution");
        }
        private T Instantiate<T>(string elementName) {
            var typeName = cfg.Root.Element(elementName)
                                   .Attribute("class").Value;
            return (T)Activator.CreateInstance(Type.GetType(typeName));
        }
    }

    Of course, in a real implementation we would also worry about specifying the XML Schema for our configuration file, and make sure it is possible to override the (currently hard-coded) “config.xml” file name with an arbitrary URI using command-line parameters or environment variables. In many real-life enterprise solutions in Java, for example, even the choice of the XML parsing library would need to be configured and initialized using its own factory pattern. I omit many of such (otherwise crucial) details for brevity here.

    I am also omitting the unit-tests, which, of course, should be covering every single method we are implementing.

    The Application

    Now that we have specified the architecture and implemented the configuration logic, let us put it all together into a working application. Thanks to our flexible design, the main application code is extremely short and concise:

    class Program {
        static void Main(string[] args) {
            var sf = new SolutionFactory("config.xml");
            var ip = sf.GetInputProvider();
            var oc = sf.GetOutputConsumer();
            var sol = sf.GetSolution();
            var op = sol.add10(ip);
            oc.consumeOutput(op);
        }
    }

    Amazing, right? Well, it does not really work yet, of course, because we still need to implement the core interfaces. However, at this point we may conclude the work of the senior architect and assign the remaining tasks of filling in the blanks to the the main engineering team.

    The Inputs and Outputs

    Now that we have set up the higher-level architecture, we may think a bit more specifically about the algorithm we plan to implement. Recall that we need to “add 10 numbers”. We don’t really know what these “numbers” should be - they could be real numbers, complex numbers, Roman numerals or whatnot, so we have to be careful and not rush into making strict assumptions yet. Let’s just say that a “number” is something that can be added to another number:

    interface INumber {
        INumber add(INumber other);
    }

    We’ll leave the implementation of this interface to our mathematicians on the team later on.

    At this step we can also probably make the assumption that our IInputProviderimplementation should somehow give access to ten different instances of an INumber. We don’t know how these instances are provided - in the worst case each of them may be obtained using a completely different method and at completely different times. Consequently, one possible template for an IInputProvider could be the following:

    interface ITenNumbersProvider: IInputProvider {
        INumber GetNumber1();
        INumber GetNumber2();
        INumber GetNumber3();
        INumber GetNumber4();
        INumber GetNumber5();
        INumber GetNumber6();
        INumber GetNumber7();
        INumber GetNumber8();
        INumber GetNumber9();
        INumber GetNumber10();
    }

    Note how, by avoiding the use of array indexing, we force the compiler to require that any implementation of our ITenNumbersProvider interface indeed provides exactly ten numbers. For brevity, however, let us refactor this design a bit:

    enum NumberOfANumber {
        ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN
    }
    interface ITenNumbersProvider: IInputProvider {
        INumber GetNumber(NumberOfANumber noan);
    }

    By listing the identities of our “numbers” in an enum we still get some level of compile-time safety, although it is not as strong any more, because enum is, internally, just an integer. However, we god rid of unnecessary repetitions, which is a good thing. Refactoring is an important aspect of enterprise software development, you see.

    The senior architect looked at the proposed interface at one of our regular daily stand-ups, and was concerned with the chosen design. “Your interface assumes you can provide immediate access to any of the ten numbers”, he said. But what if the numbers cannot be provided simultaneously and will be arriving at unpredictable points in time? If this were the case, an event-driven design would be much more appropriate:

    delegate void NumberHandler(NumberOfANumber id, INumber n);
     
    interface IAsynchronousInputProvider: IInputProvider {
        void AddNumberListener(NumberHandler handler);
    }

    The adding subsystem would then simply subscribe to receive events about the incoming numbers and handle them as they come in.

    “This is all good and nice”, responded the mathematician, “but for efficient implementation of the addition algorithm we might need to have all ten numbers available at the same time”. “Ah, software design 101”, says the senior architect. We simply install an adapter class. It would pool the incoming data until we have all of it, thus converting the IAsynchronousInputProvider, used for feeding the data, into an ITenNumbersProvider, needed by the mathematician:

    class SyncronizationAdapter: ITenNumbersProvider {
       private Dictionary<NumberOfANumber, INumber> nums;
       private ManualResetEvent allDataAvailableEvent;
     
       public SynchronizationAdapter(IAsynchronousInputProvider ainput){
           nums = new Dictionary<NumberOfANumber, INumber>();
           allDataAvailableEvent = new ManualResetEvent(false);
           ainput.AddNumberListener(this.HandleIncomingNumber);
       }
       private void HandleIncomingNumber(NumberOfANumber id, INumber n){
           nums[id] = n;
           if (Enum.GetValues(typeof(NumberOfANumber))
                   .Cast<NumberOfANumber>()
                   .All(k => nums.ContainsKey(k)))
                allDataAvailableEvent.Set();
       }
       public INumber GetNumber(NumberOfANumber noan) {
           allDataAvailableEvent.WaitOne();
           return nums[noan];
       }
    }

    Now the mathematician can work on his addition logic without having to know anything about the way the numbers are coming in. Convenient, isn’t it?

    Note that we are still only providing the input interface specification (along with an adapter) here. The actual implementation has to wait until our mathematicians come up with an implementation of INumber and the data engineers decide on how to obtain ten of these in the most optimal way.

    But what about IOutput? Let us assume that we expect to output a single number. This means that INumber must itself already be an instance of IOutput:

    interface INumber: IOutput {
       INumber add(INumber other);
    }

    No need to implement anything, we just add an interface tag to INumber! See how object-oriented design techniques allow us to save development time!

    The Order of Addition

    OK, so we now have a concept of an INumber which has a (binary) addition operation defined, an ITenNumbersProvider which can provide ten INumber instances (conveniently abstracting away the IAsynchrhonousInputProvider which actually obtains the numbers), and our goal is to add them up to get an IOutput which is itself an INumber. Sounds easy, right? Not so fast! How exactly are we going to add these numbers? After all, maybe in some cases adding ((a+b)+c)+d)… can be less efficient or precise than (a+(b+(c+(d…. Or maybe the optimal addition strategy is to start from the middle and then add numbers in some order? There do exist nontrivial ways to add up numbers, you know. To accommodate for any possible options in the future (so that we wouldn’t have to rewrite the code unnecessarily), we should design our solution in a way that would let us switch our addition strategy easily, should we discover a better algorithm. One way to do it is by abstracting the implementation behind the following interface:

    interface IAdditionStrategy {
       INumber fold(Func<NumberOfANumber, INumber> elements,
                    Func<INumber, INumber, INumber> op); 
    }

    You see, it is essentially a functor, which gets a way to access our set of numbers (via an accessor function) along with a binary operator “op”, and “folds” this operator along the number set in any way it deems necessary. This particular piece was designed by Harry, who is a huge fan of functional programming. He was somewhat disappointed when we decided not to implement everything in Haskell. Now he can show how everyone was wrong. Indeed, the IAdditionStrategy is a core element of our design, after all, and it happens to look like a fold-functor which takes functions as inputs! “I told you we had to go with Haskell!”, says Harry! It would allow us to implement all of our core functionality with a much higher level of polymorphism than that of a simplistic C# interface!

    The Solution Logic

    So, if we are provided with the ten numbers via ITenNumbersProvider and an addition strategy via IAdditionStrategy, the implementation of the solution becomes a very simple matter:

    class TenNumbersAddingSolution: ISolution {
       private IAdditionStrategy strategy;
       public TenNumbersAddingSolution() {
           strategy = ...
       }
       public IOutput add10(IInputProvider input) {
           var tenNumbers = new SynchronizationAdapter(
                          (IAsynchronousInputProvider)input);
           return strategy.fold(i => tenNumbers.GetNumber(i), 
                                (x,y) => x.add(y));
       }
    }

    We still need to specify where to take the implementation of the IAdditionStrategy from, though. This would be a good place to refactor our code by introducing a dependency injection configuration framework such as the Autofac library. However, to keep this text as short as possible, I am forced to omit this step. Let us simply add the “Strategy” field to our current config.xml as follows:

    <Config>
        ...
        <Solution class="Enterprise.TenNumbersAddingSolution">
            <Strategy class="Enterprise.AdditionStrategy"/>
        </Solution>
    </Config>

    We could now load this configuration setting from the solution class:

        ...
        public TenNumbersAddingSolution() {
            var cfg = XDocument.Load("config.xml");
            var typeName = cfg.Root
                   .Element("Solution")
                   .Element("Strategy")
                   .Attribute("class").Value;
            strategy = (IAdditionStrategy)Activator
                   .CreateInstance(Type.GetType(typeName));
        }
        ...

    And voilà, we have our solution logic in place. We still need to implement INumberIAdditionStrategyITenNumbersProvider and IOutputConsumer, though. These are the lowest-level tasks that will force us to make the most specific decisions and thus determine the actual shape of our final product. These will be done by the most expert engineers and mathematicians, who understand how things actually work inside.

    The Numbers

    How should we implement our numbers? As this was not specified, we should probably start with the simplest possible option. One of the most basic number systems from the mathematician’s point of view is that of Peano natural numbers. It is also quite simple to implement, so let’s go for it:

    class PeanoInteger: INumber {
        public PeanoInteger Prev { get; private set; }
        public PeanoInteger(PeanoInteger prev) { Prev = prev; }
        public INumber add(INumber b) {
            if (b == null) return this;
            else return new PeanoInteger(this)
                    .add(((PeanoInteger)b).Prev);
        }
    }

    Let us have IOutputConsumer print out the given Peano integer as a sequence of “1”s to the console:

    class PeanoNumberPrinter: IOutputConsumer {
        public void consumeOutput(IOutput p) {
            for (var x = (PeanoInteger)p; x != null; x = x.Prev)
                 Console.Write("1");
            Console.WriteLine();
        }
    }

    Finally, our prototype IAdditionStrategy will be adding the numbers left to right. We shall leave the option of considering other strategies for later development iterations.

    class AdditionStrategy: IAdditionStrategy {
        public INumber fold(Func<NumberOfANumber, INumber> elements,
                            Func<INumber, INumber, INumber> op) {
           return Enum.GetValues(typeof(NumberOfANumber))
                      .Cast<NumberOfANumber>()
                      .Select(elements).Aggregate(op);
        }
    }

    Take a moment to contemplate the beautiful abstraction of this functional method once again. Harry’s work, no doubt!

    The Input Provider

    The only remaining piece of the puzzle is the source of the numbers, i.e. the IAsynchronousInputProvider interface. Its implementation is a fairly arbitrary choice at this point - most probably the customer will want to customize it later, but for the purposes of our MVP we shall implement a simple sequential asynchronous generator of Peano numbers {1, 2, 3, …, 10}:

    class NumberSequenceProvider: IAsynchronousInputProvider {
        private event NumberHandler handler;
        private ManualResetEvent handlerAvailable;
     
        public NumberSequenceProvider() {
            handlerAvailable = new ManualResetEvent(false);
            new Thread(ProduceNumbers).Start();
        }
        public void AddNumberListener(NumberHandler nh) {
            handler += nh;
            handlerAvailable.Set();
        }
        private void ProduceNumbers() {
            handlerAvailable.WaitOne();
            PeanoInteger pi = null;
            foreach (var v in Enum.GetValues(typeof(NumberOfANumber))
                                  .Cast<NumberOfANumber>()) {
                    pi = new PeanoInteger(pi);
                    handler(v, pi);
            }
        }
    }

    Note that we have to be careful to not start publishing the inputs before the number processing subsystem attaches to the input producer. To achieve that we rely on the event semaphore synchronization primitive. At this point we can clearly see the benefit of choosing a powerful, enterprise-grade platform from the start! Semaphores would look much clumsier in Haskell, don’t you think, Harry? (Harry disagrees)

    So here we are - we have a solid, enterprise-grade, asynchronous, configurable implementation for an abstractly defined addition of abstractly defined numbers, using an abstract input-output mechanism.

    $> dotnet run
    1111111111111111111111111111111111111111111111111111111

    We do need some more months to ensure full test coverage, update our numerous UML diagrams, write documentation for users and API docs for developers, work on packaging and installers for various platforms, arrange marketing and sales for the project (logo, website, Facebook page, customer relations, all that, you know), and attract investors. Investors could then propose to pivot the product into a blockchain-based, distributed solution. Luckily, thanks to our rock solid design abstractions, this would all boil down to reimplementing just a few of the lower-level interfaces!

    Software engineering is fun, isn’t it?

    The source code for the developed solution is available here.

    Tags: , , , , ,

  • Posted by Konstantin 02.05.2017 2 Comments

    I happen to use the Amazon cloud machines from time to time for various personal and work-related projects. Over the years I've accumulated a terabyte or so of data files there. Those are mostly useless intermediate results or expired back-ups, which should be deleted and forgotten, but I could not gather the strength for that. "What if those datafiles happen to be of some archaelogical interest 30 years from now?", I thought. Keeping them just lying there on an Amazon machine is, however, a waste of money - it would be cheaper to download them all onto a local hard drive and tuck it somewhere into a dark dry place.

    But what would be the fastest way to download a terabyte of data from the cloud? Obviously, large downstream bandwidth is important here, but so should be a smart choice of the transfer technology. To my great suprise, googling did not provide me with a simple and convincing answer. A question posted to StackOverflow did not receive any informative replies and even got downvoted for reasons beyond my understanding. It's year 2017, but downloading a file is still not an obvious matter, apparently.

    Unhappy with such state of affairs I decided to compare some of the standard ways for downloading a file from a cloud machine. Although the resulting measurements are very configuration-specific, I believe the overall results might still generalize to a wider scope.

    Experimental Setup

    Consider the following situation:

    • An m4.xlarge AWS machine (which is claimed to have "High" network bandwidth) located in the EU (Ireland) region, with an SSD storage volume (400 Provisioned IOPS) attached to it.
    • A 1GB file with random data, generated on that machine using the following command:
      $ dd if=/dev/urandom of=file.dat bs=1M count=1024
    • The file needs to be transferred to a university server located in Tartu (Estonia). The server has a decently high network bandwidth and uses a mirrored-striped RAID for its storage backend.

    Our goal is to get the file from the AWS machine into the university server in the fastest time possible. We will now try eight different methods for that, measuring the mean transfer time over 5 attempts for each method.

    File Download Methods

    One can probably come up with hundreds of ways for transferring a file. The following eight are probably the most common and reasonably easy to arrange.

    1. SCP (a.k.a. SFTP)

    • Server setup: None (the SSH daemon is usually installed on a cloud machine anyway).
    • Client setup: None (if you can access a cloud server, you have the SSH client installed already).
    • Download command:

      scp -i ~/.ssh/id_rsa.amazon \
               ubuntu@$REMOTE_IP:/home/ubuntu/file.dat .

    2. RSync over SSH

    • Server setup: sudo apt install rsync (usually installed by default).
    • Client setup: sudo apt install rsync (usually installed by default).
    • Download command:

      rsync -havzP --stats \
            -e "ssh -i $HOME/.ssh/id_rsa.amazon" \
            ubuntu@$REMOTE_IP:/home/ubuntu/file.dat .

    3. Pure RSync

    • Server setup:
      Install RSync (usually already installed):

      sudo apt install rsync

      Create /etc/rsyncd.conf with the following contents:

      pid file = /var/run/rsyncd.pid
      lock file = /var/run/rsync.lock
      log file = /var/log/rsync.log
      
      [files]
      path = /home/ubuntu

      Run the RSync daemon:

      sudo rsync --daemon
    • Client setup: sudo apt install rsync (usually installed by default).
    • Download command:

      rsync -havzP --stats \
            rsync://$REMOTE_IP/files/file.dat .

    4. FTP (VSFTPD+WGet)

    • Server setup:
      Install VSFTPD:

      sudo apt install vsftpd

      Edit /etc/vsftpd.conf:

      listen=YES
      listen_ipv6=NO
      pasv_address=52.51.172.88   # The public IP of the AWS machine

      Create password for the ubuntu user:

      sudo passwd ubuntu

      Restart vsftpd:

      sudo service vsftpd restart
    • Client setup: sudo apt install wget (usually installed by default).
    • Download command:

      wget ftp://ubuntu:somePassword@$REMOTE_IP/file.dat

    5. FTP (VSFTPD+Axel)

    Axel is a command-line tool which can download through multiple connections thus increasing throughput.

    • Server setup: See 4.
    • Client setup: sudo apt install axel
    • Download command:

      axel -a ftp://ubuntu:somePassword@$REMOTE_IP/home/ubuntu/file.dat

    6. HTTP (NginX+WGet)

    • Server setup:
      Install NginX:

      sudo apt install nginx

      Edit /etc/nginx/sites-enabled/default, add into the main server block:

      location /downloadme {
          alias /home/ubuntu;
          gzip on;
      }

      Restart nginx:

      sudo service nginx restart
    • Client setup: sudo apt install wget (usually installed by default).
    • Download command:

      wget http://$REMOTE_IP/downloadme/file.dat

    7. HTTP (NginX+Axel)

    • Server setup: See 6.
    • Client setup: sudo apt install axel
    • Download command:

      axel -a http://$REMOTE_IP/downloadme/file.dat

    8. AWS S3

    The last option we try is first transferring the files onto an AWS S3 bucket, and then downloading from there using S3 command-line tools.

    • Server setup:
      Install and configure AWS command-line tools:

      sudo apt install awscli
      aws configure

      Create an S3 bucket:

      aws --region us-east-1 s3api create-bucket \
          --acl public-read-write --bucket test-bucket-12345 \
          --region us-east-1

      We create the bucket in the us-east-1 region because the S3 tool seems to have a bug at the moment which prevents from using it in the eu regions.

      Next, we transfer the file to the S3 bucket:

      aws --region us-east-1 s3 cp file.dat s3://test-bucket-12345
    • Client setup:
      Install and configure AWS command-line tools:

      sudo apt install awscli
      aws configure
    • Download command:

      aws --region us-east-1 s3 cp s3://test-bucket-12345/file.dat .

    Results

    Here are the measurement results. In case of the S3 method we report the total time needed to upload from the server to S3 and download from S3 to the local machine. Note that I did not bother to fine-tune any of the settings - it may very well be possible that some of the methods can be sped up significantly by configuring the servers appropriately. Consider the results below to indicate the "out of the box" performance of the corresponding approaches.

    Although S3 comes up as the fastest method (and might be even faster if it worked out of the box with the european datacenter), RSync is only marginally slower, yet it is easier to use, requires usually no additional set-up and handles incremental downloads very gracefully. I would thus summarize the results as follows:

    Whenever you need to download large files from the cloud, consider RSync over SSH as the default choice.

    Tags: , , , ,

  • Posted by Konstantin 27.07.2016 3 Comments

    While writing the previous post I was thinking of coming up with a small fun illustration for Aframe. I first heard about AFrame at the recent European Innovation Academy - a team-project-based entrepreneurship summer school. The team called MemVee was aiming to develop an AFrame-based site which would allow students to design and view interactive "Memory Palaces" - three-dimensional spaces with various objects related to their current study topics, organized in a way that simplifies remembering things for visual learners. Although I have never viewed a "Memory Palace" as something beyond a fantastic concept from a Sherlock Holmes TV episode, I am a visual learner myself and understand the importance of such illustrations. In fact, I try to use and preach graphical references in my teaching practice whenever I find the time and opportunity:

    • In this lecture the concept of a desk is used as a visual metaphor of "structuring the information" as well as to provide an outline to the talk.
    • Here and here an imaginary geographical map is used in a similar context.
    • For the computer graphics course I had to develop some "slides" as small OpenGL apps for visualizing the concepts during the lecture. This has been later taken to extreme heights in the practical materials designed by Raimond-Hendrik, who went on to give this course (alongside with a seminar) in the following years. Unfortunately the materials are closed for non-participants (yet I still hope they will be opened some day, do you read this, Raimond?), but the point is that every single notion has a tiny WebGL applet made to illustrate it interactively.
    • Once I tried to make a short talk about computer graphics, where the slides would be positioned on the walls of a 3D maze, so that to show them I'd have to "walk through the maze", like in a tiny first-person shooter game. Although this kind of visualization was not at all useful as a learning aid (as it did not structure anything at all), it none the less looked cool and was very well appreciated by the younger audience of the talk, to whom it was aimed at.

    I have lost the sources of that last presentation to a computer error and decided to recreate a similar "maze with slides" with AFrame. The night was long and I got sucked into the process to the point of making an automated tool. You upload your slides, and it generates a random maze with your slides hanging on the walls. It is utterly useless, but the domain name "slideamaze.com" was free and I could not resist the pun. [Update from 2018 - the domain expired and the project migrated to slideamaze.ing.ee.]

    Check it out. If you are into programming-related procrastination, try saving the "mazes" generated by the tool on your computer and editing the A-frame code to, say, add monsters or other fun educational tools into the maze.

    Tags: , , , , , ,

  • Posted by Konstantin 22.01.2015 38 Comments

    Update from year 2017: The tool described in this post DOES NOT WORK with recent versions of Skype. Either these versions stopped saving removed messages altogether, or they are doing it in a novel manner not recognized by the tool.

    In other words - you would only recover "removed" messages if you are running older version of Skype (or these messages were sent at the time you were using that older version).

    Yesterday I happened to attend a discussion about the security and privacy of information stored locally in Skype and Thunderbird profiles. It turns out, if you obtain a person's Skype profile directory, you will be able to log in as him without the need to know the password. In addition, Dominique made a remark that Skype does not really delete the messages that are marked as "removed" in the chat window. I found that curious and decided to take a closer look.

    Indeed, there is a bunch of *.dat files in the chatsync subdirectory of the Skype's profile, which preserve all messages along with all their edits or deletions. Unfortunately, the *.dat files are in some undocumented binary format, and the only tool I found for reading those lacks in features. However, hacking up a small Python parser according to what is known about the format, along with a minimalistic GUI is a single evening's exercise, and I happened to be in the mood for some random coding.

    Skype Chatsync Viewer

    Skype Chatsync Viewer

    Now, if you want to check out what was that message you or your conversation partner wrote before it was edited or deleted, this package will help. If you are not keen on installing Python packages, here is a standalone Windows executable.

    Tags: , , , , , , ,

  • Posted by Konstantin 12.03.2014 No Comments

    Whenever you write a program, you want this program to behave correctly and do what you want it to do. Thus, programming always goes together with the mental act of proving to yourself (and sometimes to other people as well), that the code you write is correct. Most often this "proof" is implicit, dissolved in the way you write your code and comment it. In fact, in my personal opinion, "good code" is exactly the one, where a human-reviewer is able to verify its correctness without too much effort.

    It is natural to use computers to help us verify correctness. Everyone who has ever programmed in a strictly-typed programming language, such as Java or Haskell, is familiar with the need to specify types of variables and functions and follow strict rules of type-safety. But of course, ensuring type-safety is just the compiler's way to help you ensure some basic claims about the program, such as "this variable will always contain an integer" or "this function will always be invoked with exactly three parameters".

    This is very convenient, yet can be somewhat limiting or annoying at times. After all, type-safety requires you to write code that "can be type-checked". Although very often this is expected of "good code" anyway, there are situations where you would like some more flexibility. For this reason some languages impose no type-safety rules at all (e.g. Python or Javascript), and some languages let you disable type-checking for parts of code.

    Rather than disabling the type checker, another principled way to allow more flexibility is to make the type-checker smarter. This is the promise of dependent types. In principle, a language, which supports dependent types, would let you make much more detailed statements about your program and have your program automatically checked for correctness with respect to those statements. Rather than being limited to primitive claims like "this variable is an integer", the use of dependent types enables you to assert things like "this is a sorted list", or "this is an odd integer", and so on up to nearly arbitrary level of detail, in the form of a type annotation. At least that much I found out during a course at the recent winter school.

    The course was based on the Agda programming language, and the first thing I decided to try implementing in Agda is a well-typed version of the following simple function:

    f t = if t then true else 0

    It might look like something trivial, yet most traditional strictly typed languages would not let you write this. After all, a function has to have a return type, and it has to be either a Boolean or an Integer, but not both. In this case, however, we expect our function to have a parameter-dependent type:

    f : (t : Bool) → if t then Bool else Integer

    Given that Agda is designed to support dependent types, how complicated could it be to implement such a simple function? It turns out, it takes a beginner more than an hour of thinking and a couple of consultations with the specialists in the field. The resulting code will include at least three different definitions of "if-then-else" statements and, I must admit, some aspects of it are still not completely clear to me.

    IF-THEN-ELSE in Agda

    IF-THEN-ELSE in Agda, including all the boilerplate code

    This is the longest code I've ever had to write to specify a simple if-then-else statement. The point of this blog post is to share the amusement and suggest you to check out Agda if you are in the mood for some puzzle-solving.

    As for dependent types, I guess those are not becoming mainstream any time soon.

    Tags: , , , , ,

  • Posted by Konstantin 03.09.2012 2 Comments

    For many people, the ability for learning and adaptation seems like something unique, extremely complicated and mysterious. Indeed, those are the abilities we almost exclusively associate with high levels of intelligence and knowledge. This is, however, an illusion. Although adaptive behaviour might indeed look complex, it is not necessarily driven by "intelligent" mechanisms. One of the best illustrations of this is a fully-fledged self-learning machine made from plain matchboxes.

    A Tic-Tac-Toe machine made by James Bridle

    A Tic-Tac-Toe machine by James Bridle

    The idea for such a machine was first introduced in 1960 by Donald Michie, who devised a simple self-learning algorithm for Tic-Tac-Toe (reminiscent of what is now known to be Reinforcement Learning). Due to lack of appropriate computing power, he implemented it "in hardware" using 300 or so matchboxes.

    The idea of the machine is simple. There is a matchbox corresponding to each game position, where the "computer" has to make a move. The matchbox contains colored beads, each color corresponding to a particular move. The decision is made by picking a random bead from the matchbox. Initially (when the machine is "untrained"), there is an equal number of beads of each color, and the machine thus makes equiprobably random turns. After each game, however, the machine is "punished" by removing beads, corresponding to losing turns, or "rewarded" by adding beads, corresponding to winning turns. Thus, after several games, the machine will adapt its strategy towards a winning one.

    The idea was popularized by Martin Gardner in one of his Scientific American articles (later published in the book "The Unexpected Hanging and Other Mathematical Diversions"). Gardner invented a simple game of  "Hexapawn", and derived a matchbox machine for it, which only required as little as 19 matchboxes. He also suggests in his article, however, to create a matchbox machine for "Mini-checkers" - checkers played on a 4x4 board. Ever since I saw this article some 20 or so years ago I was thinking of making one. This summer, while teaching a machine learning course in a summer school in Kiev, I actually made one. I could use it to both fulfil my ages-old desire as well as a teaching aid. You can make one too now, if you are interested.

    The Mini-checkers Machine

    The rules of mini-checkers are exactly like those of usual checkers, with three modifications:

    • The game is played on a 4x4 field. White is the first one to move. Machine plays for black.
    • Whenever both players get a King, the game immediately ends in a draw.
    • The King must always move to the furthest possible position in the chosen direction.

    To make the machine, you first have to buy and empty 24 matchboxes. Next, print out and stick the 24 game positions onto the boxes. Draw on each box all the possible black's moves as arrows using colored markers. Finally, for  each colored arrow, add 2 beads of the same color into the matchbox. That's it, your machine is ready to play.

    The Mini-checkers machine

    The Mini-checkers machine

    The game proceeds as already described: whenever the machine (the black player) has to make a decision (i.e. whenever it has to make a move and there is more than one possibility), find the matchbox with the current position depicted on it, shake it, and pick a random bead. This will tell you the decision of the machine. If the corresponding matchbox is empty, the machine forfeits. You should keep the matchboxes, corresponding to the moves that were made, open until the end of the game.

    Once the game is over, the machine is "taught":

    • If the machine won, do nothing.
    • If the game was a draw, remove the bead corresponding to the machine's last move from the matchbox, unless it was the last bead of that color in the box.
    • If the machine lost, remove all the beads, corresponding to the machine's last move, from the last matchbox.

    It takes about 30 games or so for the machine to actually learn to play well enough. Of course, a human would understand the strategy much earlier, but it's fun none the less.

    Playing with the machine will immediately lead you towards two important questions:

    • How efficient is the suggested learning procedure? Can it be improved and generalized?
    • How do you make a matchbox machine for a more complex game without having to manage thousands of matchboxes.

    As far as I know, contemporary machine learning has only partial answers to both of them.

    Tags: , , , ,

  • Posted by Konstantin 23.10.2009 No Comments

    The Alt+Tab key combination is perhaps one of the most well-known keyboard shortcuts in Windows, with the only competitors for the throne being Ctrl+C and Ctrl+V. And no matter, whether you are used to alt-tabbing for working purposes or simply as a means of efficient undercover procrastination, unless you are a complete novice you probably have this skill at the level of basic instincts.

    Unfortunately, there are cases where the instinct becomes inconvenient. Namely, whenever you use an application that displays multiple documents in separate tabs (like Firefox or Notepad++) or in separate child windows (like R), you are expected to use Ctrl+Tab rather than Alt+Tab to switch among documents. However, most of the time switching among documents is subjectively perceived as nothing essentially different than switching among programs, hence the fact that Alt+Tab won't work normally for that case is highly unintuitive. The typical case with me is that I would accidentally use Alt+Tab attempting to switch between the editor and console in R and unexpectedly find a completely different window in front of me, which is quite annoying.

    Although I am pretty sure I am not the only one to experience this kind of frustration, it is surprising that there does not seem to be any easily available solution to this trivial issue known to google. Thus, considering that the whole problem can be solved to a fair extent by simply translating Alt keypresses into Ctrl in a smart way, I've made a smallish program that does exactly that.

    I'm quite happy with the result and can't help sharing it with you.

    Download: Binary, Source.

    Tags: , , ,