• Posted by Konstantin 07.12.2008

    Logic versus Statistics

    Consider the two algorithms presented below.

    Algorithm 1:

       If, for a given brick B,
          B.width(cm) * B.height(cm) * B.length(cm) > 1000
       Then the brick is heavy

    Algorithm 2:

       If, for a given male person P,
          P.age(years) + P.weight(kg) * 4 - P.height(cm) * 2 > 100
       Then the person might have health problems
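
    For concreteness, here is a minimal executable sketch of the two decision rules exactly as written above (the function and parameter names are invented for illustration; the thresholds and units come straight from the pseudocode):

       def is_heavy_brick(width_cm, height_cm, length_cm):
           # Algorithm 1: a brick is "heavy" if its volume exceeds 1000 cm^3
           return width_cm * height_cm * length_cm > 1000

       def might_have_health_problems(age_years, weight_kg, height_cm):
           # Algorithm 2: a linear score over age, weight and height
           return age_years + weight_kg * 4 - height_cm * 2 > 100

       # Example: a 10 x 10 x 20 cm brick and a 40-year-old, 110 kg, 180 cm man
       print(is_heavy_brick(10, 10, 20))                # True
       print(might_have_health_problems(40, 110, 180))  # True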

    Note that the two algorithms are quite similar, at least from the point of view of the machine executing them: in both cases a decision is produced by performing some simple mathematical operations on a given object. The algorithms are also similar in their behaviour: both work well on average, but can make mistakes from time to time when given an unusual person or a rare hollow brick. However, there is one crucial difference between them from the point of view of a human: it is much easier to explain how the algorithm "works" in the first case than in the second one. And this is what, in general, distinguishes traditional "logical" algorithms from machine learning-based approaches.

    Of course, explanation is a subjective notion: something that looks like a reasonable explanation to one person might seem incomprehensible or insufficient to another. In general, however, any proper explanation is always a logical reduction of a complex statement to a set of "axioms". An "axiom" here means any "obvious" fact that requires no further explanation. Depending on the subjective simplicity of the axioms and the obviousness of the logical steps, the explanation can be judged as good or bad, easy or difficult, true or false.

    Here is, for example, an explanation of Algorithm 1 that will hopefully satisfy most readers:

    • The volume of a rectangular object can be computed as its width*height*length. (axiom, i.e. no further explanation needed)
    • A brick is a rectangular object. (axiom)
    • Thus, the volume of a brick can be computed as its width*height*length. (logical step)
    • The mass of a brick is its volume times the density. (axiom)
    • We consider the density of a brick to be at least 1 g/cm3, and we consider a brick heavy if it weighs at least 1 kg. (axiom)
    • Thus, if width*height*length > 1000, the mass of the brick exceeds 1000 g = 1 kg, i.e. the brick is heavy. (logical step, end of explanation)

    If you try to deduce a similar explanation for Algorithm 2, you will probably run into problems: there are no nice and easy "axioms" to start with, unless, perhaps, you are really deep into modeling body fat and can assign a meaning to the sum of a person's age and his weight. Things become even murkier if you consider a typical linear classification algorithm used in OCR systems for deciding whether a given picture contains the handwritten letter A or not. The algorithm in its simplest form might look as follows:

       If \sum_{i,j} a_{ij} \mathrm{pixel}_{ij} > 0
       Then there is a letter A on the picture,

    where a_{ij} are some real numbers that were obtained using an obscure statistical procedure from an obscure dataset of pre-labeled pictures. There is really no good way to explain why the values of a_{ij} are what they are and how this algorithm gets the result, other than to present the dataset of pictures it was trained upon and state that "well, these are all pictures of the letter A, therefore our algorithm detects the letter A on pictures".
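
    To give a feel for where such numbers come from, here is a toy sketch of the same decision rule in Python. The "pictures", their labels and the least-squares fit below are stand-ins invented purely for illustration; the point is only that the weights a_{ij} fall out of a statistical fitting procedure rather than out of any explicit definition of the letter A:

       import numpy as np

       # A stand-in for "an obscure dataset of pre-labeled pictures":
       # each picture is a flattened 16x16 array of pixels, labeled +1 ("A") or -1 ("not A").
       rng = np.random.default_rng(0)
       pictures = rng.random((200, 16 * 16))           # 200 synthetic "images"
       labels = np.where(pictures[:, 0] > 0.5, 1, -1)  # arbitrary labels, for illustration only

       # One possible "obscure statistical procedure": a least-squares fit of a linear score.
       a, *_ = np.linalg.lstsq(pictures, labels, rcond=None)  # the flattened weights a_ij

       def looks_like_letter_A(picture):
           # The classifier from the text: sum_ij a_ij * pixel_ij > 0
           return float(picture @ a) > 0

       print(looks_like_letter_A(pictures[0]))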

    Note that, in a sense, such an "explanation" uses each picture of the letter A from the training set as an axiom. However, these axioms are not the kind of statements we used to justify Algorithm 1. The evidence they provide is way too weak for traditional logical inferences. Indeed, the fact that one known image has a letter A on it does not help much in proving that some other given image has an A too. Yet, as there are many of these "weak axioms", one statistical inference step can combine them into a well-performing algorithm. Notice how different this step is from the traditional logical steps, which typically derive each "strong" fact from a small number of other "strong" facts.

    So to summarize again: there are two kinds of algorithms, logical and statistical.
    The former are derived from a few strong facts and can be logically explained. Very often you can find the exact specifications of such algorithms on the internet. The latter are based on a large number of "weak facts" and rely on induction rather than logical (i.e. deductive) explanation. Their exact specification (e.g. the actual values of the parameters a_{ij} used in that OCR classifier) does not make as much general sense as the description of classical algorithms. Instead, you would typically find general principles for constructing such algorithms.

    The Human Aspect

    What I find interesting is that the mentioned dichotomy stems more from human psychology than from mathematics. After all, the "small" logical steps as well as the "big" statistical inference steps are all just "steps" from the point of view of maths and computation. The crucial difference is mainly due to a human aspect. The logical algorithms, as well as all of the logical decisions we make in our life, are what we often call "reason" or "intelligence". We make decisions based on reasoning many times a day, and we could easily explain the small logical steps behind each of them. But even more often we make the kind of reason-free decisions that we call "intuitive". Take, for example, visual perception and body control. We do these things by analogy with our previous experiences and cannot really explain the exact algorithm. Professional intuition is another nice example. Suppose a skilled project manager says "I have doubts about this project because I've seen a lot of similar projects and all of them failed". Can he justify his claim? No: no matter how many examples of "similar projects" he presents, none of them will be considered reasonable evidence from the logical point of view. Is his decision valid? Most probably yes.

    Thus, the aforementioned classes of logical (deductive) and statistical (inductive) algorithms seem to correspond directly to reason and intuition in the human mind. But why do we, as humans, tend to consider intuition inexplicable, and thus to make "less sense" than reason? Note that the formal difference between the two classes of algorithms is that in the former case the number of axioms is small and the logical steps are "easy". We are therefore capable of somehow representing the separate axioms and the small logical steps in our minds. However, when the number of axioms is virtually unlimited and the statistical step that combines them is far more complicated, we seem to have no convenient way of tracking them consciously, due to our limited brain capacity. This is somewhat analogous to how we can "really understand" why 1+1=2, but will have difficulties trying to grasp the meaning of 121*121=14641. Instead, the corresponding inductive computations can be "wired into" the lower, unconscious level of our neural tissue by learning patterns from experience.

    The Consequences

    There was a time at the dawn of computer science when much hope was put into the area of Artificial Intelligence. There, people attempted to devise "intelligent" algorithms based on formal logic and proofs. The promise was that in a number of years the methods of formal logic would develop to such heights that computer algorithms could attain a "human" level of intelligence. That is, they would be able to walk like humans, talk like humans and do a lot of other cool things that we humans do. Half a century has passed and this still has not happened. Computer science has seen enormous progress, but we have not found an algorithm based on formal logic that could imitate intuitive human actions. I believe that we never shall, because devising an algorithm based on formal logic actually means understanding and explaining an action in terms of a fixed number of axioms.

    Firstly, it is unreasonable to expect that we can precisely explain much of the real world, because, strictly speaking, there exist mathematical statements that cannot be explained even in principle. Secondly, and most importantly, this expectation contradicts the assumption that most "truly human" actions are intuitive, i.e. we are simply incapable of understanding them.

    Now what follows is a strange conclusion. There is no doubt that sooner or later computers will get really good at performing "truly human" actions; the trend is already clear. But, contrary to our expectations, the fact that we shall create a machine that acts like a human will not really bring us closer to understanding how "a human" really "works". In other words, we shall never create Artificial Intelligence. What we are creating now, whether we want it or not, is Artificial Intuition.


  • 7 Comments

    1. Konstantin on 09.12.2008 at 14:03

      There is one more amusing consequence that I see here. I set it aside to keep the main text somewhat more coherent, and invite the interested readers to comment. Namely, I believe that such a complex system as the living cell could also turn out to be "too complex" to be understood by a human. Which means that any nontrivial process taking place in the cell can only be regarded by a human as an "intuitive" kind of thing. From which it follows that bioinformatics is never going to really help to "explain" what happens in the cell. What it will do, most importantly, is produce algorithms that can provide answers to questions and help us make intelligent decisions. I can imagine a situation where a machine suggests that we "modify this gene in order to obtain property XXX" and we shall have to blindly trust the machine because there is no way we can verify or understand this suggestion. It's more or less what we have now, in fact; the problem is just that the suggestions are still way too poor to be trusted.

    2. Taivo Lints on 14.12.2008 at 19:14

      Definitely an interesting and thought-provoking post. However, I feel as if there were a bit of sand somewhere between the cogs... 🙂 Putting together a comprehensive comment would require some further thinking, but maybe the following short first-glance observations (thus possibly irrelevant/inaccurate) help to kick off the discussion.

      "We consider the density of a brick to be at least 1g/cm3 and we consider a brick heavy if it weighs at least 1 kg. (axiom)"

      Not a particularly strong axiom, I would say. More like statistical observations.

      As for Algorithm 2, the first-approximation interpretation I'd come up with would be:
      * Health risks increase with age.
      * Health risks increase with excess weight.
      * Health risks decrease with body height.

      Hm. Surely not as good and detailed and logical an explanation as the one for Algorithm 1. But then again, the given two algorithms do not seem to be comparable in their answers' accuracy, information content, or the acceptable range of input variables (i.e., the domain of validity on the argument axis) either. Yeah, it is partly because the second one is more "statistical", but it should be possible to find two algorithms of comparable accuracy and information content as well, which would make the comparison a bit more insightful. Also, as noted, the first algorithm's axioms are not THAT strong after all, and, as you note yourself, with better clinical-physiological knowledge the explanation of the second one would likely also be a bit easier to accomplish. (And yes, I did notice that one of the main points of the post is exactly that those two algorithms are not really as different as they seem to be. My problem might partly be due to the fact that they do not seem particularly different to me in the first place 🙂 (at least on the level discussed here). Might be my lack of logical deepness. Might be not.)

      As a slight deviation -- how to classify the algorithms / equations of quantum mechanics? Are they statistical? Or some third class? At least they don't seem to be particularly logical (yet?). I mean, the "Shut up and calculate!" null interpretation of quantum mechanics is apparently the view of quite a few physicists...

      Now, intelligence vs. intuition. Again, I do not feel that they are such separate concepts in the first place. There might surely be some measurable bias in general opinion towards EQUATING intelligence with logical reasoning, but when looking at the collection of published definitions, the equation generally does not hold: http://www.vetta.org/definitions-of-intelligence/ . And even public opinion is changing under the influence of bestsellers like "Emotional Intelligence" and whatnot.

      "we haven’t found an algorithm based on formal logic that could imitate intuitive human actions. I believe that we never shall, because devising an algorithm based on formal logic actually means understanding and explaining an action in terms of a fixed number of axioms."

      I'm, to put it mildly, not particularly enthusiastic about the formal logic approach either 🙂 But as for the necessity of UNDERSTANDING -- no, why? Sure, pure logicians would abhor the idea, but generating formal-logic-based algorithms that solve complex problems is, as long as such algorithms exist in principle, perfectly doable using, say, brute-force search or artificial evolution.

      "Secondly, and most importantly, this expectation contradicts the assumption that most “truly human” actions are intuitive, i.e., we are simply incapable of understanding them."

      Could you elaborate on that? Incapable because they are computationally irreducible to the human cognitive level in any possible future (including those where we will have mental capabilities technologically boosted by orders of magnitude), or because you imply fundamental indeterminism? Also, what exactly are those "truly human" actions? (also, cf. http://www.metsas.ee/et/tekstid/tanelt/inimlikkus_on_loomalikkus , which might have a few outdated remarks (like "No robot can jog through a forest: that requires a powerful brain."), but is interesting nevertheless).

      " http://artificial-intuition.com/ "

      Apart from potentially new specific methods, which are warmly welcome, does Artificial Intuition have any ideological differences from Computational Intelligence ( http://en.wikipedia.org/wiki/Computational_Intelligence ), or the earlier scruffies ( http://en.wikipedia.org/wiki/Neats_vs._scruffies ), or, going all the way back, the very early AI people ( http://en.wikipedia.org/wiki/History_of_artificial_intelligence#Cybernetics_and_early_neural_networks )?

      Also, if you don't mind, some short comments on Monica Anderson's text referenced on that Artificial Intuition site.

      "Walkers keep some of their legs on the ground at all times which means you only lift one or a few of them at any one time. You will also want to avoid stepping on your own feet. Such coordination requires control from a central point."

      No, it doesn't.

      "The single skill of prediction, even if it often fails, yields a big advantage in how well you survive and how likely you are to breed."

      If it fails more than 50% of the time, it can be worse than doing things at random. Also, there are developmental and maintenance costs of the corresponding biological apparatus to be taken into account when talking about advantageousness.

      "Evolution rarely throws anything away."

      Ummm.. seriously??? And I thought that finding anything WORTH RETAINING was what is rare in evolution...

      "brain, that contains numerous but nearly identical neurons"

      http://en.wikipedia.org/wiki/Neuron#Classes

      "7. Opinions
      8. Hypotheses
      9. Multiple Points of View
      ...
      ... these issues remain as unresolved problems for Logic-based representational systems."

      http://en.wikipedia.org/wiki/Subjective_logic

      "Artificial Intuition provides an alternative strategy to deal with these since it is immune to all of them by virtue of not using Logic-based models; there is no model there to get confused by these problems."

      First of all, if there is something that does predicting, there IS some kind of a model there, it just might be implicit and non-logic-based. Secondly, if logic-based models cannot deal with certain problems and Artificial Intuition does not use logic-based models, it does NOT just out of the blue imply that Artificial Intuition thus CAN deal with those problems (c'mon, think (oh, the irony) logically ;D ).

      "Computer-based intuition - "Artificial Intuition" - is quite straightforward to implement, but requires computers"
      "Computer hardware is designed based on principles of Boolean Logic, and Logic is also used in programming them."
      "Intuition operates at a "level below logic"."

      If Artificial Intuition can be implemented on a computer (she does not explicitly say that it is implementable purely on a computer, but she doesn't mention any other requirements apart from large memory either), and a computer is based on logic, I would assume that intuition is implementable in logic...

      "Bizarre Domains ... Chaotic Systems"
      "Physics and related sciences use logical and mathematical models to describe the world."
      "Logic cannot handle Bizarre Systems"

      I'm a bit lost here... If she considers the equations of physics to be in the same class as logical models, and says that logic cannot handle bizarre systems, then how come she uses chaotic systems as an example of bizarre domains? I mean, deterministic chaos is handled quite well by the equations of nonlinear dynamics, right?

      "Friedrich Hayek has observed that all sensory information is converted to one single kind of nerve signals before reaching the brain. The brain then processes these incoming nerve signals by sending further nerve signals to other parts of the brain."

      It's not all about a single kind of nerve signals. There's a lot of chemical neuromodulatory activity going on as well ( http://en.wikipedia.org/wiki/Neuromodulation ).

      "Values of Logic based Methods
      ...
      Timeliness
      We expect to get the result in bounded time.
      ..."
      "The brain needs none of these seven values."

      How come timeliness is a value of the logic-based approach but not of Artificial Intuition, and how come the brain does not need it???

      Ok, I hope this didn't sound like bashing, because in general I'm really in favor of approaches not based on formal logic 🙂

      As for the bioinformatics comment, there is a big difference between asking if bioinformatics will explain the cell or if it will HELP to explain the cell. While the first option is indeed somewhat dubious, the latter, in my humble opinion, is almost beyond doubt. "What it will do, most importantly, is produce algorithms that can provide answers to questions and help us make intelligent decisions." That would include intelligent decisions by researchers about where to focus with their large toolbox of other approaches so as to make progress in the understanding.

      Best,
      Taivo

      1. Konstantin on 25.12.2008 at 22:44

        Thanks for an interesting comment. I'm afraid my reply might somewhat disappoint you, as it is probably not as deep as your questions deserve.

        Logic vs Statistics
        It might very well be that the example with Algorithm 2 is not the best one, but I did my best and just could not find anything better to illustrate two algorithms equivalent "in form, meaning and size" and yet somehow different in concept. Please note that your explanation attempt is not really proper, as it does not address the basic mechanics of the algorithm: why do you multiply weight in kilograms by 4 and height in centimeters by 2? I doubt you'll find an easy way to explain that, comparable to the clear explanation of Algorithm 1.

        And if the first example is not enough, consider the linear-classifier OCR algorithm presented afterwards. In my opinion, it is completely inexplicable to a human, yet it could work with close to 100% precision.

        Once again, you might say that OCR works because it was "constructed using sound principles", but this is, to my mind, a "meta-explanation". It somewhat justifies the class of algorithms but does not explain anything about a specific instance of the algorithm itself. Once you've found proper classifier parameters, you can't really "prove to yourself" that they are "the correct ones"; you just have to believe they work. This is in contrast to the logical algorithms, where you can convince yourself of the algorithm's correctness.

        As an even more involved example, consider an algorithm A that generates a statistical estimation algorithm B (by, say, brute-force search or genetic optimization or whatever else), and algorithm B is then used to construct a model C from the given data. Even though you will see that C works well, you will have only a really vague understanding of why it works.

        And of course, I wouldn't strictly limit the "intuitive" algorithms to statistical methods only. If there were an algorithm that could indeed be proven correct logically, but the proof were so complicated that it could only be checked on a computer, I'd call that also an "intuitive" kind of thing.

        All in all, what I wanted to say by this point is that I feel there is an important difference between statistical and logical algorithms. It is kinda tricky to see why these algorithms are different, and I found the most important difference in the fact that the former are "inexplicable" (meta-explicable, at best). And then, by observing that AI research gave way to machine learning in recent years, the straightforward conclusion follows that when at some point we create a really smart computer, we won't ourselves gain much understanding as to why the computer is smart and how it really does that. It's somewhat like how designing a chess-playing algorithm doesn't necessarily make you any better at chess yourself. Which is close to obvious, but still unexpectedly sad, isn't it?

        As a slight deviation - how to classify the algorithms / equations of quantum mechanics?

        This is somewhat off-topic. I believe most basic physical laws are axioms. We check them against the real world and don't really explain them further.

        Intelligence vs Intuition
        I don't want to claim that "general human intelligence" is necessarily something based on logic; there are various ways to define it. However, here I have mainly used the term "intelligence" in reference to "artificial intelligence", a.k.a. AI, which is a term often used to denote the specific area of computer science related to computerized logic. I then thought of "artificial intuition" as a nice term to contrast with "artificial intelligence". For the purposes of clarity I then had to stick to the term "intelligence" in this limited meaning.

        "Secondly, and most importantly, this expectation contradicts the assumption that most "truly human" actions are intuitive, i.e., we are simply incapable of understanding them." Could you elaborate on that?

        The assumption is something I can't prove, but I believe most people could understand what I mean. Can you explain how you see and recognize objects (that is, describe a step-by-step algorithm and prove that it solves the problem)? Can you explain how you discern speech? Can you explain how you perform rational thinking, after all? These are some actions which are "truly human", in the sense that when we teach machines to be good at them we'll be really confused. In other words, "truly human" are those skills that we would like to check for in a Turing test.

        I believe the human brain could turn out to be inherently incapable of "understanding" these. It could be, for example, because "understanding" requires using up several neurons to mentally represent each axiom and logical transition. And when you try to explain-prove these complicated actions, the number of "axioms" and the "magnitude" of the transitions are too large with respect to the total number of neurons in our "rational brain". And of course, we don't need to understand these actions in order to be able to perform them.

        As to what happens when we have a "computer-boosted" brain, I don't know. "Understanding" is, as I stated above, a purely true-human-related thing. It kinda doesn't seem to make sense to ask whether a microprocessor "understands" or is capable of understanding the algorithm it is executing. Let us say it "does"; who cares?

        Artificial-intuition.com
        As to the reference to artificial-intuition.com, I must admit I haven't taken the time to read all of that site. The simple fact is, when I was nearly finished writing up the post, I thought I'd google the nice "artificial intuition" title I had come up with. And I was pleased to find there's an aptly named website presenting a view similar to what I've discussed here. I do agree there are a lot of arguable points in Monica Anderson's text. I agree with most of what you've pointed out, although I wouldn't be as critical.

        I'd still answer some of your complaints (just for the pure joy of demagogic rhetoric).

        “The single skill of prediction, even if it often fails, yields a big advantage in how well you survive and how likely you are to breed.”

        If it fails more than 50% of the time, it can be worse than doing things at random. Also, there are developmental and maintenance costs of the corresponding biological apparatus to be taken into account when talking about advantageousness.

        Firstly, in most cases you need a non-binary prediction, say, predicting one possibility out of 10 or 100. And in this case 50% precision is enormously good. In many other cases you don't measure prediction quality in percent but rather as, say, "error" (the distance of your prediction from the true outcome).
        Secondly, it's kinda strange to argue with the claim that the skill of prediction is something crucial. I'd personally go further and claim that 99% of what a living being does in its life is predict, but that's a topic for a separate philosophy post.

        “brain, that contains numerous but nearly identical neurons”

        Well, that's just one plausible model, and it's good enough for the sake of exposition there. Secondly, the neurons are certainly nearly identical, if you define "nearly" properly :).

        First of all, if there is something that does predicting, there IS some kind of a model there, it just might be implicit and non-logic-based...

        This is a question of how you define "logic". For example, I prefer to regard a statistical inference step as a kind of formal logical transition that uses a weird many-input transition rule and a large set of "weak axioms". For other people this sounds like nonsense, and the only true logic is the one with Modus Ponens and the like. But basically, what the author said there is that statistical algorithms can easily do what pure AI can't, which is reasonably close to the truth.

        If Artificial Intuition can be implemented on a computer (she does not explicitly say that it is implementable purely on a computer, but she doesn’t mention any other requirements apart from large memory either), and a computer is based on logic, I would assume that intuition is implementable in logic…

        You use logic to prove things, not to implement algorithms. In other words, logic is something that can explain why a given algorithm solves a given problem. For certain problems you can have an algorithm that works but no way to prove that it does in traditional logic. That's how I understand this, and in fact that's the main postulate of this post here, too.

        I mean, deterministic chaos is handled quite well by the equations of nonlinear dynamics, right?

        I agree that the example with chaotic systems is somewhat out of place. But your statement is not very rigorous either: deterministic chaos is not really "handled well" by anything. Indeed, when you select a nonlinear equation, you get an instance of a chaotic system which you can play with, but it does not work the other way around. If you're given a real-life chaotic system (say, weather), you will have trouble fitting anything reasonable to it or making predictions. Just knowing that certain nonlinear differential equations could describe it doesn't help.

        As for the bioinformatics comment, there is a big difference between asking if bioinformatics will explain the cell or if it will HELP to explain the cell.

        Well, my main concern here was that it might turn out to be impossible to "explain" most of the nontrivial processes in the cell, simply because they are too complicated for us to understand, and we'll always have to use computers to "do the understanding" for us.

    3. Миша on 10.01.2009 at 03:56

      The theme of things "too complex to be understood" reminds me of Wolfram's A New Kind of Science and his idea of computational irreducibility. A system is computationally irreducible when the easiest way to predict its behaviour is actually to run or simulate the system - there are no shortcuts. Wolfram states that all physical systems that are not obviously trivial are of this kind.
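
      (A minimal illustration of the idea: Rule 30, one of Wolfram's favourite examples, is an elementary cellular automaton for which no essentially faster way of obtaining the state after n steps is known than simulating all n steps. The Python sketch below just runs it; the grid size and step count are arbitrary choices for display.)

         def rule30_step(cells):
             # Each new cell depends on its left neighbour, itself and its right neighbour:
             # new = left XOR (centre OR right), which is exactly Rule 30.
             n = len(cells)
             return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

         # Start from a single "on" cell and simply run the system - no known shortcut exists.
         row = [0] * 31
         row[15] = 1
         for _ in range(15):
             print("".join("#" if c else "." for c in row))
             row = rule30_step(row)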

      You seem to say that all nontrivial systems cannot be "understood" by a human. If I were to formalize this, I'd probably say that the easiest way for a human to convince another human that the system behaves in a certain way is to actually show or simulate the system in question.

      If we use these definitions, then your worldview follows from Wolfram's, doesn't it?

      Now, Wolfram's theory has been criticized for not containing anything new. Assuming a living cell is in some sense Turing-complete and the Church-Turing thesis holds, it follows that the cell is computationally irreducible. This argumentation probably contains a lot of holes (for instance, the Church-Turing thesis doesn't say anything about how ~efficiently~ we can simulate a cell), but still it makes the whole thing more plausible.

      Any thoughts? 🙂

      1. Konstantin on 13.01.2009 at 04:19

        I haven't read Wolfram's book, so it's difficult to judge. As you state it, the idea could be similar, although not exactly equal in formal terms.

        For example, it is not true that any computationally irreducible system should be "incomprehensible". A very simple system (say, something that outputs a constant) could be irreducible yet perfectly understandable.
        On the other hand, a system could be "reducible", yet still too complex for a human brain to grasp its purpose.

        Secondly, I regard "comprehensibility" as something defined with respect to a goal. For example, the same Turing machine could be comprehensible as a "system that performs arbitrary computations" but incomprehensible as a "system that classifies images". And in these terms it is insufficient to consider just the "bare" Church-Turing thesis.

        But I probably should read the book before being able to discuss this properly.

    4. Taivo Lints on 20.01.2009 at 03:20

      Thanks for the thorough reply! Talking about the limited brain capacity of humans and the resulting problems with reasoning about complex things, it starts getting difficult for me to upload this discussion to my head in a short enough time, in an easily processable form, in order to give coherent and well-argued answers... especially late at night 🙂 Anyway...

      As for my complaints about the comparability of algorithms 1 and 2... I would say that the first one is easy not only because of logical explainability as such, but largely because humans have a lot of experience with bricks (as you noted yourself: the obviousness of an axiom is largely subjective).

      In addition, A1 gives results in a fully confident wording "is heavy", while A2 only says "might have problems", which makes it more difficult to compare their accuracy. Furthermore, I know it is not possible, but I would like to see the accuracy measure of both algorithms where the first one is applied to all bricks in the world and the second one to all humans in the world. I suspect the accuracies would be quite different.

      Yes, OCR is probably a better example (its accuracy surely still relies on the subjective definition of the letter "a", but then again, is there anything objective in the world at all? 🙂 ). However, why is there no good way to explain the workings of it? I'm quite sure that if you visualized the a_ij matrix as an i x j rectangle of pixels colored with the weights mapped to a red-blue gradient (i.e., the most negative weight being intense red, 0 black, the most positive intense blue), it would quite make sense (not necessarily at first glance, but still). Explaining it algorithmically? Given a proper definition of the letter "a", as well as of the other letters, and assuming near-100% accuracy of the algorithm, I'd expect it to be possible.
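
      (Such a visualization is only a few lines of code. Here is a sketch in which the weight matrix a_ij is a random placeholder standing in for the learned classifier weights, and the standard diverging colormap puts white, rather than black, at zero:)

         import numpy as np
         import matplotlib.pyplot as plt

         a = np.random.randn(16, 16)   # placeholder for the learned weights a_ij

         # Red for negative weights, blue for positive, symmetric around zero.
         limit = np.abs(a).max()
         plt.imshow(a, cmap="RdBu", vmin=-limit, vmax=limit)
         plt.colorbar(label="weight a_ij")
         plt.title("Linear classifier weights for the letter 'a'")
         plt.show()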

      "Once you’ve found proper classifier parameters you can’t really "prove to yourself" that they are "the correct ones", you just have to believe they work. This is in contrast to the logical algorithms, where you can convince yourself in algorithm correctness."

      In my opinion, this is still more a problem of the task domain combined with classifier accuracy than of statistical vs. logical algorithms. Isn't it the case that the higher the accuracy of an algorithm derived from / through statistics, the easier, in principle, it should be to explain it in a "logical" way? Sure, statistics gives correlations, while a proper algorithmic explanation in the context of this blog post gives (i.e., describes explicitly) either the causal relations or agreed-on definitions. But if the statistics-based algorithm is always 100% correct, it is likely to have captured some causal relation, or the definition. As a primitive example, I expect a classifier solving the XOR problem with perfect accuracy to have in some way converged to the logical truth table of XOR. It might just take some time for a human to do the transformation from the classifier's representation system to the one of human reasoning.
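
      (To illustrate what such a converged solution can look like: the tiny threshold network below is wired by hand rather than learned, but its input-output behaviour is exactly the XOR truth table that a perfectly accurate statistical classifier would have to encode in some form:)

         def step(x):
             return 1 if x > 0 else 0

         def xor_net(x1, x2):
             # Two hidden threshold units: h1 acts like OR, h2 like AND;
             # the output fires for "OR but not AND", i.e. XOR.
             h1 = step(x1 + x2 - 0.5)
             h2 = step(x1 + x2 - 1.5)
             return step(h1 - h2 - 0.5)

         # Reading off the network's behaviour recovers the XOR truth table.
         for x1 in (0, 1):
             for x2 in (0, 1):
                 print(x1, x2, "->", xor_net(x1, x2))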

      I guess I agree that the classes of logical and statistical algorithms correspond to reason and intuition in the human mind. However, I'm not fully satisfied with explaining the difference between reason and intuition only based on the argument that "the only formal difference between the two classes of algorithms is that in the former case the number of axioms is small and the logical steps are "easy"." It is possible to have a well-working intuition about something that is easily explainable through reasoning as well. Basically, you can use experience / statistics to give answers to the easy logic problems as well. Actually, it is even likely that we consider logic problems easy only if they are backed up by our intuition (as I said about the brick problem: it feels easy because we have a lot of brick experience).

      ""Understanding" is, as I stated above, a purely true-human-related thing. It kinda doesn’t seem to make sense to ask whether a microprocessor "understands" or is capable to understand the algorithm it is executing. Let us say it "does", who cares?"

      I do 🙂 More seriously, beware of confusing the substrate and the higher-level processes running on it. In humans it is not the bunch of neurons to which we should credit the understanding either (as for the "reality" of higher-order processes, here's a fun essay: http://machineslikeus.com/articles/ThreeObservations.html, if you're interested).

      "I do agree there are a lot of arguable points in the Monica Anderson’s text. I agree with most of what you’ve pointed out, although I wouldn´t be as critical."

      I had my AHA moment of why I felt so critical about her writings when I finally read her mini-bio: "I have many years of experience with industrial grade AI and NL technologies. In 2001, disillusionment with Strong Symbolic AI led to a search for alternatives and I started exploring Subsymbolic methods, which eventually led me to the theory of Artificial Intuition."

      Basically it is a case of "finding a new faith", where people get overly enthusiastic about the newfound worldview and overly negative about their earlier worldview, but occasionally (and unknowingly) still thinking in the framework of their previous "faith".

      "Secondly, it´s kinda strange to argue with the claim that the skill of prediction is something crucial."

      I didn't argue with the importance of the ability to predict. I was just trying to say that in practice there can be cases where the cost of the prediction mechanisms outweighs the gains from that ability, and thus it is not THAT straightforward to talk about the big advantages it provides.

      "You use logic to prove things, not to implement algorithms."

      Well, depends on your background 🙂 My education is in computer and system engineering, and "To most electronic engineers, the terms "digital circuit", "digital system" and "logic" are interchangeable in the context of digital circuits." [http://en.wikipedia.org/wiki/Digital_logic]

      "the assumption that most "truly human" actions are intuitive, i.e. we are simply incapable of understanding them."
      "The assumption is something I can’t prove, but I believe most people could understand what I mean."

      My understanding so far has been that this assumption is more widely accepted among people from the humanities and the general public (which, indeed, sums up to the vast majority of people) than among those from the exact sciences, and tends to be used by the former to decry attempts to really explain human behavior... 🙂 But surely it is something to think about, which leads me to Misha's comment. I'm not sure if he read my comment, but one of my questions was actually similar ("Incapable because they are computationally irreducible to human cognitive level in any possible future ..."), with possibly a bit better wording in that it should escape the first half of your reply to Misha's comment 😀 It's just that I didn't explicitly refer to Wolfram's book, even though that was my inspiration there.

      1. Konstantin on 25.01.2009 at 04:52

        Hi, let me continue with the flood 🙂

        I suspect the accuracies would be quite different.

        I feel the algorithms are more or less equally good in terms of precision. The recall, however, will be worse for A2 because, as you observed, the notion of "health problems" is much wider than that of "a heavy brick".

        I’m quite sure if you’d visualize the a_ij matrix as an ixj rectangle of pixels colored with the weights mapped to red-blue gradient (i.e., most negative weight being intensive red, 0 black, max. pos. intensive blue), it would quite make sense (not necessarily at first glance, but still).

        Indeed, it will show you an average picture of the letter "A". Although this does give an intuitive feeling that the algorithm is "right", it is not a proper logical proof. As you noted, to prove the statement you would first have to formalize the "definition of the letter A", which, even if doable by a human, would probably look like a huge mess of predicate calculus. As a result, any imaginable correctness proof for the algorithm is doomed to be so large and meaningless that it is easier to accept the definition of the letter A as "something that is recognized by this algorithm" than to mess with the details. And the only way to convince yourself of the correctness of this definition is to check it empirically.

        Isn’t it the case that the higher the accuracy of an algorithm derived from / through statistics, the easier, in principle, it should be to explain it in a “logical” way?

        The more accurate the algorithm, the more complicated the model, usually. And this does not necessarily improve "explainability" a single bit.
        Of course, when you have a 100% accurate algorithm A for detecting X, you can often construct a "brute-force" proof that enumerates all possibilities or, alternatively, define X to be "an object recognized by A", and then voila, you have a trivial proof of algorithm correctness. But both cases barely resemble anything like a proper explanation.

        But if the statistics based algorithm is always 100% correct, it is likely to have captured some causal relation, or the definition.

        What follows from your position is that you consider most real-life problems "easy", and the difficulty in statistical algorithms is not due to the problems they solve but just due to their construction ("if you could make them more precise, they would become easier to explain"). This is a philosophical point that is difficult to argue about, because all opinions have equal rights. But I can state once more that my opinion is the opposite: most real-life problems themselves (e.g. "truly human actions") are inherently difficult, and any applicable algorithm for solving them would be inexplicable, i.e. intuitive. Statistical algorithms are perhaps the simplest class of such algorithms, and I'd be happy if we could find other classes. I see the failure of formal-logic-based AI as a kind of support for my argument.

        Naturally, your example with the "XOR" algorithm is somewhat improper, as the latter is just too simple a problem. Consider instead an algorithm that could perfectly hold a bipedal robot's balance. Are you sure you could explain how it works without referring to vague notions such as "from experience" or "trained on data"?

        However, I’m not fully satisfied with explaining the difference between reason and intuition only based on the argument that “the only formal difference between the two classes of algorithms is that in the former case the number of axioms is small and the logical steps are “easy”.

        I don't claim that this should be the only explanation. I just intuitively felt the dichotomy and attempted to explain it in words in the best way I could.

        It is possible to have a well-working intuition about something that is easily explainable through reasoning as well.

        Indeed, but most probably the classes of algorithms used in the brain for intuitive judgements and for logical reasoning are completely different. The former are rapid-firing neural networks that just produce the correct answer and give you that intuitive feeling. How do they work? "By analogy", "from experience", "trained on data" are the best explanations. The latter are proper formal-logic algorithms.

        I do. More seriously, beware of confusing substrate and the higher-level processes running on it. In humans it is not the bunch of neurons to whom to accredit the understanding either.

        Indeed. Thus, when I said "the microprocessor understands the program it executes" I didn't mean the bunch of transistors; I meant the microprocessor itself. As for the reality of higher-order processes, I tend to sympathize with the opinion Schroedinger expressed in his book "What is Life?".

        Basically it is a case of “finding a new faith”, where people get overly enthusiastic about the newfound worldview and overly negative about their earlier worldview, but occasionally (and unknowingly) still thinking in the framework of their previous “faith”.

        This is a very nice exposition of a true observation.

        I didn't argue with the importance of the ability to predict. I was just trying to say that in practice there can be cases where the cost of the prediction mechanisms outweighs the gains from that ability, and thus it is not THAT straightforward to talk about the big advantages it provides.

        Any "decision" taken by any "living organism" is always about making predictions. Right?

        “You use logic to prove things, not to implement algorithms.”

        Well, depends on your background My education is in computer and system engineering, and “To most electronic engineers, the terms “digital circuit”, “digital system” and “logic” are interchangeable in the context of digital circuits.”

        OK, so when I say "logic" here I mean a formal system for proving statements by reducing them to axioms. I thought this was obvious. Your statement in the previous comment was about "intuition implementable in logic". When logic = "digital circuit", then of course it is. But this has nothing to do with the claim that "Intuition operates at a level below logic", because the latter refers to the other meaning of the word.

        “the assumption that most “truly human” actions are intuitive, i.e. we are simply incapable of understanding them.”
        “The assumption is something I can’t prove, but I believe most people could understand what I mean.”

        My understanding so far has been that this assumption is more widely accepted among people from the humanities and the general public.

        As I noted, I see strong support for this claim in the historical "failure" of symbolic AI and the success of pattern analysis.
