
    Entries in bayesian (5)

    Monday, Mar 01, 2010

    Bayesian model for soft keyboard enhancement

    The virtual keyboard on my iPhone has always seemed amazing to me. Its buttons are quite small and my thumbs are rather large, but I still make very few typing errors with it. It was quite obvious that the iPhone OS developers had built some smart algorithm into it. And yesterday I accidentally came across an interesting article about practically the same thing (from Microsoft Research, though). The idea in the paper is very nice and simple, so it practically forces me to post something :)

    Imagine the user has entered a sequence of symbols H = (k1,…,kn) (the so-called typing history) and then enters a new symbol by touching the device screen at position l = (x, y). We determine the intended symbol k by maximizing the expression P(k | H) P(l | k) with respect to k. By Bayes’ theorem this product is proportional to the posterior P(k | H, l), assuming the touch position depends on the history only through the intended key. Here P(k | H) is the probability of observing symbol k given the sequence of previously entered symbols (usually only the last 6 symbols are considered). It can be estimated from any text corpus. P(l | k) is the probability of touching screen position l when symbol k is intended. In the paper it is modelled as a bivariate Gaussian distribution. It is also constrained in a special way so that keys that are very improbable given the typing history are never suppressed completely.
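
    To make the decision rule concrete, here is a minimal Python sketch of the argmax over P(k | H) P(l | k). Everything specific in it is my assumption and not taken from the paper: the three key centres, the isotropic touch spread SIGMA and the made-up prior table standing in for a real language model. The special constraint from the paper is not implemented.

```python
import math

# Hypothetical key centres in arbitrary screen units (not from the paper).
KEY_CENTERS = {"q": (0.0, 0.0), "w": (1.0, 0.0), "e": (2.0, 0.0)}
SIGMA = 0.6  # assumed touch spread, identical in x and y


def log_touch_likelihood(touch, key):
    """log P(l | k) for an isotropic bivariate Gaussian centred on the key."""
    (x, y), (cx, cy) = touch, KEY_CENTERS[key]
    sq_dist = (x - cx) ** 2 + (y - cy) ** 2
    return -sq_dist / (2 * SIGMA ** 2) - math.log(2 * math.pi * SIGMA ** 2)


def decode(touch, prior):
    """Return the key k that maximizes P(k | H) * P(l | k), in log space."""
    return max(
        prior,
        key=lambda k: math.log(prior[k]) + log_touch_likelihood(touch, k),
    )


# Made-up language-model output P(k | H) for a history ending in "th".
prior_after_th = {"q": 0.01, "w": 0.04, "e": 0.95}
uniform_prior = {"q": 1 / 3, "w": 1 / 3, "e": 1 / 3}

touch = (1.2, 0.1)  # physically closest to the "w" key
print(decode(touch, uniform_prior))   # with no history, the nearest key wins
print(decode(touch, prior_after_th))  # the history can pull the decision to "e"
```

    With a flat prior the decoder simply picks the nearest key. With the history-based prior the same touch is read as "e", which is exactly the kind of correction a static keyboard cannot make.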

    An approach like that can dramatically reduce the number of typing errors on soft keyboards. Numbers, charts and a comparison with the simple “static” keyboard and with state-of-the-art dynamic keyboards that use an unconstrained P(l | k) can be found in the paper.

    Monday, Dec 07, 2009

    Bayesian approach: don't hurry when reasoning

    A few days ago a friend of mine (let’s call him A) mentioned a psychological test. The test consists of a single question and, according to some statistics, 98 percent of serial killers answer that question right. When another friend of mine (B) gave the correct answer, A said that B is very probably a serial killer. Was he right? Of course he wasn’t. A’s problem is that he is not familiar with the Bayesian approach at all. And here is why.

    Let R be the event of giving the right answer to the question and M be the event that the person answering is a maniac. From the gathered statistics we know that P(R | M) = 0.98. Next, from Bayes’ theorem we know that P(M | R) = P(R | M)P(M)/P(R). P(R) can be expanded as P(R | M)P(M) + P(R | not M)P(not M). Next, assume that about 5 percent of ordinary people also give the right answer to the question, so P(R | not M) = 0.05. Then, what’s the prior probability of M? I think we all agree that it’s quite small, about 1e-5 or even less. Now we are ready to calculate the posterior probability of M:

    P(M | R) = (0.98 * 1e-5) / (0.98 * 1e-5 + 0.05 * (1 - 1e-5)) = 0.0000098 / (0.0000098 + 0.0499995) ~ 0.0002.

    The probability is very small, but why is that? That’s because my friend A was talking about the likelihood P(R | M), but he didn’t take the prior probabilities into account. And in this case the prior probabilities are of great importance.

    Btw, what if serial killers always answer right and normal people always give wrong answers? Then P(R | not M) = 0 and P(M | R) = 1, so our model works in extreme cases too.
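
    The whole calculation fits into a few lines of Python. This is just a sketch of the formula above. The function and argument names are mine, and the 5 percent and 1e-5 figures are the same assumptions as in the text.

```python
def posterior(prior, p_right_given_m, p_right_given_not_m):
    """P(M | R) from Bayes' theorem with the law of total probability for P(R)."""
    evidence = p_right_given_m * prior + p_right_given_not_m * (1 - prior)
    return p_right_given_m * prior / evidence


# Numbers from the text: P(R | M) = 0.98, P(R | not M) = 0.05, P(M) = 1e-5.
p_killer = posterior(prior=1e-5, p_right_given_m=0.98, p_right_given_not_m=0.05)

# Extreme case: killers always answer right, everyone else always answers wrong.
p_extreme = posterior(prior=1e-5, p_right_given_m=1.0, p_right_given_not_m=0.0)

print(f"P(M | R) = {p_killer:.6f}")
print(f"P(M | R) in the extreme case = {p_extreme}")
```

    Playing with the prior argument shows how strongly the answer depends on it: the likelihood alone says almost nothing about B.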

    Friday, Nov 27, 2009

    Bayesian approach: Lorenzo von Matterhorn

    Today we will consider a more sophisticated inference example, with both a closed-form solution and an Infer.NET program. We are going to find answers to some very important questions closely connected to the Lorenzo von Matterhorn trick from Barney Stinson’s Playbook.

    First of all, let’s figure out the factors influencing the successful completion of the trick. It looks reasonable that the girl should search for your imaginary name on Google. Without that, the trick wouldn’t work. It means that the girl should have some internet device and the area should be covered with internet (WiFi, 3G etc). Also, if your ugliness outweighs your imaginary wealth and fame, the girl may not come with you (although that’s highly improbable). These considerations allow us to build the following probabilistic model with discrete variables: I (she has an internet device), C (the area has internet coverage), G (she googles your imaginary name), H (you are handsome) and S (the trick succeeds). G depends on I and C, and S depends on G and H.

    In fact our model is a Bayesian network. It means that for each variable in it we should specify a probability distribution conditional on its parents. Let’s get started.

    P(I) = 0.8 because almost all modern phones have at least GPRS support.
    P(C) = 0.9 because internet is available practically everywhere nowadays.
    P(G | I, C) = 0.95 (wouldn’t you google this strange man?)
    P(G | not I, C) = 0.3 because she can ask someone who has internet to google.
    P(G | I, not C) = P(G | not I, not C) = 0 because nobody can google without internet.
    P(H) = 0.2 because not many guys are handsome.
    P(S | G, H) = 0.99
    P(S | not G, H) = 0.5 (it’s nice to be attractive, huh)
    P(S | G, not H) = 0.9 because wealth is more important than good looks.
    P(S | not G, not H) = 1e-5 (the worst case).

    Computing the probability of success is quite straightforward, so we will concentrate on two slightly more sophisticated questions:

    1. What’s the probability of you being handsome if you’ve succeeded in the trick? 
    2. If you haven’t succeeded in the trick, what’s the probability that she looked up your imaginary name on Google?

    To compute those probabilities, we’ll use Bayes’ theorem together with the law of total probability. This blog is not really a comfortable place for a lot of formulas, so the whole inference is available in the separate PDF. As we can see there, it’s not very hard to show that P(H | S) = 0.2449 and P(G | not S) = 0.2042. We can also write a small inference program using the Infer.NET library which will give us exactly the same results.
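
    Since the network is tiny, the same two numbers can also be checked by brute-force enumeration of all variable assignments. The Python sketch below is not the Infer.NET program mentioned above, just a plain standard-library check. The helper names bern and joint are mine, and the tables are copied from the list of probabilities above.

```python
from itertools import product

P_I, P_C, P_H = 0.8, 0.9, 0.2
# P(G = true | I, C), keyed by (I, C).
P_G = {(True, True): 0.95, (False, True): 0.3,
       (True, False): 0.0, (False, False): 0.0}
# P(S = true | G, H), keyed by (G, H).
P_S = {(True, True): 0.99, (False, True): 0.5,
       (True, False): 0.9, (False, False): 1e-5}


def bern(p_true, value):
    """Probability that a binary variable with P(true) = p_true equals value."""
    return p_true if value else 1.0 - p_true


def joint(i, c, g, h, s):
    """Joint probability of one full assignment, factorized along the network."""
    return (bern(P_I, i) * bern(P_C, c) * bern(P_G[(i, c)], g)
            * bern(P_H, h) * bern(P_S[(g, h)], s))


tf = [True, False]
# Marginalize by summing the joint over all variables that are not fixed.
p_s = sum(joint(i, c, g, h, True) for i, c, g, h in product(tf, repeat=4))
p_h_and_s = sum(joint(i, c, g, True, True) for i, c, g in product(tf, repeat=3))
p_g_and_not_s = sum(joint(i, c, True, h, False) for i, c, h in product(tf, repeat=3))

p_h_given_s = p_h_and_s / p_s
p_g_given_not_s = p_g_and_not_s / (1.0 - p_s)

print(f"P(S) = {p_s:.4f}")
print(f"P(H | S) = {p_h_given_s:.4f}")
print(f"P(G | not S) = {p_g_given_not_s:.4f}")
```

    Enumeration only works because there are just 32 assignments here. For bigger models you need something smarter, and that is where libraries like Infer.NET come in.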

    The calculated probabilities agree with our common sense very well. The value of P(H | S) shows that you don’t have to be a handsome guy to pull off the trick. And the value of P(G | not S) tells us that most of the failures happen when the girl doesn’t google your name for some reason: P(not G | not S) = 1 - 0.2042, which is about 0.80.

    That was another example of probabilistic logic in action. Next time we’ll consider a continuous case like linear regression.

    Tuesday, Nov 10, 2009

    Bayesian approach: introduction

    As I’ve already said, I’m going to write a few posts about the Bayesian approach to probability theory and, especially, to statistical machine learning. Someday I’m going to be an expert in this topic, but currently I’m far from it :) So the following series of posts is my attempt to make things clear for myself and, probably, for someone just starting to learn this amazing topic. In fact, it means that there can be some mistakes. If you find any, I’ll be glad if you point them out as soon as possible.


    Wednesday, Apr 22, 2009

    Bayesian inference for boys

     

    Learning something is good. Learning something using interesting examples is even better. Today I’m going to show you the power of Bayesian inference by building a model representing the probability of you having some sex. If you want a more formal introduction, please read this paper written by Christopher Bishop, a famous researcher from Microsoft Research Cambridge.
    Bayesian inference is a method for finding the posterior distribution of a set of dependent random variables when some variables are observed and some aren’t of interest. That set of variables is often represented as a so-called graphical model: a directed graph called a Bayesian network or an undirected graph called a Markov network. In that kind of graph, vertices represent random variables and an edge connecting two vertices means that the variables associated with those vertices are directly dependent. That representation is not only clean and intuitive, it also allows using fast and robust algorithms for performing inference.
    Let’s look at a concrete example. We’ll try to build a simple model describing sexual relationships between boys and girls. The probability of you having sex with some girl mostly depends on how much she likes you (which depends on your sexiness) and on her so-called slutness. I’ve also included some Gaussian noise in a woman’s head that sometimes dramatically impacts her decisions. Assuming that slutness and sexiness are distributed normally around zero (which we consider the average value for those variables) and that the value describing how much she likes you is also distributed normally around your sexiness allows us to build a simple graphical model. An amazing tool for that is Infer.NET, a .NET library for Bayesian inference from Microsoft Research. The model code is quite short:

     


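    // Requires: using Microsoft.ML.Probabilistic.Models; (MicrosoftResearch.Infer.Models in older Infer.NET releases)
    // Priors: your sexiness and her slutness are standard normal, 0 is the average value.
    Variable<double> youAreSexy = Variable.GaussianFromMeanAndVariance(0, 1).Named("you're sexy");
    // How much she likes you is centred on your sexiness with variance 0.25.
    Variable<double> howMuchSheLikesYou = Variable.GaussianFromMeanAndVariance(youAreSexy, 0.25).Named("how much she likes you");
    Variable<double> slutness = Variable.GaussianFromMeanAndVariance(0, 1).Named("her slutness");
    Variable<double> randomNoiseInHerHead = Variable.GaussianFromMeanAndVariance(0, 0.25).Named("noise in her head");
    // The outcome is true when the sum of the three quantities is positive.
    Variable<bool> willYouHaveSexToday = (howMuchSheLikesYou + randomNoiseInHerHead + slutness > 0).Named("will you have some sex?");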

     

    After compiling the model in Infer.NET you’ll get a graphical representation of your model which is called a factor graph. It looks very similar to a Bayesian network but also contains special “factor” nodes representing different operations, constraints and distributions. The clickable picture below is an example of a factor graph for our “sex” model:
    Let’s ask our model some questions using the ExpectationPropagation inference algorithm:
    1. slutness = -5 (she’s REALLY not a slut), willYouHaveSexToday = true (yeah, you did it). Then the distribution of you being sexy is Gaussian(3.514, 0.3635), which means you are as handsome as the devil (I guess).
    2. slutness = -0.5 (she’s just a girl), youAreSexy = -1 (you are not a good one). Then the probability of you having sex today is 0.01695. It means that your chances are quite small.
    3. youAreSexy = -5 (oh, you look really bad), willYouHaveSexToday = true (but you still have some sex? strange, isn’t it?). Then the distribution of her slutness is Gaussian(3.514, 0.3635) (oh, she is a slut… that makes sense).
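    This model is simple enough that the three answers can be cross-checked by hand. The Python sketch below is my own closed-form check, not Infer.NET output. It uses the fact that a sum of independent Gaussians is Gaussian, plus the standard formulas for the mean and variance of a truncated normal. Questions 1 and 3 are symmetric (an unknown N(0, 1) variable plus noise of total variance 0.5 has to exceed 5), which would explain why they get the same answer. The helper names are mine.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_sf(z):
    """Upper tail probability P(Z > z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


VAR_LIKE, VAR_NOISE = 0.25, 0.25

# Question 2: youAreSexy = -1 and slutness = -0.5 are observed, so
# howMuchSheLikesYou + noise + slutness is distributed as N(-1.5, 0.5).
mean_sum = -1.0 - 0.5
var_sum = VAR_LIKE + VAR_NOISE
p_sex = normal_sf((0.0 - mean_sum) / math.sqrt(var_sum))

# Questions 1 and 3: unknown x ~ N(0, 1), total noise e ~ N(0, 0.5),
# and the observation says t = x + e > 5.
var_x, var_e, threshold = 1.0, VAR_LIKE + VAR_NOISE, 5.0
var_t = var_x + var_e
a = threshold / math.sqrt(var_t)
hazard = normal_pdf(a) / normal_sf(a)             # inverse Mills ratio
mean_t = math.sqrt(var_t) * hazard                # E[t | t > 5]
var_t_trunc = var_t * (1.0 - hazard * (hazard - a))  # Var[t | t > 5]

# Given t, x is Gaussian with mean w * t and variance var_x * (1 - w).
w = var_x / var_t
post_mean = w * mean_t
post_var = var_x * (1.0 - w) + w * w * var_t_trunc

print(f"P(sex today) = {p_sex:.5f}")
print(f"posterior mean = {post_mean:.3f}, variance = {post_var:.4f}")
```

    The true posterior in questions 1 and 3 is not exactly Gaussian. The numbers computed here are its mean and variance, which should land close to the Gaussian reported above.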
    So, Bayesian inference helps us with different questions about girls, sex and everything. Remember that the provided model is very simple and inaccurate. For example, one can treat slutness as the variance of the howMuchSheLikesYou variable or add some other random variables and factors to the model. Have fun with it and don’t forget that math is great.