    Entries in papers (6)

    Monday
    Mar 01, 2010

    Bayesian model for soft keyboard enhancement

    The virtual keyboard on my iPhone has always seemed amazing to me. Its buttons are quite small and my thumbs are rather large, yet I still make very few typing errors with it. It was quite obvious that the iPhone OS developers had built some smart algorithm into it. Yesterday I accidentally came across an interesting article about practically the same thing (from Microsoft Research, though). The idea in the paper is nice and simple, so it practically forces me to post something :)

    Imagine the user has entered a sequence of symbols H = (k1,…,kn) (the so-called typing history) and then enters a new symbol by touching the device screen at position l = (x, y). We determine the intended symbol k by maximizing the expression P(k | H) P(l | k) with respect to k. Here P(k | H) is the probability of observing symbol k given the sequence of previously entered symbols (usually only the last 6 symbols are considered). It can be estimated from any text corpus. P(l | k) is the probability of touching screen position l when symbol k is intended. In the paper it is modelled as a bivariate Gaussian distribution. It is also constrained in a special way so that keys that are very improbable given the typing history are never suppressed completely.
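
    The decision rule above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the model from the paper: the three-key layout, the standard deviations and the prior values are made up, the Gaussian is axis-aligned for simplicity, and the paper's special constraint on P(l | k) is not modelled at all.

```python
import math

# Hypothetical key centres (in pixels) for a tiny three-key keyboard.
KEY_CENTERS = {"q": (10.0, 10.0), "w": (30.0, 10.0), "e": (50.0, 10.0)}

# Assumed per-axis standard deviations of the touch distribution (pixels).
SIGMA_X, SIGMA_Y = 12.0, 12.0


def touch_likelihood(touch, key):
    """P(l | k): axis-aligned bivariate Gaussian centred on the key."""
    cx, cy = KEY_CENTERS[key]
    dx = (touch[0] - cx) / SIGMA_X
    dy = (touch[1] - cy) / SIGMA_Y
    norm = 1.0 / (2.0 * math.pi * SIGMA_X * SIGMA_Y)
    return norm * math.exp(-0.5 * (dx * dx + dy * dy))


def decode_key(touch, language_prior):
    """Return the key k that maximizes P(k | H) * P(l | k).

    language_prior maps each key to P(k | H); here it is passed in
    directly instead of being estimated from a text corpus.
    """
    return max(
        KEY_CENTERS,
        key=lambda k: language_prior[k] * touch_likelihood(touch, k),
    )


touch = (21.0, 10.0)  # slightly closer to "w" than to "q"
uniform = {"q": 1 / 3, "w": 1 / 3, "e": 1 / 3}
favours_q = {"q": 0.8, "w": 0.1, "e": 0.1}

key_without_history = decode_key(touch, uniform)
key_with_history = decode_key(touch, favours_q)
```

    With a uniform prior the touch is resolved to the geometrically nearest key, while a typing history that strongly favours "q" pulls the same touch over to "q". That is exactly the behaviour that makes small keys usable with large thumbs.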

    An approach like this can dramatically reduce the number of typing errors on soft keyboards. Numbers, charts and a comparison with a simple “static” keyboard and with state-of-the-art dynamic keyboards that use an unconstrained P(l | k) can be found in the paper.

    Friday
    Dec 25, 2009

    An interesting approach to camera calibration

    Camera calibration is the process of determining the intrinsic (such as principal point or focal length) and extrinsic (position and orientation in space) parameters of a camera, which is often described by the pinhole camera model. In computer vision we usually perform calibration by analyzing images taken by the camera. The most widely used approach to camera calibration is based on this paper by Zhengyou Zhang. It involves a chessboard (or some other planar calibration pattern) and consists of the following steps:

    1. Find a chessboard (bigger is better). Note that its width and height (measured in chessboard squares) should be different and both even (otherwise you will be unable to determine the chessboard orientation from its picture).
    2. Find a “calibration dude”.
    3. The calibration dude takes the chessboard and waves it in front of the camera, which is attached to a computer with calibration software installed. The calibration software takes about 20 images with distinct chessboard orientations (the camera orientation remains the same, of course), finds the inner corners of the chessboard in every image and then uses them, together with the real-world size of the chessboard, to determine the intrinsic camera parameters.
    4. The calibration dude puts the chessboard on the floor so that the camera can still see it. The calibration software then takes one more image from the camera, finds the chessboard corners in it (again) and calculates the camera position, assuming that some predefined chessboard corner is located at the origin of the coordinate system and the chessboard sides are aligned with the coordinate axes. Of course, any other chessboard orientation can be specified in the software, but this one is the simplest.
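
    To make the roles of the two parameter groups concrete, here is a minimal sketch of the pinhole model that calibration tries to fit. It only shows the forward projection of a known chessboard corner into the image, not the calibration procedure itself. Lens distortion is ignored, and all numbers (focal length, principal point, camera pose) are made-up assumptions.

```python
def project(point_world, rotation, translation, focal, principal):
    """Project a 3D world point to pixel coordinates with the pinhole model.

    rotation (3x3, row-major) and translation are the extrinsic parameters;
    focal (fx, fy) and principal (cx, cy) are the intrinsic parameters.
    """
    # Extrinsics: world coordinates -> camera coordinates, X_cam = R * X + t.
    cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0.0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective division, then scaling and shifting to pixels.
    u = focal[0] * cam[0] / cam[2] + principal[0]
    v = focal[1] * cam[1] / cam[2] + principal[1]
    return (u, v)


# Assumed setup: the camera looks along the world z axis from 2 m away.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
translation = [0.0, 0.0, 2.0]
focal = (800.0, 800.0)
principal = (320.0, 240.0)

# The chessboard corner at the world origin and its neighbour 10 cm away.
origin_pixel = project([0.0, 0.0, 0.0], identity, translation, focal, principal)
next_pixel = project([0.1, 0.0, 0.0], identity, translation, focal, principal)
```

    Steps 3 and 4 solve the inverse problem: given many detected corner pixels and the known corner positions on the board, find the intrinsic and extrinsic parameters that make a projection like this one match the observations.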

    What problems do we have here? First of all, the camera may not see the floor at all, so we can’t just put the chessboard on it during step 4. Instead we need to set it up somewhere else, not at ground level. We should then carefully measure its position and orientation and pass them as input to the calibration tool.

    What if we have more than one camera that cannot see the floor, and the fields of view of those cameras do not overlap? In this case we have to repeat the process described above for each camera, carefully measuring the chessboard position in the world coordinate system every time. In fact, it’s a pain in the ass. Calibrating multiple cameras that way can be really slow and error-prone.

    A much more interesting approach to calibrating multiple non-overlapping cameras was proposed in this paper. Its key idea is to fix the chessboard position (put it at the origin) and move a mirror instead. The cameras see the reflection of the chessboard in that mirror and use the reflected image for calibration. Of course, some questions arise.

    1. Is it legal to determine intrinsic camera parameters using a reflected chessboard image? The answer is simple: yes. The authors prove that common calibration techniques give the same result (except for the handedness of the coordinate system) when applied to mirrored images.
    2. Don’t we need to know the position and orientation of the mirror when calibrating extrinsic parameters? No, we don’t. It turns out that every mirrored chessboard image imposes a constraint on the position and orientation of the real camera. If we have five (or more) such images, we can reconstruct the position and orientation without any knowledge of the mirror position.
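
    The handedness remark in the first answer is easy to check numerically. The sketch below is my own illustration, not the derivation from the paper: it assumes a mirror plane passing through the origin (a simplification) and builds the corresponding reflection matrix I - 2nn^T for a unit normal n.

```python
import math


def reflection_matrix(normal):
    """Reflection I - 2 n n^T about a plane through the origin with normal n."""
    length = math.sqrt(sum(c * c for c in normal))
    n = [c / length for c in normal]
    return [
        [(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
        for i in range(3)
    ]


def apply(matrix, point):
    """Multiply a 3x3 matrix by a 3D point."""
    return [sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3)]


def determinant(m):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


mirror = reflection_matrix([0.0, 0.0, 1.0])  # assumed mirror in the z = 0 plane
corner = [0.2, 0.1, 0.5]                     # a hypothetical chessboard corner
virtual_corner = apply(mirror, corner)       # where the camera "sees" it
handedness = determinant(mirror)             # -1 means the handedness flips
```

    The determinant of any such reflection is -1, which is why the camera effectively observes a left-handed copy of the scene, and reflecting twice brings every point back to where it started.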

    This approach can save a lot of time and help to reduce the part of the calibration error that comes from measuring the chessboard position and orientation incorrectly. But it has its own drawbacks, of course. First of all, a mirror is rather heavy. It’s not easy to manipulate if your calibration dude is not a beefcake. Next, it’s hard to change the orientation of the calibration pattern in the frame from one snapshot to another when using a mirror. The mirror has to be in the field of view of the camera and oriented so that the pattern’s image is reflected into the camera. These requirements may result in little variation in the pattern orientations as seen by the camera in the mirror and lead to a degenerate solution.

    Despite the drawbacks, this approach has the potential to speed up the calibration of multiple cameras a lot. We will probably try it in 2010.

    Ram Krishan Kumar, Adrian Ilie, Jan-Michael Frahm, & Marc Pollefeys (2008). Simple calibration of non-overlapping cameras with a mirror. 2008 IEEE Conference on Computer Vision and Pattern Recognition.

    Thursday
    Aug 06, 2009

    Genetic tree regression experimental results

    About a month ago I posted about my experiments with a genetic algorithm for boosting trees. Since then, I have not had much time to implement anything new, but I have found a lot of bugs in my previous implementation. Due to those bugs, on every boosting iteration my algorithm returned a regression tree that was quite far from the best possible one. Actually, it worked only because of the amazing ability of boosting to build a rather good classifier from a huge number of really bad ones.
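
    That last point can be illustrated with a toy experiment. The sketch below is not my genetic algorithm and not the buggy implementation; it is a plain least-squares boosting loop in which every weak learner is a regression stump with a randomly chosen threshold, so each individual tree is usually far from the best possible one. The data set, the number of rounds and all names are made up for the illustration.

```python
import random


def fit_random_stump(xs, residuals, rng):
    """A deliberately weak learner: random split, leaf values are residual means."""
    threshold = rng.choice(xs)
    left = [r for x, r in zip(xs, residuals) if x <= threshold]
    right = [r for x, r in zip(xs, residuals) if x > threshold]
    left_value = sum(left) / len(left)  # never empty: threshold is one of xs
    right_value = sum(right) / len(right) if right else 0.0
    return threshold, left_value, right_value


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


def boost(xs, ys, rounds, seed=0):
    """Fit stumps to the current residuals; return the training error per round."""
    rng = random.Random(seed)
    predictions = [0.0] * len(xs)
    errors = [mean_squared_error(ys, predictions)]
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_random_stump(xs, residuals, rng)
        predictions = [
            p + (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
        errors.append(mean_squared_error(ys, predictions))
    return errors


xs = [i / 20.0 for i in range(20)]
ys = [x * x for x in xs]
errors = boost(xs, ys, rounds=200)
```

    Because each leaf value is the mean of the residuals that fall into it, no round can increase the training error, however badly the threshold is chosen. Bad trees only slow the ensemble down; they do not break it, which is why such bugs can stay hidden for a long time.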

    Sunday
    Jul 26, 2009

    MSR summer school

    This week I attended a few lectures given by people from Microsoft Research during the Microsoft Research summer school on high performance computing at MSU. There was a lot of interesting stuff at the school. Its main idea was that the Von Neumann architecture is too old and its limitations are becoming more and more obvious, so the computer science field (especially its programming part) must be “reinvented” soon. As an alternative to classic imperative programming, the MSR guys propose using functional languages like Haskell, so they talked a lot about Haskell (and especially about parallel programming in Haskell). And, of course, Simon Peyton Jones, one of the main developers of the Haskell language and lead designer of the Glasgow Haskell Compiler, was there. He is a very inspiring and cheerful man, and his talks are really great! But the talk I liked most was not about functional programming. It was about research papers and talks (a kind of meta talk). Slides and other materials (like videos) related to that talk are available here, but I’m still going to list some of the major (or just interesting to me) points:

    1. Writing articles and giving talks is not about anything but sharing ideas.
    2. Have an idea (it’s not necessary for your idea to be a fantastic one) => write an article.
    3. One article <=> one clear idea.
    4. Use a lot of examples! Every definition or statement (especially one with complicated math) becomes much clearer if it has an associated example.
    5. The related work section should be placed just before the conclusion. Other works can distract your reader from your own (good or not) work (this idea was quite surprising to me).
    6. Be as clear and concrete as possible.
    7. Write in the active voice and use agents (like process, algorithm, iteration etc). For example, “this algorithm selects the best classifier” is much better than “the best classifier is selected”.
    8. Good talk contents: motivation (20%) and key idea (80%).
    9. You should select something you want your readers to remember after listening to your talk. Concentrate on that thing. It’s absolutely normal to cover only part of your paper at your talk.
    10. Adding an outline to your slides makes no sense and only wastes talk time.
    11. Again, examples are your main weapon!
    12. Do not show the total number of slides to your audience (numbers like “6 of 95” can make people very sad).
    13. Always finish on time. It’s better to save some time for questions than to show all the slides to your audience.
    14. Be enthusiastic! Do not make excuses! Do not be afraid to be afraid of your talk (everybody is)!

    I hope reading this will help someone with his (or her) paper or talk. And I hope that person will look through the original slides or watch the video himself (or herself). It is worth all the time spent.

    Friday
    Jun 05, 2009

    Paper from the WIT talk

    A preprint of the paper associated with the WIT talk I recently posted about is now available at arxiv.org.

    Wednesday
    Jun 03, 2009

    PhD summer school, WIT

    Yesterday I took part in the pattern recognition section of the PhD summer school on Scientific Computing. It is a web conference event. Several universities participate in it, including Waterford Institute of Technology and our Moscow State University. I gave a talk there about my latest research work: accelerating boosting with genetic algorithms. That was my first experience of web conferencing, but I think everything went OK.

    If anyone is interested, slides from my talk are available here. AFAIK the conference site will appear soon. It will contain both the presentations and the abstracts of the presented works. There should be some interesting stuff there. For example, I liked the talk by Dmitry Potepalov about feature set compactness of the training set and its relation to k-NN classifiers.