Powered by Squarespace
This form does not yet contain any fields.

    Entries in thoughts (10)

    Tuesday
    Feb092010

    Camera DEcalibration and REcalibration

    In one of the recent posts I have covered the topic of camera calibration a little. But that was not really interesting. Tons of very good books have been written about that. Today we’ll look into keeping your cameras in calibrated state after they’ve been calibrated instead.

    Suppose you have a 3d-tracking system that uses about 20 calibrated cameras installed in some crowded place like bank office or shopping mall. In such places calibrated cameras have a tendency to become uncalibrated sometimes, mostly due to vibrations caused by factors like construction work or cleaning. What should we do about that? Obviously, we should recalibrate cameras whose position or orientation has been changed (or change position and orientation back to original). But how can we find out what cameras should be calibrated again? Should we hire a man who’ll check image on every camera every day? Fortunately, we shouldn’t.

    Camera decalibration can be detected automatically

    What we have to do is to save snapshot from camera at the moment its extrinsic parameters were estimated. Then the following algorithm can be used to check for decalibration:

    1. Extract something like SURF features from the saved snapshot and current camera image.
    2. Match extracted features.
    3. If not less than 10-15% of the matched points have the same pixel coords, camera is OK. Otherwise, it seems to be decalibrated.

    As you can see, this algorithm is fully automatic and can be run, for example, every hour for every camera. It is also quite robust to presence of some temporary objects like humans in both images because it requires only small amount of matches to have the same pixel coordinates. It seems that it works bad (or doesn’t work at all) only in cases when there are very few good features available (for example, when camera looks at the white wall). But in those cases it’s very hard to detect camera decalibration even for human. One can also make this algorithm robust to illumination changes by using some lighting-invariant point descriptors or by image preprocessing.

    OK, we have determined that camera has become decalibrated. And we also have old camera position, orientation and point matches between current and old camera images. It seems that we can automatically determine new camera position easily. Or can’t we?

    Camera can not be recalibrated automatically

    And here is why. Of course, there is a 3d reconstruction algorithm that takes a list of coordinates of matches together with camera intrinsic matrix and returns transform from old camera view space into view space of the new one (together with 3d coordinates of the points). Unfortunately, it can calculate transform and 3d coordinates only up to scale. Neither coordinates of the matched pixels nor intrinsic matrix contain enough information about the scale of you scene. Are you measuring everything in meters? Or in inches? You need more information (like real 3d coordinates of one of the matched points) to rescale algorithm results. But you don’t have them. So the reconstruction algorithm gives you pretty much nothing (well, it gives you correct camera orientation, but you can’t use it without rescaled translation).

    It’s interesting that I’ve failed to find any papers covering the topic of decalibration and recalibration although it’s quite important for some practical computer vision applications. Should I write a short paper about it by myself?

    Tuesday
    Nov102009

    Bayesian approach: introduction

    As I’ve already said, I’m going to write a few posts about Bayesian approach to probability theory and, especially, to statistical machine learning. Someday I’m going to be an expert in this topic, but, currently, I’m far from it :) So, the following series of posts is my attempt to make things clear for myself and, probably, for someone just starting to learn this amazing topic. In fact, it means that there can be some mistakes. And if you find them, I’ll be glad you reveal that as soon as possible.

    Click to read more ...

    Sunday
    Sep272009

    Agnosticism and atheism

    I want to explain why I prefer atheism to agnosticism (and, of course, to all the religions). The most simple way to do it is to involve Russell’s teapot into process, like Richard Dawkins did in his “The God Delusion” book.

    So, imagine someone suggesting that somewhere between the Earth and Mars there is a china teapot revolving about the sun in an elliptical orbit. Unfortunately, teapot is too small to be visible from Earth (even with all our telescopic power), so nobody can disprove his assertion at this moment.

    Religion in this case will work as following. People will start to teach their children that there exists such a teapot. When children will grow up, most of them will become staunch supporters of the teapot existence, blind to any logical arguments. Theologians will waste billions of dollars for the propagation of the teapot idea. They will condemn everyone who will state they are wrong. And they will also argue that if someone has any doubts, he should first prove that there is no teapot. If he can’t (and, unfortunately, nobody can do it right now), he should keep his mouth shut. Their religious views should not ever be damaged.

    Agnostic in this case will hold the following point of view: we can’t prove that teapot really exist, but we also can’t prove that there is no teapot. So, both cases are probable. Why should we care then?

    Atheist will be the only who will say:

    Fuck it, are all of you crazy? How the hell that teapot can get there? It’s highly improbable. Actually, it’s pretty impossible. Of course, I can’t prove a negative, but I’m not going to believe in this shit. There exists a more simple hypothesis: there are no teapots there. Stop fucking our brains!

    Isn’t it rational? Agnosticism, like, does not take rationality into account, as we do. Dawkins used the “hypothesis probability” expression while explaining why “there is no god” hypothesis is better. But I’m not really sure if we can talk about probabilities without specifying probability space.

    Anyway, reductio ad absurdum rocks.

    Sunday
    Sep062009

    More graph theory for the Lizard-Spock game

    Those of you who watches The Big Bang Theory CBS sitcom should be familiar with the Rock-Paper-Scissors-Lizard-Spock game introduced in the Lizard-Spock expansion episode. One of my friends has recently got a t-shirt with the rules of the game printed on it. Then, we’ve got drunk and decided to play. The one question I’ve asked before we started was if the game is fair (I still have a strong feeling that Spock must act like an ultimate weapon). When t-shirt is in front of you, it’s rather easy to answer: each vertex of the digraph drawn on it (representing weapon) should have equal in and out degrees. If so, each weapon beats and is beaten by the same number of weapons. In the mentioned Lizard-Spock game each vertex has in and out degrees of 2.

    But what if you haven’t such a beautiful picture? Remember how Sheldon Cooper introduces the game:

    Scissors cuts Paper, Paper covers Rock, Rock crushes Lizard, Lizard poisons Spock, Spock smashes Scissors, Scissors decapitates Lizard, Lizard eats Paper, Paper disproves Spock, Spock vaporizes Rock, and as it always has, Rock crushes Scissors!

    As you can see, his speech actually represents a path in the game graph. He starts and ends with the Scissors, walking through every relationship in the game. In fact, Sheldon gives us an Eulerian circuit in the digraph. Existence of such circuit implies that each vertex of the graph has equal in and out degrees, something we were trying to figure out.

    Well, each vertex Vi can have the same in and out degree Ki, but Kj can be not equal to Ki for distinct j and i. What does that mean for the players? It means that there is no rule for some pair of weapons. So, if you began to play and then discovered that nobody knows what lizard does to Spock, then something is obviously wrong with your game and you don’t need neither graph theory nor other theories to figure it out.

    Thursday
    Jun182009

    Genetic weak learner and classifier trees

    I have some free time now (most of my friends went to the student military camp), so I decided to continue my last research work.

    If you have read my paper about accelerating boosting with genetic algorithm, you have an idea of what the genetic weak learner is. In the original work that learner was only applied to the families of parametric classifiers that can be represented as points in multidimensional space. Proposed approach can not be used to build an additive model consisting of single-feature stump classifiers in cases when features have no good parametric representation. For example, if feature can be represented only by it’s number in the feature pool and there is no corellation between features with close numbers, genetic algorithm used in the weak learner simply turns into random search. Another bad thing about stump classifier is that it can’t represent dependencies between variables which is always an important feature of the learning algorithm.

    Now I’m trying to fix some problems of the original approach. Here are some ideas I’ve recently came to:

    1. Boosting in fact is a stagewise additive modelling algorithm. So, on each iteration it tries to fit the tree that, if added to the current model, will minimize some loss function. It’s an optimization problem with tree as a function parameter. There exist some well-known approaches for solving such optimization problem with genetic algorithm: GP and GEP.
    2. What’s bad for the stump can be not so bad for the tree with more than one split. When mutation changes feature used for the split in some node, all the features in another nodes remain the same and, so, there is a strong corellation between chromosome states before and after the mutation.
    3. If non-parametric feature usage become possible, I will compare my approach with some popular tree boosting approaches such as gradient boosting on different UCI datasets.
    4. As I already have mentioned in the paper, genetic learner computational complexity depends only on genetic algorithm parameters (such as population size or epoch count) and, therefore, genetic tree boosting algorithm also has an opportunity to significantly outperform existing approaches in learning time.

    Also I’m going to remember my API design skills and implement a library for boosting both tree-based classifiers and tree-based regressors in some convinient way. Keep reading the blog to know more =)

     

    Monday
    Apr062009

    Weighted voting for president elections

    It's totally true that choosing new president for your country is quite important decision. And it's also true that some people are clever, and some are not. I understand all that democracy stuff, but why should we care about some junky opinion as much, as about the Noble Prize laureate's one. Solution of this problem is well known in machine learning. It's called weighted voting and consists of assigning weights to the each voter and then combining votes together with weights being taken into account.

    The only question is: how should we assign weights to voters? What's the formula, man? Well, my answer is simple: I don't know yet. But weight should probably depend on education, average earnings, number of children in family and all that social stuff. We don't want some sociopath make decisions for us, right?

    Tuesday
    Mar312009

    Scientific OS

    At this time I'm participating in a contest provided by Russian search engine called Yandex. In the contest participants should build an algorithm which can rank web search results by relevance well, using features of both found documents and search query itself. Later I'll post about my contest experience, but today I want to write about something else.

    While solving such tasks you always have to deal with large amounts of data in different formats. It often involves implementing some converters etc. For example, in the Yandex contest I've tried to apply SVMRank algorithm to the given data using SVMLight first. That forced me to write a small program for sorting input data by query id because of SVMLight requirements. Then, of course, I've added ability to restore original data order to my program because results are checked by the contest automatic checking system which knows anything about me reordering the data. Then I've decided to apply PCA to the input data before applying SVMRank algorithm (warning! due to the unusual feature value distributions result was terrible!). MATLAB contains freely available PCA implementation, so I've added ability to export data to format supported by MATLAB and import it back to my program. Then I repeated the following sequence of actions several times (for each number of selected PCA components): convert data to MATLAB, load attributes, apply PCA (fortunately, first 3 steps should be done only once), save first K attributes, convert back to SVMLight format, sort by query id, learn SVMRank model. Then I did the same for the test data except of replacing model learning with rank prediction and original order restoring. Of course, I could make a script from that action sequence and run it several times, but it has it's own drawbacks. For example, I know nothing about integrating external applications with MATLAB and learning that stuff can only increase amount of time required to perform the whole process.So, that's all boring, dreary and it distracts you from the actual task. What can we do about it? I've just thought: why there is no special operating system for researchers, scientists and engineers that can make their lifes much more easier. Here is a short list of what I'd like to see in it:

    Primary (related to the previous paragraph)

    1. Math engine (some kewl mixture of MATLAB and MAPLE) should be integrated with OS environment. I want ability to run any math command in any shell opened and get result immediately (well, it depends on command). And all the calculated math data should be accessible through math engine from any other OS part.
    2. There should be comfortable math engine API available for most popular programming languages.
    3. There should be some standard for all the data types used in the math engine.
    4. All the external math algorithms should be integrated in the OS math engine. All that algorithms should use the same math data types provided by math engine (see 3).

    Secondary (all the ideas that came to my head while writing "Primary" section):

    1. I'd like to see some semantic relationship database integrated to the OS where you can hold dependencies between papers, researchers, lectures, presentations, data sets, experiment results etc (take a look at the MSR Research-Output Repository Platform). For example, it would be rather kewl to visualize references between articles related to some research area as a graph. It allows you to filter the most important works quickly and opens many other possibilities.
    2. There should be some useful network interaction mechanisms. For example, I can stream experiment results in real-time mode to my colleague who will immediately visualize it. Or I can implement some math algorithm using OS math engine and ask my colleagues to share their computing capabilities with me. Or I can submit it to some computing server and then stream required data to it. And all of it completely transparently. And there should be, of course, some mechanism for relationship synchronization between network nodes.
    3. And, finally, it would be a pleasure to have some comfortable LaTeX editor and compiler with all the necessary stuff.

    At least, unification of the data formats and math engine will help with exchanging data, prototypes, algorithms etc between researchers. Nowadays it turns into "Let's try to build it" game very often.
    Well, it would be nice to hear other ideas and opinions. And then, may be, someone will read this post, get inspired and 5 to 7 years later new operating system will totally change all the scientific community =)
    By the way, my team in the Yandex contest has the same name as this blog.

    Wednesday
    Mar182009

    Invent now!

    Here is one simple example of what can one very-very talented guy do with computer vision technologies. Btw, there are a lot of other interesting and amazing stuff at TED. Feel inspired, invent and have fun!


    I don't have much time right now to improve RoboZZle solver or just write something interesting into blog. Currently I'm stuck with finally implementing working BRM face alignment at work, doing some university stuff like implementing simple linear algebra algorithms for BlueGene/P supercomputer, finishing article about genetic algorithms in boosting etc. But when something amazing will happen, I'll report it immediately :)

    Thursday
    Mar052009

    Future is almost now

    Microsoft released wonderful video as the part of their Microsoft’s Future Vision Series project. In that video you can see how amazing computer (especially computer vision) technologies will change our lives in approximately 10 years.

    I’ll remember that and compare with ground truth 10 years later.

    Wednesday
    Feb112009

    C++ vs C#

    I totally hate C++. And every day my hate grows up. But that is not just about my programming language religion. We should look deeper.

    I know C++ pretty well. I’ve read all the stuff like Effective C++, More Effective C++ and Modern C++ Design and understand all of it. And it was really hard. Now I can do amazing things with templates and write robust error-free code. Do I want it? No. It is still really hard for me (and for everyone, I guess) to write, read and maintain such a code. That snippet looks really ugly in C++:
    for (std::vector<double>::iterator it = values.begin();
    it != values.end();
    ++it)
    DoStuff(*it);
    And here is it’s C# equivalent:
    foreach (double value in values)
    DoStuff(value);
    And I’m pretty sure (I have some experience, you know) that 99% of equivalent code pieces are much more simple in C# than in C++. When you read and maintain a lot of code, it is very important.
    Next, I hate includes. There is no good reason for good programming language to have such a non-obvious and non-robust way to incorporate different subsystems or just pieces of code. Really! IDE can do it for me! In .NET there are no problems with include paths. And I can just reference some library to get access to it’s components. No precompiled headers, no include (or lib) directory setup. That’s all about rapid and easy development.
    And the last thing that’s important for me: in most cases (except of writting really performance-sensitive code) you should not care about memory deallocation at all. Execution environment can do it for you. Rather quickly. And it does. How much errors have you ever had with memory allocation and deallocation in C++? I have a lot. More than any other kinds of errors. It’s like a pain.

    But, of course, C++ is fast. Really fast. In computer vision and 3D rendering there are a lot of very performance-sensitive stuff. And the only choise you have is to code it in C++. But don’t forget: there are wrappers for everything. There is XNA, Emgu.CV and CUDA.NET. There is much more than that. You can still write performance-sensitive code in a managed environment with all it’s benefits.

    I forget to mention wonderful tools for .NET developer such a Resharper, FxCop, StyleCop and many others. There are no technical possibilities to create such a useful and powerful tool for a non-managed language like C++.

    Let’s sum up:
    .NET (C#)
    1. Easy language to code, read and maintain.
    2. No problems with library version control, includes and stuff.
    3. Fast and robust garbage collection.
    4. Very powerful standart library.
    5. Cool IDE and tools.
    6. Quite fast.
    C++
    1. Really fast.
    2. There is Boost.
    So that’s my proposal: when you want to code something, think about implementing it in C# first and consider C++ as the last option. I’m pretty sure you’ll like it as much as I do.