    Saturday, May 16, 2009

    RankBoost and DCG

    So I've finally sucked in the Yandex contest I've already posted about: my best DCG on the test set is 4.198, which is not too good. I think there were two major problems: lack of time and, unfortunately, lack of good ideas. Still, I want to share some things I learned while participating in it.

    One of the ranking algorithms I tried was RankBoost with binary weak rankers, originally proposed by Freund, Iyer, Schapire, and Singer. A binary ranker on a raw feature can only separate documents with a high value of that feature from documents with a low value of it. To also be able to separate, for example, documents whose feature value lies somewhere around 0.5 from all the others, I ran additional experiments with ranking features that are functions of other features. For that purpose I chose a truncated Gaussian with mean 0.5 and a multimodal sine-based function on [0,1]:
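As a rough sketch of what such transformed features and a binary ranker on top of them could look like: the width `sigma`, the number of peaks `k`, the squared-sine form, and the helper `binary_ranker` are my placeholders for illustration, not the exact functions or values used in the experiments.

```python
import math

def truncated_gaussian(x, mean=0.5, sigma=0.15):
    """Gaussian bump centred at `mean`, cut to zero outside [0, 1].
    ASSUMPTION: sigma=0.15 is an illustrative width, not the experimental value."""
    if x < 0.0 or x > 1.0:
        return 0.0
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def multimodal_sine(x, k=3):
    """Sine-based function with k peaks on [0, 1].
    ASSUMPTION: sin^2(k*pi*x) is one possible multimodal shape."""
    return math.sin(k * math.pi * x) ** 2

def binary_ranker(transform, threshold):
    """Binary weak ranker: 1 if the transformed feature exceeds the threshold, else 0."""
    return lambda x: 1 if transform(x) > threshold else 0

# A ranker on the Gaussian-transformed feature fires only near 0.5,
# which a plain threshold on the raw feature cannot express.
h = binary_ranker(truncated_gaussian, threshold=0.5)
near_half = [h(0.05), h(0.5), h(0.95)]
```

With these placeholder settings the ranker returns 1 for a feature value of 0.5 and 0 for values near either end of the interval.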
    Here are plots of the experiment results in terms of the RankBoost performance value and Yandex DCG:

    I've drawn three conclusions:

    1. Using only the original features to build weak rankers sucks.
    2. Using weak rankers based on functions of features is slightly better.
    3. The whole approach still sucks in terms of DCG.
    I should have tried RankBoost with the concave learners proposed there.

    Oh, I forgot to mention: Yandex DCG for a query can be calculated like this:

    The final DCG value is then obtained by summing the DCGs of all the queries and dividing the sum by the number of queries.
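The averaging step can be sketched in a few lines. Note the assumption: the per-query `dcg` below uses the textbook discount rel / log2(rank + 1), which may differ from the exact Yandex definition; only the averaging in `mean_dcg` follows directly from the description above. Both function names are mine.

```python
import math

def dcg(relevances):
    """DCG of one query; `relevances` are the true labels in ranked order
    (document the model scored highest comes first).
    ASSUMPTION: textbook rel / log2(rank + 1) discount, not necessarily Yandex's."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def mean_dcg(per_query_relevances):
    """Sum of the per-query DCGs divided by the number of queries."""
    return sum(dcg(r) for r in per_query_relevances) / len(per_query_relevances)

# Two toy queries with made-up relevance labels.
score = mean_dcg([[1.0, 0.0, 1.0], [0.0]])
```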