RankBoost and DCG
Saturday, May 16, 2009 at 1:46PM
So, I ended up doing poorly in the Yandex contest I've already posted about: my best DCG on the test set is 4.198, which is not too good. I think there were two major problems: lack of time and, unfortunately, lack of good ideas. Still, I want to share some things I've learned while participating in it.
One of the ranking algorithms I tried was RankBoost with binary weak rankers, originally proposed by Freund, Iyer, Schapire, and Singer. A binary ranker thresholding a raw feature can only separate documents with a high value of that feature from documents with a low value of it. To also be able to separate, for example, documents whose feature value lies somewhere around 0.5 from all the others, I ran additional experiments using ranking features that are functions of other features. For that purpose I chose a truncated Gaussian with mean 0.5 and a multimodal sine-based function on [0,1].
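To give an idea of what such feature functions might look like, here is a minimal sketch. The exact formulas from the experiments are not reproduced here: the width `sigma`, the number of `periods`, the clipping to [0,1] as the "truncation", and the helper names are all illustrative assumptions.

```python
import math

def gaussian_bump(x, mean=0.5, sigma=0.15):
    """Gaussian-shaped transform peaking at `mean`.

    Assumption: the input is clipped to [0, 1] (the "truncation") and
    sigma=0.15 is an arbitrary illustrative width.
    """
    x = min(max(x, 0.0), 1.0)
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def sine_multimodal(x, periods=3):
    """Sine-based transform with `periods` peaks on [0, 1], scaled into [0, 1].

    Assumption: periods=3 is arbitrary; the peaks sit at 1/6, 1/2 and 5/6.
    """
    x = min(max(x, 0.0), 1.0)
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * periods * x - math.pi / 2.0))

def binary_ranker(transform, threshold):
    """Binary weak ranker: h(x) = 1 if transform(x) > threshold, else 0."""
    def h(x):
        return 1 if transform(x) > threshold else 0
    return h

# A ranker that fires only for feature values near 0.5,
# which a plain threshold on the raw feature cannot express.
near_half = binary_ranker(gaussian_bump, threshold=0.5)
```

With these assumed parameters, `near_half` returns 1 for a feature value of 0.5 and 0 for values close to 0 or 1.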
I plotted the experiment results in terms of the RankBoost performance value and the Yandex DCG.
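For reference, DCG can be computed roughly as follows. This is a sketch assuming the common form where the relevance at position i is divided by log2(i + 1) and the result is averaged over queries; the contest's exact definition (discount, cutoff, normalization) may differ, and the function names are hypothetical.

```python
import math

def dcg(relevances_in_ranked_order):
    """DCG of one ranked list: sum of rel_i / log2(i + 1), positions from 1.

    Assumption: no cutoff and a log2 discount.
    """
    return sum(rel / math.log2(pos + 2)
               for pos, rel in enumerate(relevances_in_ranked_order))

def mean_dcg(queries):
    """Average DCG over queries.

    `queries` is a list of queries, each a list of (score, relevance) pairs;
    documents are ranked by descending score.
    """
    total = 0.0
    for docs in queries:
        ranked = sorted(docs, key=lambda d: d[0], reverse=True)
        total += dcg([rel for _, rel in ranked])
    return total / len(queries)
```

Under this definition, putting a relevant document lower in the list costs more the closer the swap is to the top, which is why a ranker can look fine on pairwise ordering and still do poorly on DCG.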

Personally, I've drawn three conclusions:
- Using only the original features to create weak rankers sucks.
- Using weak rankers based on functions of features is slightly better.
- The whole approach still sucks in terms of DCG.
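The difference between the first two conclusions can be illustrated with a single RankBoost round on toy data. This is only a sketch under stated assumptions: the pairs, thresholds and `bump` transform are made up for illustration, and the weak ranker is picked by maximizing |r| with alpha = 0.5 * ln((1 + r) / (1 - r)), which is one of the selection rules described for RankBoost and not necessarily the one used in the experiments.

```python
import math

def rankboost_round(pairs, weights, candidates):
    """One RankBoost round.

    pairs: list of (worse, better) feature values, `better` should rank higher.
    weights: distribution D over the pairs (sums to 1).
    candidates: weak rankers h(x) -> 0 or 1.
    Returns (index of chosen ranker, alpha, updated weights).
    """
    best_idx, best_r = None, 0.0
    for idx, h in enumerate(candidates):
        # r measures how well h agrees with the weighted pair preferences.
        r = sum(w * (h(better) - h(worse))
                for (worse, better), w in zip(pairs, weights))
        if best_idx is None or abs(r) > abs(best_r):
            best_idx, best_r = idx, r
    # Clamp r away from +/-1 so that alpha stays finite.
    r = max(min(best_r, 1.0 - 1e-12), -1.0 + 1e-12)
    alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
    h = candidates[best_idx]
    # Pairs the chosen ranker orders correctly lose weight.
    new_w = [w * math.exp(alpha * (h(worse) - h(better)))
             for (worse, better), w in zip(pairs, weights)]
    z = sum(new_w)
    return best_idx, alpha, [w / z for w in new_w]

def bump(x, mean=0.5, sigma=0.15):
    """Illustrative Gaussian-shaped feature function (assumed parameters)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

# Toy data: the relevant documents have feature values near 0.5.
pairs = [(0.1, 0.5), (0.9, 0.5), (0.2, 0.45), (0.8, 0.55)]
weights = [0.25] * 4
candidates = [
    lambda x: 1 if x > 0.3 else 0,        # threshold on the raw feature
    lambda x: 1 if x > 0.7 else 0,        # another raw threshold
    lambda x: 1 if bump(x) > 0.5 else 0,  # threshold on a function of the feature
]
best_idx, alpha, new_weights = rankboost_round(pairs, weights, candidates)
```

On this toy data each raw threshold reaches only |r| = 0.5, while the ranker built on the feature function orders all four pairs correctly (r = 1) and gets selected.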
hr0nix | ranking
