Tuesday, February 10, 2009

Interquartile Mean Analysis

Not willing to leave well enough alone, I went ahead and did the analysis of the interquartile mean (IQ) values for rounds and NDCGs, and I'm quite glad I did. While the IQ rounds were not much better than the mean rounds, the IQ NDCGs were consistently around 4% better than the mean NDCGs. The graph below shows the difference between mean and IQ NDCGs vs. the total number of examples seen for LETOR seeding using l10.


The interquartile means clearly dominate the mean over all examples ("All"). With this bump, the advantage of using LETOR seeds is now more pronounced, as seen below.
Again, the graph shows NDCG vs. total examples seen, in this case the IQ NDCGs for l10 and the baseline.
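
For anyone who wants to reproduce the IQ numbers, here is a minimal sketch of an interquartile mean. It is not the exact code behind the graphs. The function name and the sample values are made up for illustration, and it assumes the simple variant that drops the lowest and highest quarter of the sorted values (rounding the quarter size down) rather than weighting the boundary points fractionally.

```python
def interquartile_mean(values):
    """Mean of the middle half of the values.

    Assumption: the quarter size is rounded down (n // 4), so for small
    samples a few boundary values stay in rather than being weighted.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    k = n // 4                 # number of values trimmed from each end
    middle = xs[k:n - k]       # the interquartile slice
    return sum(middle) / len(middle)


# Hypothetical values, not real results: one large outlier among eight.
sample = [1, 2, 3, 4, 5, 6, 7, 100]
plain_mean = sum(sample) / len(sample)   # 16.0, pulled up by the outlier
iq_mean = interquartile_mean(sample)     # 4.5, mean of [3, 4, 5, 6]
```

The outlier drags the plain mean far from the bulk of the data, while the interquartile mean ignores it, which is the effect the comparison above relies on.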

Later today or tomorrow I'm going to do an analysis of the distribution of NDCGs across all queries. I'm hoping to show that there are tight standard deviations with a very large total range, which will allow me to argue that discounting outliers is indeed valid.
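
A rough sketch of the summary I have in mind, assuming the per-query NDCGs are available as a plain list of floats. The function name and the input values are hypothetical, and the sample standard deviation is just one reasonable choice of spread measure.

```python
import statistics


def summarize_ndcg(per_query_ndcg):
    """Spread statistics for a list of per-query NDCG values."""
    xs = list(per_query_ndcg)
    if len(xs) < 2:
        raise ValueError("need at least two values for a standard deviation")
    lo, hi = min(xs), max(xs)
    return {
        "mean": statistics.mean(xs),
        "stdev": statistics.stdev(xs),   # sample standard deviation
        "min": lo,
        "max": hi,
        "range": hi - lo,                # total range across queries
    }


# Hypothetical per-query values, not real results.
summary = summarize_ndcg([0.2, 0.4, 0.6])
```

A small "stdev" next to a large "range" would be the pattern that supports discounting outliers.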
