A Rounded Evaluation of Recommender Systems


Training data, evaluation scripts, and rules can be found in the official challenge repository; literature, background on the challenge, and related industry use cases can be found in the challenge paper pre-print.


For the submission format and the general rules of the contest, please consult the relevant section in the README.

EvalRS Leaderboard (Phase 2)

Note: Values for individual metrics (e.g. Hit-Rate) reflect their un-normalized value as per Phase 1. Score reflects the aggregated score inclusive of normalization.

Some charts to better understand how models perform

We plot MRR against the other metrics to see how the "main" metric relates to them. Note that MRED is multiplied by -1 in these plots to make the relationship easier to read: a HIGHER plotted MRED means more unequal performance across partitions.
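To make the sign flip concrete, here is a minimal Python sketch of how MRR and a MRED-style value could be computed and then negated for plotting. It assumes MRED is the negated mean absolute gap between each partition's miss rate and the overall miss rate; the official evaluation scripts in the challenge repository are the reference and may differ in detail. All function names and the toy data are hypothetical.

```python
def mrr(all_predictions, targets):
    """Mean reciprocal rank of each target in its ranked prediction list."""
    total = 0.0
    for predictions, target in zip(all_predictions, targets):
        for rank, item in enumerate(predictions, start=1):
            if item == target:
                total += 1.0 / rank
                break  # a target that is never found contributes 0
    return total / len(targets)


def miss_rate(all_predictions, targets):
    """Fraction of cases where the target is absent from the predictions."""
    misses = sum(1 for p, t in zip(all_predictions, targets) if t not in p)
    return misses / len(targets)


def mred(all_predictions, targets, groups):
    """Assumed definition: negated mean absolute gap between each
    partition's miss rate and the overall miss rate (0 is best)."""
    overall = miss_rate(all_predictions, targets)
    by_group = {}
    for p, t, g in zip(all_predictions, targets, groups):
        preds, tgts = by_group.setdefault(g, ([], []))
        preds.append(p)
        tgts.append(t)
    gaps = [abs(miss_rate(p, t) - overall) for p, t in by_group.values()]
    return -sum(gaps) / len(gaps)


def mred_for_plot(value):
    """Flip the sign so that higher plotted values mean more inequality."""
    return -1 * value


# Toy data: the model only hits for the first partition.
predictions = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
targets = ["a", "d", "x", "y"]
countries = ["US", "US", "IT", "IT"]

score_mrr = mrr(predictions, targets)
score_mred = mred(predictions, targets, countries)
plotted_mred = mred_for_plot(score_mred)
```

Under this assumed sign convention, a model that treats all partitions equally has a MRED of 0, and the plotted value grows as the gap between partitions widens.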

MRR and Country MRED

MRR and Gender MRED

EvalRS Leaderboard (Phase 1)