Available Metrics
To specify one of the following metrics as a metric in the `.yaml` file, you can simply use its name (the parsing is not case-sensitive)!
```yaml
...
eval:
  SequentialSideInfoTask:
    - hit
    - map@10
...
```
Ranking Metrics
Info
Each Ranking Metric can evaluate recommendations produced with a cut-off value K: to specify it, simply use `METRIC_NAME@K`.
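For intuition, here is a minimal Python sketch of what the cut-off does (the names below are illustrative, not taken from the library): `METRIC_NAME@K` evaluates the metric only on the first K items of each recommendation list.

```python
# Illustrative sketch of a cut-off: the names (recommendations, cut_off) are
# hypothetical and not part of the library's API.
recommendations = ["item_9", "item_2", "item_7", "item_4", "item_1"]

def cut_off(rec_list, k):
    """Keep only the first k recommended items (the 'top-K')."""
    return rec_list[:k]

print(cut_off(recommendations, 3))  # ['item_9', 'item_2', 'item_7']
```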
Hit (Hit@K)
The Hit metric simply checks whether, for each user, at least one relevant item (that is, an item present in the ground truth of the user) has been recommended.
In math formulas, the Hit for a single user $u$ is computed like this:

$$Hit_u = \begin{cases} 1 & \text{if } Rec_u \cap GT_u \neq \emptyset \\ 0 & \text{otherwise} \end{cases}$$

Where:

- $Rec_u$ is the recommendation list for user $u$
- $GT_u$ is the ground truth for user $u$

And the Hit for the whole model is basically the average Hit over all users:

$$Hit = \frac{\sum_{u \in U} Hit_u}{|U|}$$

Where:

- $U$ is the set containing all users
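As a reference, here is a minimal Python sketch of Hit@K following the formulas above (the helper names and data layout are hypothetical, not the library's API):

```python
# Minimal sketch of Hit@K (hypothetical helper names).
def hit_per_user(rec_list, ground_truth, k):
    """1 if at least one of the top-k recommended items is relevant, else 0."""
    return int(any(item in ground_truth for item in rec_list[:k]))

def hit(rec_lists, ground_truths, k):
    """Average the per-user Hit over all users."""
    users = rec_lists.keys()
    return sum(hit_per_user(rec_lists[u], ground_truths[u], k) for u in users) / len(users)

# Example: user "u1" has a relevant item in its top-3, user "u2" does not.
recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
truth = {"u1": {"b"}, "u2": {"z"}}
print(hit(recs, truth, k=3))  # 0.5
```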
MAP (MAP@K)
The MAP metric (Mean Average Precision) is a ranking metric computed by first calculating the AP (Average Precision) for each user and then taking the average.

The AP is calculated as such for the single user $u$:

$$AP_u = \frac{1}{m_u} \sum_{i=1}^{N_u} P(i) \cdot rel(i)$$

Where:

- $m_u$ is the number of relevant items for the user
- $N_u$ is the number of recommended items for the user
- $P(i)$ is the precision computed at cutoff $i$
- $rel(i)$ is a binary function defined as such:

$$rel(i) = \begin{cases} 1 & \text{if the } i\text{-th recommended item is relevant for the user} \\ 0 & \text{otherwise} \end{cases}$$

After computing the $AP_u$ for each user, the $MAP$ can be computed for the whole model:

$$MAP = \frac{\sum_{u \in U} AP_u}{|U|}$$
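A minimal Python sketch of AP@K and MAP@K under the same assumptions as the Hit example (hypothetical helper names; AP is divided by the number of relevant items for the user, as in the formula above):

```python
# Minimal sketch of AP@K / MAP@K (hypothetical helper names).
def average_precision(rec_list, ground_truth, k):
    """AP for a single user: precision at each rank that holds a relevant item,
    averaged over the number of relevant items for the user."""
    rec_list = rec_list[:k]
    hits = 0
    precision_sum = 0.0
    for i, item in enumerate(rec_list, start=1):
        if item in ground_truth:          # rel(i) = 1
            hits += 1
            precision_sum += hits / i     # P(i), precision at cutoff i
    return precision_sum / len(ground_truth) if ground_truth else 0.0

def mean_average_precision(rec_lists, ground_truths, k):
    """MAP: average AP over all users."""
    users = rec_lists.keys()
    return sum(average_precision(rec_lists[u], ground_truths[u], k) for u in users) / len(users)

recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
truth = {"u1": {"a", "c"}, "u2": {"e"}}
print(mean_average_precision(recs, truth, k=3))  # (0.8333... + 0.5) / 2 ≈ 0.6667
```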
MRR (MRR@K)
The MRR (Mean Reciprocal Rank) computes, for each user, the reciprocal of the position of the first relevant item (that is, an item present in the ground truth of the user) in the recommendation list, and then the average is computed to obtain a model-wise metric.

In math formulas:

$$MRR = \frac{1}{|U|} \sum_{u \in U} \frac{1}{rank_u}$$

Where:

- $U$ is the set containing all users
- $rank_u$ is the position of the first relevant item in the recommendation list of user $u$
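A minimal Python sketch of MRR@K (hypothetical helper names; users with no relevant item in their top-K list are assumed to contribute a reciprocal rank of 0):

```python
# Minimal sketch of MRR@K (hypothetical helper names).
def reciprocal_rank(rec_list, ground_truth, k):
    """1 / position of the first relevant item in the top-k list, 0 if none."""
    for i, item in enumerate(rec_list[:k], start=1):
        if item in ground_truth:
            return 1 / i
    return 0.0

def mean_reciprocal_rank(rec_lists, ground_truths, k):
    """MRR: average reciprocal rank over all users."""
    users = rec_lists.keys()
    return sum(reciprocal_rank(rec_lists[u], ground_truths[u], k) for u in users) / len(users)

recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
truth = {"u1": {"b"}, "u2": {"d"}}
print(mean_reciprocal_rank(recs, truth, k=3))  # (1/2 + 1/1) / 2 = 0.75
```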
NDCG (NDCG@K)
The NDCG (Normalized Discounted Cumulative Gain) metric compares the actual ranking with the ideal one. First, the DCG is computed for each user:

$$DCG_u = \sum_{i=1}^{|Rec_u|} \frac{rel(i)}{\log_2(i + 1)}$$

Where:

- $Rec_u$ is the recommendation list for user $u$
- $rel(i)$ is a binary function defined as such:

$$rel(i) = \begin{cases} 1 & \text{if the } i\text{-th recommended item is relevant for the user} \\ 0 & \text{otherwise} \end{cases}$$

Then the NDCG for a single user is calculated using the following formula:

$$NDCG_u = \frac{DCG_u}{IDCG_u}$$

Where:

- $IDCG_u$ is the $DCG_u$ of the recommendation list sorted in descending order of relevance (representing the ideal ranking)

Finally, the NDCG of the whole model is calculated by averaging the NDCG of each user:

$$NDCG = \frac{\sum_{u \in U} NDCG_u}{|U|}$$

Where:

- $U$ is the set containing all users
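A minimal Python sketch of NDCG@K with binary relevance (hypothetical helper names; the ideal ranking is obtained by moving the relevant items of the top-K list to the front):

```python
import math

# Minimal sketch of NDCG@K with binary relevance (hypothetical helper names).
def dcg(rec_list, ground_truth, k):
    """Discounted cumulative gain: rel(i) / log2(i + 1) summed over the ranks."""
    return sum(
        (1 if item in ground_truth else 0) / math.log2(i + 1)
        for i, item in enumerate(rec_list[:k], start=1)
    )

def ndcg_per_user(rec_list, ground_truth, k):
    """DCG of the actual ranking divided by the DCG of the ideal ranking (IDCG)."""
    ideal = sorted(rec_list[:k], key=lambda item: item in ground_truth, reverse=True)
    idcg = dcg(ideal, ground_truth, k)
    return dcg(rec_list, ground_truth, k) / idcg if idcg > 0 else 0.0

def ndcg(rec_lists, ground_truths, k):
    """NDCG: average per-user NDCG over all users."""
    users = rec_lists.keys()
    return sum(ndcg_per_user(rec_lists[u], ground_truths[u], k) for u in users) / len(users)

recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
truth = {"u1": {"c"}, "u2": {"d"}}
print(ndcg(recs, truth, k=3))  # u1: 0.5 / 1.0 = 0.5, u2: 1.0  ->  0.75
```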
Error Metrics
Error metrics calculate the error the model made in predicting the rating that a particular user would have given to an unseen item.
MAE
The MAE (Mean Absolute Error) computes, in absolute value, the difference between the predicted rating and the actual rating:

$$MAE = \frac{\sum_{u \in U} |r_{u,i} - \hat{r}_{u,i}|}{|U|}$$

Where:

- $U$ is the set containing all users
- $r_{u,i}$ is the actual score given by user $u$ to item $i$
- $\hat{r}_{u,i}$ is the predicted score for user $u$ and item $i$
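A minimal Python sketch of MAE (hypothetical names; for simplicity, a single predicted rating per user is assumed here):

```python
# Minimal sketch of MAE (hypothetical names; one (user, item) prediction per user).
def mae(actual, predicted):
    """Mean absolute difference between actual and predicted ratings."""
    users = actual.keys()
    return sum(abs(actual[u] - predicted[u]) for u in users) / len(users)

actual = {"u1": 4.0, "u2": 2.0}      # true rating given by the user
predicted = {"u1": 3.5, "u2": 3.0}   # rating predicted by the model
print(mae(actual, predicted))        # (0.5 + 1.0) / 2 = 0.75
```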
RMSE
The RMSE (Root Mean Squared Error) computes the squared difference between the predicted rating and the actual rating, and then takes the square root of the average:

$$RMSE = \sqrt{\frac{\sum_{u \in U} (r_{u,i} - \hat{r}_{u,i})^2}{|U|}}$$

Where:

- $U$ is the set containing all users
- $r_{u,i}$ is the actual score given by user $u$ to item $i$
- $\hat{r}_{u,i}$ is the predicted score for user $u$ and item $i$
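A minimal Python sketch of RMSE under the same simplifying assumptions as the MAE example (hypothetical names, one predicted rating per user):

```python
import math

# Minimal sketch of RMSE (hypothetical names; one (user, item) prediction per user).
def rmse(actual, predicted):
    """Square root of the mean squared difference between actual and predicted ratings."""
    users = actual.keys()
    return math.sqrt(sum((actual[u] - predicted[u]) ** 2 for u in users) / len(users))

actual = {"u1": 4.0, "u2": 2.0}
predicted = {"u1": 3.5, "u2": 3.0}
print(rmse(actual, predicted))       # sqrt((0.25 + 1.0) / 2) ≈ 0.7906
```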