Plot metrics
Plot metrics save a plot in the chosen output directory.
LongTailDistr(out_dir='.', file_name='long_tail_distr', on='truth', format='png', overwrite=False)
Bases: PlotMetric
This metric generates the Long Tail Distribution plot and saves it in the output directory with the specified file name. The plot can be generated for either the truth set or the predictions set, depending on the on parameter:
- on = 'truth': in this case the long tail distribution is useful to see which are the most popular items (the most rated ones)
- on = 'pred': in this case the long tail distribution is useful to see which are the most recommended items
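As a rough illustration of what the long tail distribution is built from, the sketch below counts how often each item occurs and orders the items from most to least frequent. The `interactions` list of (user, item) pairs and the `long_tail_counts` helper are hypothetical stand-ins, not part of the library; the same counting applies whether the pairs come from the truth set or from the predictions.

```python
from collections import Counter


def long_tail_counts(interactions):
    """Return (item_id, count) pairs ordered from most to least frequent.

    `interactions` is a hypothetical list of (user_id, item_id) pairs,
    standing in for either the truth set or the predictions set.
    """
    counts = Counter(item_id for _user_id, item_id in interactions)
    # most_common() sorts by descending count: the head of the list holds
    # the popular items, the tail holds the niche ones.
    return counts.most_common()


truth = [
    ("u1", "i1"), ("u2", "i1"), ("u3", "i1"),
    ("u1", "i2"), ("u2", "i2"),
    ("u3", "i3"),
]
ranked = long_tail_counts(truth)
```

Plotting the counts in this order gives the typical long tail shape: a few items with many occurrences followed by many items with few.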
The plot file will be saved as out_dir/file_name.format
Since multiple splits can be evaluated at once, the overwrite parameter comes into play: if it is set to False, files with the same name will be saved as file_name (1).format, file_name (2).format, etc., so that a plot is generated for every split without overwriting any previously generated file
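The naming rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `resolve_plot_path` and its `file_format` argument are hypothetical names, and the sketch assumes that with overwrite=True the plain out_dir/file_name.format path is always reused.

```python
import tempfile
from pathlib import Path


def resolve_plot_path(out_dir, file_name, file_format, overwrite=False):
    """Pick the path a plot would be saved to (hypothetical helper)."""
    base = Path(out_dir)
    candidate = base / f"{file_name}.{file_format}"
    if overwrite:
        # Assumption: the existing file is simply replaced.
        return candidate
    counter = 1
    # Try 'file_name (1)', 'file_name (2)', ... until a free name is found.
    while candidate.exists():
        candidate = base / f"{file_name} ({counter}).{file_format}"
        counter += 1
    return candidate


# Simulate three splits saving a plot with the same file name.
with tempfile.TemporaryDirectory() as tmp:
    saved = []
    for _split in range(3):
        path = resolve_plot_path(tmp, "long_tail_distr", "png")
        path.touch()  # stands in for actually writing the plot
        saved.append(path.name)
```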
PARAMETER | DESCRIPTION |
---|---|
out_dir | Directory where the plot will be saved. Default is '.', meaning that the plot will be saved in the same directory where the Python script is being executed |
file_name | Name of the plot file. Default is 'long_tail_distr' |
on | Set on which the Long Tail Distribution plot will be generated. Accepted values are 'truth' or 'pred' |
format | Format of the plot file. Could be 'jpg', 'svg', 'png'. Default is 'png' |
overwrite | Parameter which specifies whether the saved plot must overwrite any file that has the same name ('file_name.format'). Default is False |
RAISES | DESCRIPTION |
---|---|
ValueError | Exception raised when an invalid value for the 'on' parameter is specified |
Source code in clayrs/evaluation/metrics/plot_metrics.py
PopRatioProfileVsRecs(user_groups, user_profiles, original_ratings, out_dir='.', file_name='pop_ratio_profile_vs_recs', pop_percentage=0.2, store_frame=False, format='png', overwrite=False)
Bases: GroupFairnessMetric, PlotMetric
This metric generates a plot where users are split into groups and, for every group, a boxplot comparing profile popularity ratio and recommendations popularity ratio is drawn.
Users are split into groups based on the user_groups parameter, a dict with the names of the groups as keys and, as values, the percentage of users each group must contain. For example:
user_groups = {'popular_users': 0.3, 'medium_popular_users': 0.2, 'low_popular_users': 0.5}
Every user will be inserted in a group based on how many popular items the user has rated (in relation to the percentage of users we specified as value in the dictionary):
- users with many popular items will be inserted into the first group
- users with niche items rated will be inserted into one of the last groups.
In general, users are grouped by \(Popularity\_ratio\) in descending order. \(Popularity\_ratio\) for a single user \(u\) is defined as:
\[Popularity\_ratio_u = \frac{n\_most\_popular\_items\_rated_u}{n\_items\_rated_u}\]
The most popular items are the first pop_percentage% of all items, ordered by popularity in descending order.
The popularity of an item is defined as the number of times it is rated in the original_ratings parameter divided by the total number of users in original_ratings.
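The definitions above can be sketched in a few lines of plain Python. This is only an illustration: the helper names and the list of (user, item) pairs standing in for original_ratings are hypothetical, rounding the size of the "most popular" set up with `math.ceil` is an assumption, and so is reading \(Popularity\_ratio\) as the share of a user's rated items that belong to the most popular set.

```python
import math
from collections import Counter


def item_popularity(ratings):
    """Popularity = times an item is rated / total number of users."""
    n_users = len({user for user, _item in ratings})
    counts = Counter(item for _user, item in ratings)
    return {item: count / n_users for item, count in counts.items()}


def most_popular_items(popularity, pop_percentage=0.2):
    """First pop_percentage of items, ordered by descending popularity."""
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    n_top = math.ceil(len(ranked) * pop_percentage)  # rounding is an assumption
    return set(ranked[:n_top])


def popularity_ratio(user_items, popular):
    """Assumed definition: popular items rated / all items rated by the user."""
    rated = set(user_items)
    return len(rated & popular) / len(rated)


# Hypothetical (user, item) pairs standing in for original_ratings.
ratings = [
    ("u1", "i1"), ("u1", "i2"),
    ("u2", "i1"), ("u2", "i3"),
    ("u3", "i1"), ("u3", "i4"), ("u3", "i5"),
]
popularity = item_popularity(ratings)
popular = most_popular_items(popularity, pop_percentage=0.2)

profiles = {}
for user, item in ratings:
    profiles.setdefault(user, []).append(item)
ratios = {user: popularity_ratio(items, popular) for user, items in profiles.items()}
```

Here 'i1' is rated by all three users, so it is the only item in the top 20%, and each user's ratio is the fraction of their profile made up of that item.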
It can happen that no recommendations are available for a particular user of a group: in that case the user is skipped and is not considered in the \(Popularity\_ratio\) computation of its group. If no user of a group has recommendations available, a warning is printed and the whole group is not considered.
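The grouping and skipping rules described above can be sketched as follows. Everything here is illustrative: the helper names are hypothetical, group sizes are obtained with `round` (an assumption), the last group simply takes the remaining users, and the warning is emitted with `warnings.warn`, which may differ from how the library reports it.

```python
import warnings


def split_users_in_groups(ratio_by_user, user_groups):
    """Order users by descending popularity ratio and cut them into groups."""
    ranked = sorted(ratio_by_user, key=ratio_by_user.get, reverse=True)
    names = list(user_groups)
    groups, start = {}, 0
    for idx, name in enumerate(names):
        if idx == len(names) - 1:
            end = len(ranked)  # last group takes whoever is left
        else:
            end = start + round(len(ranked) * user_groups[name])
        groups[name] = ranked[start:end]
        start = end
    return groups


def keep_users_with_recs(groups, recs_by_user):
    """Skip users without recommendations; drop groups left empty."""
    kept = {}
    for name, users in groups.items():
        valid = [user for user in users if recs_by_user.get(user)]
        if not valid:
            warnings.warn(f"No recommendations for group '{name}': group skipped")
            continue
        kept[name] = valid
    return kept


user_groups = {"popular_users": 0.3, "medium_popular_users": 0.2, "low_popular_users": 0.5}
ratio_by_user = {f"u{i}": i / 10 for i in range(10)}  # hypothetical ratios
groups = split_users_in_groups(ratio_by_user, user_groups)

# Hypothetical recommendations: only u9, u8 and u4 have any.
recs_by_user = {"u9": ["i1"], "u8": ["i2"], "u7": [], "u4": ["i3"]}
kept = keep_users_with_recs(groups, recs_by_user)
```

With these inputs the middle group has no user with recommendations, so it triggers the warning and is left out of the result.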
The plot file will be saved as out_dir/file_name.format
Since multiple splits can be evaluated at once, the overwrite parameter comes into play: if it is set to False, files with the same name will be saved as file_name (1).format, file_name (2).format, etc., so that a plot is generated for every split without overwriting any previously generated file
Thanks to the store_frame parameter, it is also possible to store a csv file containing the calculations done in order to build every boxplot. It will be saved in the same directory and with the same file name as the plot itself, but with the .csv extension, i.e. as out_dir/file_name.csv
Please note: once computed, the PopRatioProfileVsRecs class needs to be re-instantiated in case you want to compute it again!
PARAMETER | DESCRIPTION |
---|---|
user_groups | Dict containing group names as keys and percentage of users as values, used to split users into groups. Users with more popular items rated are placed in the first group, users with slightly less popular items rated in the second one, etc. |
user_profiles | one or more |
original_ratings | |
out_dir | Directory where the plot will be saved. Default is '.', meaning that the plot will be saved in the same directory where the Python script is being executed |
file_name | Name of the plot file. Default is 'pop_ratio_profile_vs_recs' |
pop_percentage | Fraction of items to be considered as 'most popular items'. Default is 0.2 |
store_frame | True if you want to store the calculations done in order to build every boxplot in a csv file, False otherwise. Default is False |
format | Format of the plot file. Could be 'jpg', 'svg', 'png'. Default is 'png' |
overwrite | Parameter which specifies whether the saved plot must overwrite any file that has the same name ('file_name.format'). Default is False |
Source code in clayrs/evaluation/metrics/plot_metrics.py
PopRecsCorrelation(original_ratings, out_dir='.', file_name='pop_recs_correlation', mode='both', format='png', overwrite=False)
Bases: PlotMetric
This metric generates a plot with the popularity of each item on the X-axis and its recommendation frequency on the Y-axis, making it easy to see the correlation between how popular (or niche) an item is and how many times it is recommended.
The popularity of an item is defined as the number of times it is rated in the original_ratings parameter divided by the total number of users in original_ratings.
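To make the two axes concrete, the sketch below derives, for every rated item, its popularity (as defined above) and the number of times it is recommended. The function name and both lists of (user, item) pairs are hypothetical stand-ins for original_ratings and for the recommendations being evaluated.

```python
from collections import Counter


def popularity_and_rec_frequency(original_ratings, recommendations):
    """Return parallel lists: items, their popularity, their rec frequency."""
    n_users = len({user for user, _item in original_ratings})
    rated = Counter(item for _user, item in original_ratings)
    recommended = Counter(item for _user, item in recommendations)
    items = sorted(rated)
    popularity = [rated[item] / n_users for item in items]
    # A Counter returns 0 for missing keys, so items that are never
    # recommended get a frequency of zero.
    frequency = [recommended[item] for item in items]
    return items, popularity, frequency


original_ratings = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i3")]
recommendations = [("u1", "i1"), ("u2", "i1"), ("u2", "i2")]
items, popularity, frequency = popularity_and_rec_frequency(
    original_ratings, recommendations
)
```

In this toy data 'i3' is rated but never recommended, so it appears with a frequency of zero.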
The plot file will be saved as out_dir/file_name.format
Since multiple splits can be evaluated at once, the overwrite parameter comes into play: if it is set to False, files with the same name will be saved as file_name (1).format, file_name (2).format, etc., so that a plot is generated for every split without overwriting any previously generated file
There are cases in which some items are never recommended, so zero recommendations can appear in the graph. This behaviour can be changed with the mode parameter:
- mode='both': two graphs will be created, the first one containing any zero recommendations, the second one excluding them. This additional graph will be stored as out_dir/file_name_no_zeros.format (the string '_no_zeros' is automatically appended to the chosen file_name)
- mode='w_zeros': only a graph containing any zero recommendations will be created
- mode='no_zeros': only a graph excluding zero recommendations will be created. The graph will be saved as out_dir/file_name_no_zeros.format (the string '_no_zeros' is automatically appended to the chosen file_name)
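The effect of the three modes can be summarised with a small sketch that maps each output file name to the points it would contain. The `plots_for_mode` helper and its `file_format` argument are hypothetical, and raising a ValueError for an unknown mode is an assumption of this sketch, not documented behaviour.

```python
def plots_for_mode(popularity, recommendations, mode="both",
                   file_name="pop_recs_correlation", file_format="png"):
    """Map each output file name to the (x, y) values it would plot."""
    if mode not in ("both", "w_zeros", "no_zeros"):
        # Assumption: the real class may handle invalid modes differently.
        raise ValueError(f"Invalid mode: {mode!r}")

    plots = {}
    if mode in ("both", "w_zeros"):
        # Every item is kept, including those never recommended.
        plots[f"{file_name}.{file_format}"] = (list(popularity), list(recommendations))
    if mode in ("both", "no_zeros"):
        # Items with zero recommendations are dropped from both axes.
        kept = [(p, r) for p, r in zip(popularity, recommendations) if r != 0]
        plots[f"{file_name}_no_zeros.{file_format}"] = (
            [p for p, _r in kept],
            [r for _p, r in kept],
        )
    return plots


plots = plots_for_mode([1.0, 0.5, 0.5], [2, 1, 0], mode="both")
```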
PARAMETER | DESCRIPTION |
---|---|
original_ratings | |
out_dir | Directory where the plot will be saved. Default is '.', meaning that the plot will be saved in the same directory where the Python script is being executed |
file_name | Name of the plot file. Default is 'pop_recs_correlation' |
mode | Parameter which dictates which graph must be created. Default is 'both', so both the graph including any zero recommendations and the graph excluding them will be created. Check the class documentation for more |
format | Format of the plot file. Could be 'jpg', 'svg', 'png'. Default is 'png' |
overwrite | Parameter which specifies whether the saved plot must overwrite any file that has the same name ('file_name.format'). Default is False |
Source code in clayrs/evaluation/metrics/plot_metrics.py
build_no_zeros_plot(popularity, recommendations)
Method which builds and saves the plot excluding any zero recommendations. It saves the plot as out_dir/file_name_no_zeros.format, according to the values passed to the constructor. Note that the '_no_zeros' string is automatically appended to the chosen file_name
PARAMETER | DESCRIPTION |
---|---|
popularity | x-axis values representing popularity of every item |
recommendations | y-axis values representing number of times every item has been recommended |
Source code in clayrs/evaluation/metrics/plot_metrics.py
build_plot(x, y, title)
Method which builds a matplotlib plot given x-axis values, y-axis values and the title of the plot. X-axis label and Y-axis label are hard-coded as 'Popularity' and 'Recommendation frequency' respectively.
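A minimal sketch of a function with this shape is shown below. It is not the library's source: `build_plot_sketch` is a hypothetical name, drawing the points with a scatter plot is an assumption, and the 'Agg' backend is selected only so that the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt


def build_plot_sketch(x, y, title):
    """Build a figure with hard-coded axis labels and return it."""
    fig, ax = plt.subplots()
    ax.scatter(x, y)  # assumption: the real method may draw the data differently
    ax.set_title(title)
    ax.set_xlabel("Popularity")
    ax.set_ylabel("Recommendation frequency")
    return fig


fig = build_plot_sketch([0.1, 0.5, 1.0], [0, 3, 7], "Popularity vs. recommendations")
```

Returning the figure instead of saving it inside the function lets the caller decide the file name, for instance appending '_no_zeros' before calling `fig.savefig(...)`.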
PARAMETER | DESCRIPTION |
---|---|
x | List containing x-axis values |
y | List containing y-axis values |
title | Title of the plot |
RETURNS | DESCRIPTION |
---|---|
matplotlib.figure.Figure | The matplotlib figure |
Source code in clayrs/evaluation/metrics/plot_metrics.py
build_w_zeros_plot(popularity, recommendations)
Method which builds and saves the plot containing any zero recommendations. It saves the plot as out_dir/file_name.format, according to the values passed to the constructor
PARAMETER | DESCRIPTION |
---|---|
popularity | x-axis values representing popularity of every item |
recommendations | y-axis values representing number of times every item has been recommended |
Source code in clayrs/evaluation/metrics/plot_metrics.py