Eval Model class

EvalModel(pred_list, truth_list, metric_list)

Class for evaluating a recommender system.

The Evaluation module needs the following parameters:

  • A list of computed rank/predictions (in case multiple splits must be evaluated)
  • A list of truths (in case multiple splits must be evaluated)
  • List of metrics to compute

Obviously, the list of computed ranks/predictions and the list of truths must have the same length: the rank/prediction in position \(i\) will be compared with the truth in position \(i\).

Examples:

>>> import clayrs.evaluation as eva
>>>
>>> em = eva.EvalModel(
>>>         pred_list=rank_list,
>>>         truth_list=truth_list,
>>>         metric_list=[
>>>             eva.NDCG(),
>>>             eva.Precision(),
>>>             eva.RecallAtK(k=5, sys_average='micro')
>>>         ]
>>> )

Then call the fit() method of the instantiated EvalModel to perform the actual evaluation.
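
For instance, when two splits must be evaluated, both lists contain two frames and the frame in position \(i\) of pred_list is compared with the truth in position \(i\). A minimal sketch, where rank_split1, rank_split2, truth_split1 and truth_split2 are hypothetical frames produced by a previous recommendation step:

>>> import clayrs.evaluation as eva
>>>
>>> em = eva.EvalModel(
>>>         pred_list=[rank_split1, rank_split2],
>>>         truth_list=[truth_split1, truth_split2],
>>>         metric_list=[eva.Precision(), eva.Recall()]
>>> )
>>> sys_result, users_result = em.fit()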

PARAMETER DESCRIPTION
pred_list

Recommendation lists to evaluate. It's a list in case multiple splits must be evaluated. Both Rank objects (where items are ordered and the score is not relevant) and Prediction objects (where the predicted score is the predicted rating of the user for a certain item) can be evaluated.

TYPE: Union[List[Prediction], List[Rank]]

truth_list

Ground truths list used to compare recommendations. It's a list in case multiple splits must be evaluated.

TYPE: List[Ratings]

metric_list

List of metrics that will be used to evaluate the specified recommendation lists

TYPE: List[Metric]

RAISES DESCRIPTION
ValueError

ValueError is raised in case pred_list and truth_list are empty or have different lengths
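
For instance, passing lists of different lengths fails immediately. A short sketch, where rank is a hypothetical single Rank frame:

>>> import clayrs.evaluation as eva
>>>
>>> eva.EvalModel(
>>>         pred_list=[rank],
>>>         truth_list=[],  # different length w.r.t. pred_list
>>>         metric_list=[eva.Precision()]
>>> )  # raises ValueError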

Source code in clayrs/evaluation/eval_model.py
def __init__(self,
             pred_list: Union[List[Prediction], List[Rank]],
             truth_list: List[Ratings],
             metric_list: List[Metric]):

    if len(pred_list) == 0 and len(truth_list) == 0:
        raise ValueError("List containing predictions and list containing ground truths are empty!")
    elif len(pred_list) != len(truth_list):
        raise ValueError("List containing predictions and list containing ground truths must have the same length!")

    self._pred_list = pred_list
    self._truth_list = truth_list
    self._metric_list = metric_list

    self._yaml_report_result = None

metric_list: List[Metric] property

List of metrics used to evaluate recommendation lists

RETURNS DESCRIPTION
List[Metric]

The list containing all metrics

pred_list: Union[List[Prediction], List[Rank]] property

List containing the recommendation frames

RETURNS DESCRIPTION
Union[List[Prediction], List[Rank]]

The list containing the recommendation frames

truth_list: List[Ratings] property

List containing ground truths

RETURNS DESCRIPTION
List[Ratings]

The list containing ground truths
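
The three properties simply expose what was passed to the constructor. A short sketch, continuing the EvalModel built in the example above:

>>> em.pred_list    # recommendation frames to evaluate
>>> em.truth_list   # corresponding ground truths
>>> em.metric_list  # metrics that fit() will compute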

append_metric(metric)

Append a metric to the metric list that will be used to evaluate recommendation lists

PARAMETER DESCRIPTION
metric

Metric to append to the metric list

TYPE: Metric

Source code in clayrs/evaluation/eval_model.py
def append_metric(self, metric: Metric):
    """
    Append a metric to the metric list that will be used to evaluate recommendation lists

    Args:
        metric: Metric to append to the metric list
    """
    self._metric_list.append(metric)
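
For instance, to add a further metric after the EvalModel has been built (a sketch continuing the example above):

>>> em.append_metric(eva.NDCG())
>>> em.metric_list  # now also contains the NDCG metric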

fit(user_id_list=None)

This method performs the actual evaluation of the recommendation frames passed as input in the constructor of the class.

In case you want to perform evaluation only for selected users, pass their ids to the user_id_list parameter of this method. Otherwise, all users in the recommendation frames will be considered in the evaluation process.

Examples:

>>> import clayrs.evaluation as eva
>>> selected_users = ['u1', 'u22', 'u3']
>>> em = eva.EvalModel(
>>>         pred_list,
>>>         truth_list,
>>>         metric_list=[eva.Precision(), eva.Recall()]
>>> )
>>> em.fit(selected_users)

The method returns two pandas DataFrames: one containing system results for every metric in the metric list, and one containing per-user results for every eligible metric.
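
Since fit() returns a tuple, the two frames can be unpacked directly. A sketch continuing the example above:

>>> sys_result, users_result = em.fit(selected_users)
>>> sys_result    # system results for every metric
>>> users_result  # per-user results for every eligible metric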

PARAMETER DESCRIPTION
user_id_list

list of string ids for the users to consider in the evaluation (note that only string ids are accepted and not their mapped integers)

TYPE: Optional[List[str]] DEFAULT: None

RETURNS DESCRIPTION
pd.DataFrame

The first DataFrame contains the system result for every metric inside the metric_list

pd.DataFrame

The second DataFrame contains every user's results for every eligible metric inside the metric_list

Source code in clayrs/evaluation/eval_model.py
def fit(self, user_id_list: Optional[List[str]] = None) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """
    This method performs the actual evaluation of the recommendation frames passed as input in the constructor of
    the class

    In case you want to perform evaluation only for selected users, pass their ids to the user_id_list parameter
    of this method. Otherwise, all users in the recommendation frames will be considered in the evaluation process

    Examples:

        >>> import clayrs.evaluation as eva
        >>> selected_users = ['u1', 'u22', 'u3'] # (1)
        >>> em = eva.EvalModel(
        >>>         pred_list,
        >>>         truth_list,
        >>>         metric_list=[eva.Precision(), eva.Recall()]
        >>> )
        >>> em.fit(selected_users)

    The method returns two pandas DataFrames: one containing ***system results*** for every metric in the metric
    list, and one containing ***per-user results*** for every eligible metric

    Args:
        user_id_list: list of string ids for the users to consider in the evaluation (note that only string ids are
            accepted and not their mapped integers)

    Returns:
        The first DataFrame contains the **system result** for every metric inside the metric_list

        The second DataFrame contains **every user's results** for every eligible metric inside the metric_list
    """
    logger.info('Performing evaluation on metrics chosen')

    final_pred_list = []
    final_truth_list = []

    # if user id list is passed, convert it to int if necessary and append the new ratings filtered with
    # only the users of interest
    if user_id_list is not None:

        for pred, truth in zip(self._pred_list, self._truth_list):

            split_users = user_id_list
            split_truth_users = set(truth.user_map.convert_seq_str2int(split_users))
            split_pred_users = set(pred.user_map.convert_seq_str2int(split_users))

            final_pred_list.append(pred.filter_ratings(list(split_pred_users)))
            final_truth_list.append(truth.filter_ratings(list(split_truth_users)))

    # otherwise the original lists are kept
    else:

        final_pred_list = self._pred_list
        final_truth_list = self._truth_list

    sys_result, users_result = MetricEvaluator(final_pred_list, final_truth_list).eval_metrics(self.metric_list)

    # we save the sys result for report yaml
    self._yaml_report_result = sys_result.to_dict(orient='index')

    return sys_result, users_result