Report class

`Report(output_dir='.', ca_report_filename='ca_report', rs_report_filename='rs_report', eva_report_filename='eva_report')`

Class that generates YAML reports for the whole experiment (or a part of it), depending on the objects passed to the `yaml()` method.

A report will be generated for each module used (Content Analyzer, RecSys, Evaluation).

PARAMETER	DESCRIPTION
`output_dir`	Path of the folder where the generated reports will be saved TYPE: `str` DEFAULT: `'.'`
`ca_report_filename`	Filename of the Content Analyzer report TYPE: `str` DEFAULT: `'ca_report'`
`rs_report_filename`	Filename of the RecSys report TYPE: `str` DEFAULT: `'rs_report'`
`eva_report_filename`	Filename of the evaluation report TYPE: `str` DEFAULT: `'eva_report'`
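Judging from the `yaml()` source listed further down this page, each report is written to `<output_dir>/<filename>.yml`, so the filenames are meant to be passed without an extension. The sketch below only mirrors that path construction; `expected_report_paths` is a hypothetical helper written for illustration and is not part of ClayRS.

```python
import os


def expected_report_paths(output_dir='.',
                          ca_report_filename='ca_report',
                          rs_report_filename='rs_report',
                          eva_report_filename='eva_report'):
    """Hypothetical helper (not part of ClayRS): mirrors how yaml() builds its output paths."""
    filenames = {
        'content_analyzer': ca_report_filename,
        'recsys': rs_report_filename,
        'evaluation': eva_report_filename,
    }
    # The '.yml' suffix is appended to whatever filename was given
    return {module: os.path.join(output_dir, f'{name}.yml')
            for module, name in filenames.items()}


paths = expected_report_paths(output_dir='reports')

# Passing a filename that already ends in '.yml' would double the extension
doubled = expected_report_paths(ca_report_filename='ca_report.yml')
```

Under this reading, `Report(output_dir='reports')` would place the Content Analyzer report at `reports/ca_report.yml`.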

Source code in clayrs/utils/report.py

def __init__(self, output_dir: str = '.',
             ca_report_filename: str = 'ca_report',
             rs_report_filename: str = 'rs_report',
             eva_report_filename: str = 'eva_report'):

    self._output_dir = output_dir
    self._ca_report_filename = ca_report_filename
    self._rs_report_filename = rs_report_filename
    self._eva_report_filename = eva_report_filename

`yaml(content_analyzer=None, original_ratings=None, partitioning_technique=None, recsys=None, eval_model=None)`

Main method responsible for generating the YAML reports based on the objects passed to it:

If content_analyzer is set, then the report for the Content Analyzer will be produced
If at least one of original_ratings, partitioning_technique, recsys is set, then the report for the recsys module will be produced
If eval_model is set, then the report for the evaluation module will be produced

PLEASE NOTE: by setting the recsys parameter, the last experiment conducted will be documented! If no experiment has been conducted in the current run, a ValueError exception is raised!

The same applies to eval_model: the last evaluation run is documented, and an exception is raised if no evaluation has been run.
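The selection rules above can be sketched as a small pure function. This is only an illustration of the three conditions found in the `yaml()` source listed on this page; `reports_to_generate` and the labels it returns are hypothetical and not part of ClayRS, and `object()` stands in for real ClayRS objects.

```python
def reports_to_generate(content_analyzer=None, original_ratings=None,
                        partitioning_technique=None, recsys=None, eval_model=None):
    """Hypothetical helper (not part of ClayRS): lists which reports yaml() would write."""
    selected = []

    if content_analyzer is not None:
        selected.append('content_analyzer')

    # A single one of these three arguments is enough to trigger the RecSys report
    if any(arg is not None for arg in (original_ratings, partitioning_technique, recsys)):
        selected.append('recsys')

    if eval_model is not None:
        selected.append('evaluation')

    return selected


# object() is a placeholder for a real Ratings / Partitioning instance
partial = reports_to_generate(original_ratings=object(), partitioning_technique=object())
print(partial)  # ['recsys']
```

The three checks are independent, so a single call can produce anywhere from zero to three report files.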

Examples:

Generate a report for the Content Analyzer module

>>> from clayrs import content_analyzer as ca
>>> from clayrs import utils as ut
>>> # movies_ca_config = ...  # user defined configuration
>>> content_a = ca.ContentAnalyzer(movies_ca_config)
>>> content_a.fit()  # generate and serialize contents
>>> ut.Report().yaml(content_analyzer=content_a)  # generate yaml

Generate a partial report for the RecSys module

>>> from clayrs import content_analyzer as ca
>>> from clayrs import utils as ut
>>> from clayrs import recsys as rs
>>> ratings = ca.Ratings(ca.CSVFile(ratings_path))
>>> pt = rs.HoldOutPartitioning()
>>> [train], [test] = pt.split_all(ratings)
>>> ut.Report().yaml(original_ratings=ratings, partitioning_technique=pt)

Generate a full report for the RecSys module and evaluation module

>>> from clayrs import content_analyzer as ca
>>> from clayrs import utils as ut
>>> from clayrs import recsys as rs
>>> from clayrs import evaluation as eva
>>>
>>> # Generate recommendations
>>> ratings = ca.Ratings(ca.CSVFile(ratings_path))
>>> pt = rs.HoldOutPartitioning()
>>> [train], [test] = pt.split_all(ratings)
>>> alg = rs.CentroidVector()
>>> cbrs = rs.ContentBasedRS(alg, train_set=train, items_directory=items_path)
>>> rank = cbrs.fit_rank(test, n_recs=10)
>>>
>>> # Evaluate recommendations and generate report
>>> em = eva.EvalModel([rank], [test], metric_list=[eva.Precision(), eva.Recall()])
>>> ut.Report().yaml(original_ratings=ratings,
...                  partitioning_technique=pt,
...                  recsys=cbrs,
...                  eval_model=em)

PARAMETER	DESCRIPTION
`content_analyzer`	`ContentAnalyzer` object used to generate complex representations in the experiment TYPE: `ContentAnalyzer` DEFAULT: `None`
`original_ratings`	`Ratings` object representing the original dataset TYPE: `Ratings` DEFAULT: `None`
`partitioning_technique`	`Partitioning` object used to split the original dataset TYPE: `Partitioning` DEFAULT: `None`
`recsys`	`RecSys` object used to produce recommendations/score predictions. Please note that the latest experiment run will be documented. If no experiment is run, then an exception is thrown TYPE: `RecSys` DEFAULT: `None`
`eval_model`	`EvalModel` object used to evaluate predictions generated. Please note that the latest evaluation run will be documented. If no evaluation is run, then an exception is thrown TYPE: `EvalModel` DEFAULT: `None`

Source code in clayrs/utils/report.py

def yaml(self, content_analyzer: ContentAnalyzer = None,
         original_ratings: Ratings = None,
         partitioning_technique: Partitioning = None,
         recsys: RecSys = None,
         eval_model: EvalModel = None):
    """
    Main method responsible for generating the `YAML` reports based on the objects passed to it:

    * If `content_analyzer` is set, then the report for the Content Analyzer will be produced
    * If at least one of `original_ratings`, `partitioning_technique`, `recsys` is set, then the report for the
    recsys module will be produced
    * If `eval_model` is set, then the report for the evaluation module will be produced

    **PLEASE NOTE**: by setting the `recsys` parameter, the last experiment conducted will be documented! If no
    experiment has been conducted in the current run, a `ValueError` exception is raised!

    * The same applies to `eval_model`

    Examples:

        * Generate a report for the Content Analyzer module
        >>> from clayrs import content_analyzer as ca
        >>> from clayrs import utils as ut
        >>> # movies_ca_config = ...  # user defined configuration
        >>> content_a = ca.ContentAnalyzer(movies_ca_config)
        >>> content_a.fit()  # generate and serialize contents
        >>> ut.Report().yaml(content_analyzer=content_a)  # generate yaml

        * Generate a partial report for the RecSys module
        >>> from clayrs import content_analyzer as ca
        >>> from clayrs import utils as ut
        >>> from clayrs import recsys as rs
        >>> ratings = ca.Ratings(ca.CSVFile(ratings_path))
        >>> pt = rs.HoldOutPartitioning()
        >>> [train], [test] = pt.split_all(ratings)
        >>> ut.Report().yaml(original_ratings=ratings, partitioning_technique=pt)

        * Generate a full report for the RecSys module and evaluation module
        >>> from clayrs import content_analyzer as ca
        >>> from clayrs import utils as ut
        >>> from clayrs import recsys as rs
        >>> from clayrs import evaluation as eva
        >>>
        >>> # Generate recommendations
        >>> ratings = ca.Ratings(ca.CSVFile(ratings_path))
        >>> pt = rs.HoldOutPartitioning()
        >>> [train], [test] = pt.split_all(ratings)
        >>> alg = rs.CentroidVector()
        >>> cbrs = rs.ContentBasedRS(alg, train_set=train, items_directory=items_path)
        >>> rank = cbrs.fit_rank(test, n_recs=10)
        >>>
        >>> # Evaluate recommendations and generate report
        >>> em = eva.EvalModel([rank], [test], metric_list=[eva.Precision(), eva.Recall()])
        >>> ut.Report().yaml(original_ratings=ratings,
        ...                  partitioning_technique=pt,
        ...                  recsys=cbrs,
        ...                  eval_model=em)

    Args:
        content_analyzer: `ContentAnalyzer` object used to generate complex representation in the experiment
        original_ratings: `Ratings` object representing the original dataset
        partitioning_technique: `Partitioning` object used to split the original dataset
        recsys: `RecSys` object used to produce recommendations/score predictions. Please note that the latest
            experiment run will be documented. If no experiment is run, then an exception is thrown
        eval_model: `EvalModel` object used to evaluate predictions generated. Please note that the latest
            evaluation run will be documented. If no evaluation is run, then an exception is thrown
    """

    def represent_none(self, _):
        return self.represent_scalar('tag:yaml.org,2002:null', 'null')

    def dump_yaml(output_path, data):
        # output_path is the full path of the .yml file, not a folder
        with open(output_path, 'w') as yaml_file:
            pyaml.dump(data, yaml_file, sort_dicts=False, safe=True)

    # None values will be represented as 'null' in yaml file.
    # without this, they will simply be represented as an empty string
    pyaml.add_representer(type(None), represent_none)

    if content_analyzer is not None:
        ca_dict = self._report_ca_module(content_analyzer)

        # create folder if it doesn't exist
        Path(self.output_dir).mkdir(parents=True, exist_ok=True)

        output_dir = os.path.join(self.output_dir, f'{self._ca_report_filename}.yml')
        dump_yaml(output_dir, ca_dict)

    if original_ratings is not None or partitioning_technique is not None or recsys is not None:
        rs_dict = self._report_rs_module(original_ratings, partitioning_technique, recsys)

        # create folder if it doesn't exist
        Path(self.output_dir).mkdir(parents=True, exist_ok=True)

        output_dir = os.path.join(self.output_dir, f'{self._rs_report_filename}.yml')
        dump_yaml(output_dir, rs_dict)

    if eval_model is not None:
        eva_dict = self._report_eva_module(eval_model)

        # create folder if it doesn't exist
        Path(self.output_dir).mkdir(parents=True, exist_ok=True)

        output_dir = os.path.join(self.output_dir, f'{self._eva_report_filename}.yml')
        dump_yaml(output_dir, eva_dict)
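Each branch above follows the same pattern: create the output folder if needed, build the `.yml` path, then open it in `'w'` mode. The sketch below reproduces that pattern with the standard library only, to show two consequences: missing nested folders are created, and a second run overwrites the earlier report. `write_report` is a hypothetical stand-in that writes plain text instead of calling `pyaml.dump`.

```python
import os
import tempfile
from pathlib import Path


def write_report(output_dir, filename, text):
    """Hypothetical stand-in for one branch of yaml(); writes plain text, not real YAML via pyaml."""
    # parents=True creates intermediate folders, exist_ok=True makes repeated calls safe
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    report_path = os.path.join(output_dir, f'{filename}.yml')

    # 'w' mode truncates the file, so an existing report is replaced
    with open(report_path, 'w') as report_file:
        report_file.write(text)

    return report_path


with tempfile.TemporaryDirectory() as tmp:
    nested_dir = os.path.join(tmp, 'experiments', 'run_1')  # does not exist yet

    first_path = write_report(nested_dir, 'rs_report', 'run: first\n')
    second_path = write_report(nested_dir, 'rs_report', 'run: second\n')

    with open(second_path) as report_file:
        final_content = report_file.read()
```

If reports from several experiments need to be kept, one option under this reading is to give each `Report` instance a distinct `output_dir` or distinct filenames.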