Properties from local dataset
PropertiesFromDataset(mode='only_retrieved_evaluated', field_name_list=None)
Bases: ExogenousPropertiesRetrieval
Exogenous technique that expands each content using the raw source itself as the external source
Two modes are available:
- If mode='only_retrieved_evaluated', all fields for the content will be retrieved from the raw source, discarding the ones with a blank value (i.e. '')
[{'Title': 'Jumanji', 'Year': 1995},
{'Title': 'Toy Story', 'Year': ''}]
json_file = JSONFile(json_path)
PropertiesFromDataset(mode='only_retrieved_evaluated').get_properties(json_file)
# output is a list of PropertiesDict objects with the following values:
# [{'Title': 'Jumanji', 'Year': 1995},
# {'Title': 'Toy Story'}]
- If mode='all', all fields for the content will be retrieved from the raw source, including the ones with a blank value
[{'Title': 'Jumanji', 'Year': 1995},
{'Title': 'Toy Story', 'Year': ''}]
json_file = JSONFile(json_path)
PropertiesFromDataset(mode='all').get_properties(json_file)
# output is a list of PropertiesDict objects with the following values:
# [{'Title': 'Jumanji', 'Year': 1995},
# {'Title': 'Toy Story', 'Year': ''}]
You can also choose exactly which fields are used to expand each content with the field_name_list parameter
[{'Title': 'Jumanji', 'Year': 1995},
{'Title': 'Toy Story', 'Year': ''}]
json_file = JSONFile(json_path)
PropertiesFromDataset(mode='only_retrieved_evaluated',
field_name_list=['Title']).get_properties(json_file)
# output is a list of PropertiesDict objects with the following values:
# [{'Title': 'Jumanji'},
# {'Title': 'Toy Story'}]
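The selection rules described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the ClayRS implementation: the helper name select_properties is hypothetical, it works on plain dictionaries rather than PropertiesDict objects, and silently skipping a requested field that is absent from a record is an assumption.

```python
from typing import Any, Dict, List, Optional


def select_properties(records: List[Dict[str, Any]],
                      mode: str = 'only_retrieved_evaluated',
                      field_name_list: Optional[List[str]] = None) -> List[Dict[str, Any]]:
    """Hypothetical helper mimicking the documented selection rules (not ClayRS code)."""
    if mode not in ('only_retrieved_evaluated', 'all'):
        raise ValueError(f"Unknown mode: {mode!r}")

    selected = []
    for record in records:
        # Restrict to the requested fields, or keep every field when no list is given.
        # Assumption: a requested field missing from the record is skipped.
        fields = record.keys() if field_name_list is None else field_name_list
        props = {name: record[name] for name in fields if name in record}

        # In 'only_retrieved_evaluated' mode, drop fields whose value is blank ('')
        if mode == 'only_retrieved_evaluated':
            props = {name: value for name, value in props.items() if value != ''}

        selected.append(props)
    return selected


raw_source = [{'Title': 'Jumanji', 'Year': 1995},
              {'Title': 'Toy Story', 'Year': ''}]

print(select_properties(raw_source))
print(select_properties(raw_source, mode='all'))
print(select_properties(raw_source, field_name_list=['Title']))
```

The three calls correspond to the three examples above: default mode, mode='all', and a restricted field_name_list.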
PARAMETER | DESCRIPTION |
---|---|
mode | Parameter which specifies which properties should be retrieved. Possible values are ['only_retrieved_evaluated', 'all'] |
field_name_list | List of fields from the raw source that will be retrieved. Useful if you want to expand each content with only a subset of available properties from the local dataset |
Source code in clayrs/content_analyzer/exogenous_properties_retrieval.py