HoldOut partitioning technique
HoldOutPartitioning(train_set_size=None, test_set_size=None, shuffle=True, random_state=None, skip_user_error=True)
Bases: Partitioning
Class that performs Hold-Out partitioning
PARAMETER | DESCRIPTION |
---|---|
train_set_size | If float, should be between 0.0 and 1.0 and represents the proportion of the ratings to hold in the train set for each user. If int, represents the absolute number of train samples. If None, the value is automatically set to the complement of the test size. |
test_set_size | If float, should be between 0.0 and 1.0 and represents the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples. If None, the value is set to the complement of the train size. If |
random_state | Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls. |
shuffle | Whether to shuffle the data before splitting. |
skip_user_error | If set to True, users for which data can't be split will be skipped and only a warning will be logged at the end of the split process specifying the number of users skipped. Otherwise, a |
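The size semantics described in the table can be sketched in plain Python. This is an illustration only, not the ClayRS implementation: `resolve_sizes` is a hypothetical helper, rounding proportions down is an assumption, and raising an error when both sizes are None is a choice made for this sketch because the corresponding behaviour is not stated above.

```python
import math
from typing import Tuple, Union

Size = Union[int, float, None]


def resolve_sizes(n_ratings: int,
                  train_set_size: Size = None,
                  test_set_size: Size = None) -> Tuple[int, int]:
    """Hypothetical helper: turn the documented size options into counts."""

    def to_count(size: Union[int, float]) -> int:
        if isinstance(size, float):
            if not 0.0 < size < 1.0:
                raise ValueError("a float size must lie between 0.0 and 1.0")
            # Rounding down is an assumption of this sketch.
            return math.floor(size * n_ratings)
        # An int is taken as an absolute number of samples.
        return size

    if train_set_size is None and test_set_size is None:
        # Assumption of this sketch: require at least one size.
        raise ValueError("set train_set_size or test_set_size")

    if train_set_size is None:
        n_test = to_count(test_set_size)
        n_train = n_ratings - n_test      # complement of the test size
    elif test_set_size is None:
        n_train = to_count(train_set_size)
        n_test = n_ratings - n_train      # complement of the train size
    else:
        n_train = to_count(train_set_size)
        n_test = to_count(test_set_size)

    if n_train <= 0 or n_test <= 0 or n_train + n_test > n_ratings:
        # A user with too few ratings cannot be split with these sizes.
        raise ValueError("the ratings cannot be split with the requested sizes")
    return n_train, n_test
```

In this sketch, a user with a single rating and `train_set_size=0.5` cannot be split, which is the kind of case `skip_user_error` is documented to handle.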
Source code in clayrs/recsys/partitioning.py
split_single(uir_user)
Splits the ratings of a single user into a train set and a test set, holding in the train set the share of the user's interactions determined by the parameters set in the constructor
PARAMETER | DESCRIPTION |
---|---|
uir_user | uir matrix containing interactions of a single user |

RETURNS | DESCRIPTION |
---|---|
List[np.ndarray] | The first list contains a uir matrix for each split constituting the train set of the user |
List[np.ndarray] | The second list contains a uir matrix for each split constituting the test set of the user |
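The shape of the return value can be illustrated with a minimal sketch. It is not the library's code: `split_single_sketch` is a hypothetical function, plain lists of `(user, item, rating)` tuples stand in for the uir NumPy matrix, and the train size is passed directly as a count. Since Hold-Out produces a single split, each returned list holds exactly one element.

```python
import random
from typing import List, Optional, Tuple

Interaction = Tuple[str, str, float]  # (user, item, rating)


def split_single_sketch(uir_user: List[Interaction],
                        n_train: int,
                        shuffle: bool = True,
                        random_state: Optional[int] = None
                        ) -> Tuple[List[List[Interaction]], List[List[Interaction]]]:
    """Hypothetical stand-in for a per-user hold-out split."""
    rows = list(uir_user)  # copy, so the caller's data is left untouched
    if shuffle:
        # A seeded generator makes the shuffle reproducible.
        random.Random(random_state).shuffle(rows)
    train, test = rows[:n_train], rows[n_train:]
    # One split only, so each list contains a single element.
    return [train], [test]


ratings = [("u1", "i1", 4.0), ("u1", "i2", 2.0), ("u1", "i3", 5.0),
           ("u1", "i4", 3.0), ("u1", "i5", 1.0)]
train_list, test_list = split_single_sketch(ratings, n_train=4, random_state=42)
```

Calling the sketch twice with the same `random_state` gives the same split, which mirrors what the `random_state` parameter is documented to provide.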
Source code in clayrs/recsys/partitioning.py