Low level techniques

ColorQuantization(n_colors=3, init='k-means++', n_init=10, max_iter=300, tol=0.0001, random_state=None, copy_x=True, algorithm='auto', flatten=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which returns the colors obtained from applying a clustering technique (in this case KMeans)
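As a rough illustration of the idea, the sketch below calls scikit-learn's KMeans directly on the pixels of a small synthetic image and reads the quantized colors from the cluster centers. It does not go through the framework: the image, the channel-last pixel layout, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 20x20 RGB image made of three flat color bands (illustrative data).
image = np.zeros((20, 20, 3), dtype=np.uint8)
image[:, :7] = (255, 0, 0)     # red band
image[:, 7:14] = (0, 255, 0)   # green band
image[:, 14:] = (0, 0, 255)    # blue band

# Treat every pixel as one 3-dimensional sample (R, G, B).
# Assumption: channel-last layout; the framework may arrange pixels differently.
pixels = image.reshape(-1, 3).astype(float)

# Cluster the pixels. `algorithm` is left at the scikit-learn default here,
# because the accepted values differ between scikit-learn versions.
k_means = KMeans(n_clusters=3, init="k-means++", n_init=10, max_iter=300,
                 tol=1e-4, random_state=0)
k_means.fit(pixels)

# One RGB triple per cluster: these are the quantized colors.
colors = k_means.cluster_centers_      # shape (3, 3)

# A flattened variant of the same result, as a 1-D vector.
flat_colors = colors.flatten()         # shape (9,)
```

The order of the rows in `colors` depends on the clustering run, so code consuming them should not rely on a particular color appearing first.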

Arguments for SkLearn KMeans

PARAMETER DESCRIPTION
imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, n_colors: Any = 3, init: Any = "k-means++", n_init: Any = 10, max_iter: Any = 300,
             tol: Any = 1e-4, random_state: Any = None, copy_x: Any = True, algorithm: Any = "auto",
             flatten: bool = False, imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.k_means = KMeans(n_clusters=n_colors, init=init, n_init=n_init, max_iter=max_iter, tol=tol,
                          random_state=random_state, copy_x=copy_x, algorithm=algorithm)
    self.flatten = flatten
    self._repr_string = autorepr(self, inspect.currentframe())

ColorsHist(imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which generates a color histogram for each channel of each RGB image

The technique retrieves all the values for each one of the three RGB channels in the image, flattens them and returns an EmbeddingField representation containing a numpy two-dimensional array with three rows (one for each channel)
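The shape of the result described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the image is synthetic, it is assumed to be stored channel-last, and the wrapping into an EmbeddingField is left out.

```python
import numpy as np

# Synthetic 4x5 RGB image with random 8-bit values (illustrative data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Move the channel axis to the front, then flatten each channel into one row.
# Assumption: channel-last input; a channel-first image would skip moveaxis.
channels = np.moveaxis(image, -1, 0).reshape(3, -1)

# channels[0] holds all red values, channels[1] green, channels[2] blue.
print(channels.shape)
```

Each row has as many entries as the image has pixels, so the row length follows from resize_size rather than from the image content.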

PARAMETER DESCRIPTION
imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self._repr_string = autorepr(self, inspect.currentframe())

CustomFilterConvolution(weights, mode='reflect', cval=0.0, origin=0, flatten=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which implements a custom filter for convolution over an image, using the convolve method from the scipy library

Parameters are the same ones you would pass to the convolve function in scipy together with some framework specific parameters
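To make the role of weights, mode, cval and origin concrete, here is a minimal sketch that calls SciPy directly. It assumes the convolve in question is scipy.ndimage.convolve, whose signature matches these parameters; the 5x5 input and the mean filter are illustrative choices.

```python
import numpy as np
from scipy.ndimage import convolve

# Small single-channel "image" holding a linear ramp of values 0..24.
image = np.arange(25, dtype=float).reshape(5, 5)

# Custom filter: a 3x3 mean (box blur) kernel whose weights sum to 1.
weights = np.full((3, 3), 1.0 / 9.0)

# mode="reflect" mirrors the image at its borders; cval is only used
# when mode="constant"; origin=0 keeps the kernel centered on each pixel.
smoothed = convolve(image, weights=weights, mode="reflect", cval=0.0, origin=0)

# A flattened variant of the same output, as a 1-D vector.
flat = smoothed.flatten()
```

The output keeps the shape of the input, so flattening a 5x5 result gives a vector of 25 values.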

Arguments for Scipy convolve

PARAMETER DESCRIPTION
flatten

whether the output of the technique should be flattened or not

TYPE: bool DEFAULT: False

imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, weights, mode='reflect', cval=0.0, origin=0, flatten: bool = False,
             imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.convolve = lambda x: convolve(x, weights=weights, mode=mode, cval=cval, origin=origin)
    self.flatten = flatten
    self._repr_string = autorepr(self, inspect.currentframe())

SkImageCannyEdgeDetector(sigma=1.0, low_threshold=None, high_threshold=None, mask=None, use_quantiles=False, mode='constant', cval=0.0, flatten=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which implements the Canny Edge Detector using the SkImage library

Parameters are the same ones you would pass to the canny function in SkImage together with some framework specific parameters
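For orientation, the following sketch runs the canny function from scikit-image directly on a synthetic grayscale image. It is not the framework's own pipeline; the image and the choice to pass only sigma are assumptions made to keep the example short.

```python
import numpy as np
from skimage.feature import canny

# Synthetic grayscale image: a bright square on a dark background.
image = np.zeros((64, 64), dtype=float)
image[16:48, 16:48] = 1.0

# canny expects a 2-D array and returns a boolean edge map of the same shape.
# Thresholds are left at their defaults (None), as in the signature above.
edges = canny(image, sigma=1.0)

# A flattened variant of the edge map, as a 1-D vector.
flat_edges = edges.flatten()
```

Edge pixels are marked True along the border of the square, while flat regions such as its center stay False.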

Arguments for SkImage Canny

PARAMETER DESCRIPTION
flatten

whether the output of the technique should be flattened or not

TYPE: bool DEFAULT: False

imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, sigma=1.0, low_threshold=None, high_threshold=None, mask=None, use_quantiles=False,
             mode='constant', cval=0.0, flatten: bool = False,
             imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.canny = lambda x: canny(image=x, sigma=sigma, low_threshold=low_threshold,
                                 high_threshold=high_threshold, mask=mask,
                                 use_quantiles=use_quantiles, mode=mode, cval=cval)
    self.flatten = flatten
    self._repr_string = autorepr(self, inspect.currentframe())

SkImageHogDescriptor(orientations=9, pixels_per_cell=(8, 8), cells_per_block=(3, 3), block_norm='L2-Hys', transform_sqrt=False, flatten=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which implements the HOG (Histogram of Oriented Gradients) descriptor using the SkImage library

Parameters are the same ones you would pass to the hog function in SkImage together with some framework specific parameters
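The effect of these parameters on the output shape can be seen by calling the hog function from scikit-image directly. In the source of this class, flatten is passed to hog as feature_vector, so the sketch shows both settings. The random grayscale image is an illustrative assumption, and channel_axis is left at its default because the input has no channel dimension.

```python
import numpy as np
from skimage.feature import hog

# Synthetic 64x64 grayscale image (illustrative data).
rng = np.random.default_rng(0)
image = rng.random((64, 64))

params = dict(orientations=9, pixels_per_cell=(8, 8), cells_per_block=(3, 3),
              block_norm="L2-Hys", transform_sqrt=False)

# feature_vector=False keeps the block structure:
# (blocks per column, blocks per row, cells per block row, cells per block col, orientations)
blocks = hog(image, feature_vector=False, **params)

# feature_vector=True returns the same values as one flat vector.
vector = hog(image, feature_vector=True, **params)

print(blocks.shape, vector.shape)
```

With 64x64 pixels and 8x8 cells there are 8 cells per side, and sliding a 3x3 block over them gives 6 block positions per side, hence 6 * 6 * 3 * 3 * 9 = 2916 values.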

Arguments for SkImage Hog

PARAMETER DESCRIPTION
flatten

whether the output of the technique should be flattened or not

TYPE: bool DEFAULT: False

imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(3, 3),
             block_norm='L2-Hys', transform_sqrt=False, flatten: bool = False,
             imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.hog = lambda x, channel_axis: hog(x, orientations=orientations, pixels_per_cell=pixels_per_cell,
                                           cells_per_block=cells_per_block, block_norm=block_norm,
                                           transform_sqrt=transform_sqrt, feature_vector=flatten,
                                           channel_axis=channel_axis)

    self._repr_string = autorepr(self, inspect.currentframe())

SkImageLBP(p, r, method='default', flatten=False, as_image=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which allows for LBP feature detection from SkImage

Parameters are the same ones you would pass to the local_binary_pattern function in SkImage together with some framework specific parameters

This technique also takes one additional parameter, 'as_image'
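The two output forms selected by 'as_image' can be sketched by calling local_binary_pattern from scikit-image directly. This is an illustration only: the image is synthetic, and counting patterns with np.unique is an assumption about how the occurrence counts could be obtained, not necessarily what the framework does.

```python
import numpy as np
from skimage.feature import local_binary_pattern

# Synthetic 32x32 grayscale image with 8-bit integer values (illustrative data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# P is the number of neighbors sampled on a circle of radius R around each pixel.
p, r = 8, 1.0
lbp_image = local_binary_pattern(image, P=p, R=r, method="default")

# Image form: one pattern code per pixel, same shape as the input.
print(lbp_image.shape)

# Count form: how often each distinct pattern code occurs in the image.
codes, counts = np.unique(lbp_image, return_counts=True)
```

With method="default" and P=8, the codes lie between 0 and 255, and the counts always sum to the number of pixels.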

Arguments for SkImage lbp

PARAMETER DESCRIPTION
as_image

if True, the LBP image obtained from SkImage is returned; otherwise the number of occurrences of each binary pattern is returned, as a feature vector

TYPE: bool DEFAULT: False

imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, p: int, r: float, method='default', flatten: bool = False, as_image: bool = False,
             imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.lbp = lambda x: local_binary_pattern(x, P=p, R=r, method=method)
    self.flatten = flatten
    self.as_image = as_image
    self._repr_string = autorepr(self, inspect.currentframe())

SkImageSIFT(upsampling=2, n_octaves=8, n_scales=3, sigma_min=1.6, sigma_in=0.5, c_dog=0.013333333333333334, c_edge=10, n_bins=36, lambda_ori=1.5, c_max=0.8, lambda_descr=6, n_hist=4, n_ori=8, flatten=False, imgs_dirs='imgs_dirs', max_timeout=2, max_retries=5, max_workers=0, batch_size=64, resize_size=(227, 227))

Bases: LowLevelVisual

Low level technique which allows for SIFT feature detection from SkImage

Parameters are the same ones you would pass to the SIFT object in SkImage together with some framework specific parameters

Arguments for SkImage SIFT

PARAMETER DESCRIPTION
imgs_dirs

directory where the images are stored (or will be stored in the case of fields containing links)

TYPE: str DEFAULT: 'imgs_dirs'

max_timeout

maximum time to wait before considering a request failed (image from link)

TYPE: int DEFAULT: 2

max_retries

maximum number of retries to retrieve an image from a link

TYPE: int DEFAULT: 5

max_workers

maximum number of workers for parallelism

TYPE: int DEFAULT: 0

batch_size

batch size for the images dataloader

TYPE: int DEFAULT: 64

resize_size

size to which every image is resized, since the TensorFlow dataset requires all images to have the same size. If the preprocessing pipeline also includes a resize transformer, the size specified there is the final one

TYPE: Tuple[int, int] DEFAULT: (227, 227)

Source code in clayrs/content_analyzer/field_content_production_techniques/visual_techniques/low_level_techniques.py
def __init__(self, upsampling=2, n_octaves=8, n_scales=3, sigma_min=1.6, sigma_in=0.5, c_dog=0.013333333333333334,
             c_edge=10, n_bins=36, lambda_ori=1.5, c_max=0.8, lambda_descr=6, n_hist=4, n_ori=8,
             flatten: bool = False, imgs_dirs: str = "imgs_dirs", max_timeout: int = 2, max_retries: int = 5,
             max_workers: int = 0, batch_size: int = 64, resize_size: Tuple[int, int] = (227, 227)):

    super().__init__(imgs_dirs, max_timeout, max_retries, max_workers, batch_size, resize_size)
    self.sift = SIFT(upsampling=upsampling, n_octaves=n_octaves, n_scales=n_scales, sigma_min=sigma_min,
                     sigma_in=sigma_in, c_dog=c_dog, c_edge=c_edge, n_bins=n_bins, lambda_ori=lambda_ori,
                     c_max=c_max, lambda_descr=lambda_descr, n_hist=n_hist, n_ori=n_ori)
    self.flatten = flatten
    self._repr_string = autorepr(self, inspect.currentframe())