Skip to content

Bipartite Graph

NXBipartiteGraph(source_frame=None, link_label=None)

Bases: BipartiteDiGraph

Class that implements a Bipartite graph through networkx library.

Info

A Bipartite Graph is a graph which supports only User nodes and Item nodes. If you need to model also other node categories, consider using a Tripartite Graph or a Full Graph

It creates a graph from an initial Rating object.

Consider the following matrix representation of the Rating object

    +------+-----------+-------+
    | User |   Item    | Score |
    +------+-----------+-------+
    | u1   | Tenet     |     4 |
    | u2   | Inception |     5 |
    | ...  | ...       |   ... |
    +------+-----------+-------+

The graph will be created with the following interactions:

             4
        u1 -----> Tenet
             5
        u2 -----> Inception

where u1 and u2 become User nodes and Tenet and Inception become Item nodes, with the edge weighted depending on the score given

If the link_label parameter is specified, then each link between users and items will be labeled with the label specified (e.g. link_label='score'):

        (4, 'score')
    u1 -------------> Tenet
        (5, 'score')
    u2 -------------> Inception
PARAMETER DESCRIPTION
source_frame

the initial Ratings object needed to create the graph

TYPE: Ratings DEFAULT: None

link_label

If specified, each link will be labeled with the given label. Default is None

TYPE: str DEFAULT: None

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
def __init__(self, source_frame: Ratings = None, link_label: str = None):

    self._graph = nx.DiGraph()

    if source_frame is not None:
        not_none_dict = {}
        if link_label is not None:
            not_none_dict['label'] = link_label

        user_column = source_frame.user_id_column
        item_column = source_frame.item_id_column
        score_column = source_frame.score_column
        timestamp_column = source_frame.timestamp_column

        if len(timestamp_column) != 0:
            frame_iterator = zip(user_column, item_column, score_column, timestamp_column)
        else:
            frame_iterator = zip(user_column, item_column, score_column)

        with get_progbar(frame_iterator, total=len(source_frame)) as progbar:
            progbar.set_description("Creating User->Item links")

            if len(timestamp_column) != 0:
                edges_with_attributes_gen = ((UserNode(interaction[0]), ItemNode(interaction[1]),

                                              # {**x, **y} merges the dicts x and y
                                              {**not_none_dict, **{'weight': interaction[2],
                                                                   'timestamp': interaction[3]}}
                                              )
                                             for interaction in progbar)
            else:
                edges_with_attributes_gen = ((UserNode(interaction[0]), ItemNode(interaction[1]),

                                              # {**x, **y} merges the dicts x and y
                                              {**not_none_dict, **{'weight': interaction[2]}})
                                             for interaction in progbar)

            self._graph.add_edges_from(edges_with_attributes_gen)

item_nodes: Set[ItemNode] property

Returns a set of all Item nodes in the graph

user_nodes: Set[UserNode] property

Returns a set of all User nodes in the graph

Creates a link connecting the start_node to the final_node. If two lists are passed, then the node in position \(i\) in the start_node list will be linked to the node in position \(i\) in the final_node list.

If nodes to link do not exist, they will be added automatically to the graph. Please remember that since this is a Bipartite Graph, only User nodes and Item nodes can be added!

A link can be weighted with the weight parameter and labeled with the label parameter. A timestamp can also be specified via timestamp parameter. All three are optional parameters, so they are not required

PARAMETER DESCRIPTION
start_node

Single Node object or a list of Node objects. They will be the 'head' of the link, since it's a directed graph

TYPE: Union[Node, List[Node]]

final_node

Single Node object or a list Node objects. They will be the 'tail' of the link, since it's a directed graph

TYPE: object

weight

weight of the link, default is None (no weight)

TYPE: float DEFAULT: None

label

label of the link, default is None (no label)

TYPE: str DEFAULT: None

timestamp

timestamp of the link, default is None (no timestamp)

TYPE: str DEFAULT: None

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
def add_link(self, start_node: Union[Node, List[Node]], final_node: Union[Node, List[Node]],
             weight: float = None, label: str = None, timestamp: str = None):
    """
    Creates a link connecting the `start_node` to the `final_node`. If two lists are passed, then the node in
    position $i$ in the `start_node` list will be linked to the node in position $i$ in the `final_node` list.

    If nodes to link do not exist, they will be added automatically to the graph. Please remember that since this is
    a Bipartite Graph, only *User nodes* and *Item nodes* can be added!

    A link can be weighted with the `weight` parameter and labeled with the `label` parameter.
    A timestamp can also be specified via `timestamp` parameter.
    All three are optional parameters, so they are not required

    Args:
        start_node: Single Node object or a list of Node objects. They will be the 'head' of the link, since it's a
            directed graph
        final_node (object): Single Node object or a list Node objects. They will be the 'tail' of the link,
            since it's a directed graph
        weight: weight of the link, default is None (no weight)
        label: label of the link, default is None (no label)
        timestamp: timestamp of the link, default is None (no timestamp)
    """
    if not isinstance(start_node, list):
        start_node = [start_node]

    if not isinstance(final_node, list):
        final_node = [final_node]

    self.add_node(start_node)
    self.add_node(final_node)

    not_none_dict = {}
    if label is not None:
        not_none_dict['label'] = label
    if weight is not None:
        not_none_dict['weight'] = weight
    if timestamp is not None:
        not_none_dict['timestamp'] = timestamp

    self._graph.add_edges_from(zip(start_node, final_node),
                               **not_none_dict)

add_node(node)

Adds one or multiple Node objects to the graph. Since this is a Bipartite Graph, only User Node and Item Node can be added!

No duplicates are allowed, but different category nodes with same id are (e.g. ItemNode('1') and UserNode('1'))

PARAMETER DESCRIPTION
node

Node(s) object(s) that needs to be added to the graph

TYPE: Union[Node, List[Node]]

RAISES DESCRIPTION
ValueError

Exception raised when one of the node to add to the graph is not a User or Item node

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
def add_node(self, node: Union[Node, List[Node]]):
    """
    Adds one or multiple Node objects to the graph.
    Since this is a Bipartite Graph, only `User Node` and `Item Node` can be added!

    No duplicates are allowed, but different category nodes with same id are (e.g. `ItemNode('1')` and
    `UserNode('1')`)

    Args:
        node: Node(s) object(s) that needs to be added to the graph

    Raises:
        ValueError: Exception raised when one of the node to add to the graph is not a User or Item node
    """
    if not isinstance(node, list):
        node = [node]

    if any(not isinstance(n, (UserNode, ItemNode)) for n in node):
        raise ValueError("You can only add UserNodes or ItemNodes to a bipartite graph!")

    self._graph.add_nodes_from(node)

closeness_centrality()

Calculate the closeness centrality for every node in the graph

RETURNS DESCRIPTION
Dict

Dictionary containing the closeness centrality for each node in the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
304
305
306
307
308
309
310
311
def closeness_centrality(self) -> Dict:
    """
    Calculate the closeness centrality for every node in the graph

    Returns:
        Dictionary containing the closeness centrality for each node in the graph
    """
    return nx.closeness_centrality(self._graph)

degree_centrality()

Calculate the degree centrality for every node in the graph

RETURNS DESCRIPTION
Dict

Dictionary containing the degree centrality for each node in the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
295
296
297
298
299
300
301
302
def degree_centrality(self) -> Dict:
    """
    Calculate the degree centrality for every node in the graph

    Returns:
        Dictionary containing the degree centrality for each node in the graph
    """
    return nx.degree_centrality(self._graph)

dispersion()

Calculate the dispersion for every node in the graph

RETURNS DESCRIPTION
Dict

Dictionary containing the dispersion computed for each node in the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
313
314
315
316
317
318
319
320
def dispersion(self) -> Dict:
    """
    Calculate the dispersion for every node in the graph

    Returns:
        Dictionary containing the dispersion computed for each node in the graph
    """
    return nx.dispersion(self._graph)

Get link data such as weight, label, timestamp. between the start_node and the final_node. Returns None if said link doesn't exists

Remember that this is a directed graph so the result differs if 'start_node' and 'final_node' are switched.

PARAMETER DESCRIPTION
start_node

Node object from where the link starts

TYPE: Node

final_node

Node object to where the link ends

TYPE: Node

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
197
198
199
200
201
202
203
204
205
206
207
208
209
def get_link_data(self, start_node: Node, final_node: Node):
    """
    Get link data such as weight, label, timestamp. between the `start_node` and the `final_node`.
    Returns None if said link doesn't exists

    Remember that this is a directed graph so the result differs if 'start_node' and 'final_node'
    are switched.

    Args:
        start_node: Node object from where the link starts
        final_node: Node object to where the link ends
    """
    return self._graph.get_edge_data(start_node, final_node)

get_predecessors(node)

Returns a list containing the predecessors of the node passed. Raises TypeError exception if the node doesn't exists in the graph.

Taken from networkx library:

A predecessor of n is a node m such that there exists a directed edge from m to n

For example:

# GRAPH:

I1 <-- U1
↑
U2

>>> graph.get_predecessors(ItemNode('I1'))
[User U1, User U2]
PARAMETER DESCRIPTION
node

Node for which we want to know the predecessors

TYPE: Node

RAISES DESCRIPTION
TypeError

Exception raised when the node it's not in the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
def get_predecessors(self, node: Node) -> List[Node]:
    """
    Returns a list containing the *predecessors* of the node passed.
    Raises TypeError exception if the node doesn't exists in the graph.

    Taken from networkx library:

    > A predecessor of n is a node m such that there exists a directed
    edge from m to n

    For example:
    ```
    # GRAPH:

    I1 <-- U1

    U2
    ```

    ```python
    >>> graph.get_predecessors(ItemNode('I1'))
    [User U1, User U2]
    ```

    Args:
        node: Node for which we want to know the predecessors

    Raises:
        TypeError: Exception raised when the node it's not in the graph
    """
    try:
        return list(self._graph.predecessors(node))
    except nx.NetworkXError:
        raise TypeError("The node specified is not in the graph!")

get_successors(node)

Returns a list containing the successors of the node passed. Returns None if the node doesn't exists in the graph.

Taken from networkx library:

A successor of n is a node m such that there exists a directed edge from n to m

For example:

U1 --> I2
↓
I1

>>> graph.get_successors(UserNode('U1'))
[Item I1, Item I2]
PARAMETER DESCRIPTION
node

Node for which we want to know the successors

TYPE: Node

RAISES DESCRIPTION
TypeError

Exception raised when the node it's not in the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
def get_successors(self, node: Node) -> List[Node]:
    """
    Returns a list containing the successors of the node passed.
    Returns None if the node doesn't exists in the graph.

    Taken from networkx library:
    > A successor of n is a node m such that there exists a directed
    edge from n to m

    For example:
    ```
    U1 --> I2

    I1
    ```

    ```python

    >>> graph.get_successors(UserNode('U1'))
    [Item I1, Item I2]
    ```

    Args:
        node: Node for which we want to know the successors

    Raises:
        TypeError: Exception raised when the node it's not in the graph
    """
    try:
        return list(self._graph.successors(node))
    except nx.NetworkXError:
        raise TypeError("The node specified is not in the graph!")

node_exists(node)

Returns True if the node passed exists in the graph, False otherwise

PARAMETER DESCRIPTION
node

Node to check whether it's present in the graph or not

TYPE: Node

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
279
280
281
282
283
284
285
286
287
def node_exists(self, node: Node) -> bool:
    """
    Returns True if the node passed exists in the graph, False otherwise

    Args:
        node: Node to check whether it's present in the graph or not
    """
    r = self._graph.nodes.get(node)
    return r is not None

Removes the link connecting the start_node to the final_node. If there's no link between the two nodes, then a warning is printed

PARAMETER DESCRIPTION
start_node

head node of the link to remove

TYPE: Node

final_node

tail node of the link to remove

TYPE: Node

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
182
183
184
185
186
187
188
189
190
191
192
193
194
195
def remove_link(self, start_node: Node, final_node: Node):
    """
    Removes the link connecting the `start_node` to the `final_node`.
    If there's no link between the two nodes, then a warning is printed

    Args:
        start_node: *head* node of the link to remove
        final_node: *tail* node of the link to remove
    """
    try:
        self._graph.remove_edge(start_node, final_node)
    except nx.NetworkXError:
        logger.warning("No link exists between the start node and the final node!\n"
                       "No link will be removed")

remove_node(node_to_remove)

Removes one or multiple nodes from the graph. If one of the nodes to remove is not present in the graph, it is silently ignored

PARAMETER DESCRIPTION
node_to_remove

Single Node object or a list of Node objects to remove from the graph

TYPE: Union[Node, List[Node]]

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
322
323
324
325
326
327
328
329
330
331
332
333
def remove_node(self, node_to_remove: Union[Node, List[Node]]):
    """
    Removes one or multiple nodes from the graph.
    If one of the nodes to remove is not present in the graph, it is silently ignored

    Args:
        node_to_remove: Single Node object or a list of Node objects to remove from the graph
    """
    if not isinstance(node_to_remove, list):
        node_to_remove = [node_to_remove]

    self._graph.remove_nodes_from(node_to_remove)

to_networkx()

Returns underlying networkx implementation of the graph

Source code in clayrs/recsys/graphs/nx_implementation/nx_bipartite_graphs.py
289
290
291
292
293
def to_networkx(self) -> nx.DiGraph:
    """
    Returns underlying networkx implementation of the graph
    """
    return self._graph