netneurotools.modularity.match_assignments
netneurotools.modularity.match_assignments(assignments, target=None, seed=None)
Re-label clusters in columns of assignments to best match target.
Uses match_cluster_labels() to align cluster assignments.
- Parameters:
assignments ((N, M) array_like) – Array of M clustering assignments for N subjects
target ((N,) array_like, optional) – Target clustering assignments to which all columns should be matched. If provided as an integer, the corresponding column of assignments is used as the target. If not specified, a (semi-)random column of assignments is chosen: because matching a k-cluster solution to a (k+1)-cluster solution can introduce discontinuities, the “random” target is always drawn from the columns with the fewest clusters. See Examples for more information. Default: None
seed ({int, np.random.RandomState instance, None}, optional) – Seed for random number generation; only used if target is not provided. Default: None
- Returns:
assignments – Provided array with re-labeled cluster solutions to better match across M assignments
- Return type:
(N, M) numpy.ndarray
Examples
>>> import numpy as np
>>> from netneurotools import modularity
First we can construct a matrix of N samples clustered M times (in this case, M is three). Since cluster labels are generally arbitrary, we can see that, while the same clusters were found each time, they were given different labels:
>>> assignments = np.array([[0, 0, 1],
...                         [0, 0, 1],
...                         [0, 0, 1],
...                         [1, 2, 0],
...                         [1, 2, 0],
...                         [1, 2, 0],
...                         [2, 1, 2],
...                         [2, 1, 2]])
We would like to match the assignments so they’re all the same. Since one of the columns will be randomly picked as the “target” solution, we provide a seed to ensure reproducibility in the selection:
>>> modularity.match_assignments(assignments, seed=1234)
array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [2, 2, 2],
       [2, 2, 2]])
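The relabeling idea itself can be illustrated without the library. The sketch below is not how match_cluster_labels() is implemented; it is a brute-force stand-in that tries every permutation of the labels and keeps the one that agrees with the target at the most samples, which is only practical for a handful of clusters. The helper name relabel_to_target is hypothetical, and the sketch assumes each column has the same number of clusters as the target.

```python
from itertools import permutations

import numpy as np


def relabel_to_target(source, target):
    """Relabel `source` to agree with `target` at as many samples as possible.

    Hypothetical helper for illustration only: it searches all label
    permutations by brute force and assumes equal numbers of clusters.
    """
    source = np.asarray(source)
    target = np.asarray(target)
    src_labels = np.unique(source)
    tgt_labels = np.unique(target)
    if len(src_labels) != len(tgt_labels):
        raise ValueError("this sketch assumes equal numbers of clusters")

    best_agreement, best = -1, source.copy()
    for perm in permutations(tgt_labels):
        # Map the i-th source label onto the i-th label of this permutation.
        lookup = dict(zip(src_labels.tolist(), perm))
        candidate = np.array([lookup[label] for label in source.tolist()])
        # Count the samples where the relabeled source matches the target.
        agreement = int(np.sum(candidate == target))
        if agreement > best_agreement:
            best_agreement, best = agreement, candidate
    return best


assignments = np.array([[0, 0, 1],
                        [0, 0, 1],
                        [0, 0, 1],
                        [1, 2, 0],
                        [1, 2, 0],
                        [1, 2, 0],
                        [2, 1, 2],
                        [2, 1, 2]])

# Use the first column as the target and relabel every column against it.
target = assignments[:, 0]
matched = np.column_stack(
    [relabel_to_target(column, target) for column in assignments.T]
)
print(matched)  # every column now equals `target`
```

Here the first column is used as the target, so the resulting labels differ from the seeded output above, but every column still ends up identical to the others.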
Alternatively, if assignments has clustering solutions with different numbers of clusters and no target is specified, the chosen target will be one of the columns with the smallest number of clusters:
>>> assignments = np.array([[0, 0, 1],
...                         [0, 0, 1],
...                         [0, 0, 1],
...                         [1, 2, 0],
...                         [1, 2, 0],
...                         [1, 2, 0],
...                         [1, 1, 2],
...                         [1, 1, 2]])
>>> modularity.match_assignments(assignments)
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [1, 2, 2],
       [1, 2, 2]])
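The target-selection rule described here can be sketched as follows. This is an illustrative reading of the documented behavior, not the library's own code: the helper name pick_target_column is hypothetical, and the sketch assumes seed is an int or None.

```python
import numpy as np


def pick_target_column(assignments, seed=None):
    """Pick a random column among those with the fewest clusters.

    Hypothetical helper for illustration; `seed` is assumed to be an
    int or None.
    """
    assignments = np.asarray(assignments)
    # Number of distinct labels found in each column (clustering solution).
    n_clusters = np.array([np.unique(col).size for col in assignments.T])
    # Only columns tied for the smallest cluster count are eligible.
    candidates = np.flatnonzero(n_clusters == n_clusters.min())
    rng = np.random.RandomState(seed)
    return int(rng.choice(candidates))


assignments = np.array([[0, 0, 1],
                        [0, 0, 1],
                        [0, 0, 1],
                        [1, 2, 0],
                        [1, 2, 0],
                        [1, 2, 0],
                        [1, 1, 2],
                        [1, 1, 2]])

# Column 0 has two clusters and the others have three, so it is the
# only eligible target regardless of the seed.
print(pick_target_column(assignments))  # 0
```

This agrees with the output above, where the first column is left unchanged and the other two are relabeled to match it.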