Groups (yFiles-for-JavaFX-Complete-3.0 API)

java.lang.Object
- com.yworks.yfiles.algorithms.Groups

```
public final class Groups
extends Object
```
This class provides methods for automatically partitioning nodes of a graph into groups.
Partitions can be defined using edge betweenness centrality, biconnectivity, k-means clustering or hierarchical clustering.

Definitions
- Betweenness centrality measures how often a node lies on a shortest path between pairs of nodes in the graph.
- A biconnected graph is a graph that has no cut vertex (articulation point), i.e., no node whose removal disconnects the graph.
- The k-means clustering algorithm partitions the nodes of a graph into k clusters based on their positions in the plane and a given distance metric.
- Hierarchical clustering creates a hierarchy of clusters in a bottom-up approach based on some distance metric and linkage.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`Groups.Dendrogram` This class provides the result of hierarchical clustering algorithms by means of a binary tree structure.
`static interface`	`Groups.INodeDistanceProvider` An interface that determines the distance between two nodes of a graph.

Method Summary

Modifier and Type	Method and Description
`static int`	`biconnectedComponentGrouping(Graph graph, INodeMap groupIDs)` This method partitions the graph by analyzing its biconnected components.
`static int`	`edgeBetweennessClustering(Graph graph, INodeMap clusterIDs, boolean directed, int minGroupCount, int maxGroupCount, IDataProvider edgeCosts)` Partitions the graph into groups using edge betweenness centrality.
`static int`	`edgeBetweennessClustering(Graph graph, INodeMap clusterIDs, double qualityTimeRatio, int minGroupCount, int maxGroupCount, boolean refine)` Partitions the graph into groups using edge betweenness clustering proposed by Girvan and Newman.
`static Groups.Dendrogram`	`hierarchicalClustering(Graph graph, Groups.INodeDistanceProvider distances, Linkage linkage)` Partitions the graph into clusters based on hierarchical clustering.
`static int`	`hierarchicalClustering(Graph graph, INodeMap clusterIDs, Groups.INodeDistanceProvider distances, Linkage linkage, double cutOff)` Partitions the graph into clusters based on hierarchical clustering, while the dendrogram is cut based on a given cut-off value.
`static int`	`hierarchicalClustering(Graph graph, int maxCluster, INodeMap clusterIDs, Groups.INodeDistanceProvider distances, Linkage linkage)` Partitions the graph into clusters based on hierarchical clustering, while the dendrogram is cut based on a given maximum number of clusters.
`static int`	`kMeansClustering(Graph graph, INodeMap clusterIDs, IDataProvider nodePositions, DistanceMetric distanceMetric, int k)` Partitions the graph into clusters using k-means clustering algorithm.
`static int`	`kMeansClustering(Graph graph, INodeMap clusterIDs, IDataProvider nodePositions, DistanceMetric distanceMetric, int k, int iterations)` Partitions the graph into clusters using k-means clustering algorithm.
`static int`	`kMeansClustering(Graph graph, INodeMap clusterIDs, IDataProvider nodePositions, DistanceMetric distanceMetric, int k, int iterations, YPoint[] centroids)` Partitions the graph into clusters using k-means clustering algorithm.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - biconnectedComponentGrouping
```
public static final int biconnectedComponentGrouping(Graph graph,
                                                     INodeMap groupIDs)
```
    This method partitions the graph by analyzing its biconnected components.
    Nodes will be grouped such that the nodes within each group are biconnected. Nodes that belong to multiple biconnected components will be assigned to exactly one of these components.
    
    Biconnected components are defined for undirected graphs only. As a consequence, this algorithm ignores self-loops. Isolated nodes, that is, nodes with only self-loop edges or no edges at all, are not assigned to any group; the group ID for such a node will be null.
    
    Complexity:
    O(graph.E() + graph.N())
    
    Parameters:
    
    graph - the input graph
    
    groupIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    Returns:
    
    the resulting number of different groups
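
    The grouping rule described above can be illustrated without the library. The following is a minimal sketch of a depth-first biconnected-component search on a plain edge list; it is not the yFiles implementation. The class name BiconnectedSketch, the int-array graph representation, and the use of -1 in place of a null group ID are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only; names and data layout are hypothetical. */
final class BiconnectedSketch {
    private final List<List<int[]>> adj = new ArrayList<>(); // entries: {neighbor, edgeId}
    private final Deque<int[]> edgeStack = new ArrayDeque<>();
    private int[] disc, low, groupId;
    private int time, groups;

    /** Returns one group ID per node, or -1 for nodes without non-self-loop edges. */
    static int[] group(int n, int[][] edges) {
        BiconnectedSketch s = new BiconnectedSketch();
        for (int i = 0; i < n; i++) s.adj.add(new ArrayList<>());
        for (int id = 0; id < edges.length; id++) {
            int u = edges[id][0], v = edges[id][1];
            if (u == v) continue; // self-loops are ignored
            s.adj.get(u).add(new int[] {v, id});
            s.adj.get(v).add(new int[] {u, id});
        }
        s.disc = new int[n];
        s.low = new int[n];
        s.groupId = new int[n];
        Arrays.fill(s.groupId, -1);
        for (int v = 0; v < n; v++) if (s.disc[v] == 0) s.dfs(v, -1);
        return s.groupId;
    }

    private void dfs(int u, int parentEdge) {
        disc[u] = low[u] = ++time;
        for (int[] e : adj.get(u)) {
            int v = e[0];
            if (e[1] == parentEdge) continue; // do not walk back over the tree edge
            if (disc[v] == 0) {
                edgeStack.push(new int[] {u, v});
                dfs(v, e[1]);
                low[u] = Math.min(low[u], low[v]);
                if (low[v] >= disc[u]) { // u separates v's subtree: close one component
                    int[] top;
                    do {
                        top = edgeStack.pop();
                        // A node shared by several components keeps its first assignment
                        if (groupId[top[0]] < 0) groupId[top[0]] = groups;
                        if (groupId[top[1]] < 0) groupId[top[1]] = groups;
                    } while (top[0] != u || top[1] != v);
                    groups++;
                }
            } else if (disc[v] < disc[u]) { // back edge to an ancestor
                edgeStack.push(new int[] {u, v});
                low[u] = Math.min(low[u], disc[v]);
            }
        }
    }
}
```

    For a triangle 0-1-2 with an extra edge 2-3, the cut vertex 2 ends up in exactly one of the two components, and a node that has only a self-loop keeps the value -1.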
  - edgeBetweennessClustering
```
public static final int edgeBetweennessClustering(Graph graph,
                                                  INodeMap clusterIDs,
                                                  boolean directed,
                                                  int minGroupCount,
                                                  int maxGroupCount,
                                                  IDataProvider edgeCosts)
```
    Partitions the graph into groups using edge betweenness centrality.
    In each iteration, the edge with the highest betweenness centrality is removed from the graph. The method stops if there are no more edges to remove. The clustering with the best quality reached during the process will be returned.
    
    The method requires the maximum number of groups that will be returned. The smaller this value is, the shorter the overall computation time. The upper bound on the number of groups is graph.N(). Also, the number of returned groups is never smaller than the number of connected components of the graph.
    
    Throws:
    
    IllegalArgumentException - if minGroupCount > maxGroupCount or minGroupCount > graph.N() or maxGroupCount <= 0
    
    Preconditions:
    minGroupCount <= maxGroupCount
    minGroupCount <= graph.N()
    maxGroupCount > 0
    
    Complexity:
    O(graph.E()) * O(edgeBetweenness)
    In practice, it is faster because edge betweenness is computed for subgraphs during the process and this algorithm terminates after maxGroupCount groups have been determined.
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    directed - true if the graph should be considered as directed, false otherwise
    
    minGroupCount - the minimum number of groups that will be returned
    
    maxGroupCount - the maximum number of groups that will be returned
    
    edgeCosts - the IDataProvider that holds a positive Double cost or null if the edges of the graph are considered to be of equal cost
    
    Returns:
    
    the resulting number of different groups
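
    The removal loop described above can be sketched as follows. This is a simplified illustration on an unweighted, undirected edge list, not the yFiles implementation: it computes edge betweenness with Brandes-style accumulation, removes the highest-scoring edge, and stops as soon as a requested number of groups exists instead of tracking the best-quality clustering. The class name EdgeBetweennessSketch and the parameter targetGroups are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only; assumes an undirected graph without self-loops. */
final class EdgeBetweennessSketch {
    /** Removes highest-betweenness edges until at least targetGroups components exist. */
    static int[] cluster(int n, int[][] edges, int targetGroups) {
        boolean[] removed = new boolean[edges.length];
        int[] ids = components(n, edges, removed);
        while (count(ids) < targetGroups) {
            double[] score = betweenness(n, edges, removed);
            int best = -1;
            for (int e = 0; e < edges.length; e++)
                if (!removed[e] && (best < 0 || score[e] > score[best])) best = e;
            if (best < 0) break; // no more edges to remove
            removed[best] = true;
            ids = components(n, edges, removed);
        }
        return ids;
    }

    static int count(int[] ids) {
        return Arrays.stream(ids).max().orElse(-1) + 1;
    }

    /** Edge betweenness for unit edge costs: one BFS per source plus back-propagation. */
    static double[] betweenness(int n, int[][] edges, boolean[] removed) {
        List<List<int[]>> adj = adjacency(n, edges, removed);
        double[] score = new double[edges.length];
        for (int s = 0; s < n; s++) {
            int[] dist = new int[n];
            Arrays.fill(dist, -1);
            double[] sigma = new double[n], delta = new double[n];
            int[] order = new int[n]; // doubles as the BFS queue
            int head = 0, tail = 0;
            dist[s] = 0;
            sigma[s] = 1;
            order[tail++] = s;
            while (head < tail) {
                int v = order[head++];
                for (int[] e : adj.get(v)) {
                    int w = e[0];
                    if (dist[w] < 0) { dist[w] = dist[v] + 1; order[tail++] = w; }
                    if (dist[w] == dist[v] + 1) sigma[w] += sigma[v];
                }
            }
            for (int i = tail - 1; i >= 0; i--) {
                int w = order[i];
                for (int[] e : adj.get(w)) {
                    int v = e[0];
                    if (dist[v] == dist[w] - 1) { // v precedes w on a shortest path
                        double c = sigma[v] / sigma[w] * (1 + delta[w]);
                        score[e[1]] += c;
                        delta[v] += c;
                    }
                }
            }
        }
        return score;
    }

    /** Labels connected components of the graph without the removed edges. */
    static int[] components(int n, int[][] edges, boolean[] removed) {
        List<List<int[]>> adj = adjacency(n, edges, removed);
        int[] ids = new int[n];
        Arrays.fill(ids, -1);
        int next = 0;
        Deque<Integer> stack = new ArrayDeque<>();
        for (int s = 0; s < n; s++) {
            if (ids[s] >= 0) continue;
            ids[s] = next;
            stack.push(s);
            while (!stack.isEmpty()) {
                int v = stack.pop();
                for (int[] e : adj.get(v))
                    if (ids[e[0]] < 0) { ids[e[0]] = next; stack.push(e[0]); }
            }
            next++;
        }
        return ids;
    }

    static List<List<int[]>> adjacency(int n, int[][] edges, boolean[] removed) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int e = 0; e < edges.length; e++) {
            if (removed[e]) continue;
            adj.get(edges[e][0]).add(new int[] {edges[e][1], e}); // entries: {neighbor, edgeId}
            adj.get(edges[e][1]).add(new int[] {edges[e][0], e});
        }
        return adj;
    }
}
```

    On two triangles joined by a single bridge, the bridge carries every shortest path between the triangles, so it is removed first and the two triangles become the two groups.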
  - edgeBetweennessClustering
```
public static final int edgeBetweennessClustering(Graph graph,
                                                  INodeMap clusterIDs,
                                                  double qualityTimeRatio,
                                                  int minGroupCount,
                                                  int maxGroupCount,
                                                  boolean refine)
```
    Partitions the graph into groups using edge betweenness clustering proposed by Girvan and Newman.
    In each iteration, the edge with the highest betweenness centrality is removed from the graph. The method stops if there are no more edges to remove or if the requested maximum number of groups is found. The clustering with the best quality reached during the process is returned.
    
    The algorithm includes several heuristic speed-up techniques available through the quality/time ratio. For the highest quality setting, it is used almost unmodified. The fast betweenness approximation of Brandes and Pich (Centrality Estimation in Large Networks) is employed for values around 0.5. Typically, this results in a tiny decrease in quality but a large speed-up and is the recommended setting. To achieve the lowest running time, a local betweenness calculation is used (Gregory: Local Betweenness for Finding Communities in Networks).
    
    The method requires the maximum number of groups that will be returned. The smaller this value is, the shorter the overall computation time. The upper bound on the number of groups is graph.N(). Also, the number of returned groups is never smaller than the number of connected components of the graph.
    
    Throws:
    
    IllegalArgumentException - if minGroupCount > maxGroupCount or minGroupCount > graph.N() or maxGroupCount <= 0
    
    Preconditions:
    minGroupCount <= maxGroupCount
    minGroupCount <= graph.N()
    maxGroupCount > 0
    
    Complexity:
    O(graph.E()) * O(edgeBetweenness)
    In practice, it is faster because edge betweenness is computed for subgraphs during the process and this algorithm terminates after maxGroupCount groups have been determined.
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    qualityTimeRatio - a value between 0.0 (low quality, fast) and 1.0 (high quality, slow); the recommended value is 0.5
    
    minGroupCount - the minimum number of groups that will be returned
    
    maxGroupCount - the maximum number of groups that will be returned
    
    refine - true if the algorithm refines the current grouping, false if the algorithm discards the current grouping
    
    Returns:
    
    the resulting number of different groups
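
    The preconditions listed above can be checked before calling either overload. The following is a small sketch of such a check; the class name GroupCountCheckSketch and the nodeCount parameter (standing in for graph.N()) are hypothetical, and the exception messages are invented for this example.

```java
/** Illustrative sketch only; mirrors the documented preconditions. */
final class GroupCountCheckSketch {
    /** Throws IllegalArgumentException if the documented preconditions are violated. */
    static void check(int nodeCount, int minGroupCount, int maxGroupCount) {
        if (minGroupCount > maxGroupCount)
            throw new IllegalArgumentException("minGroupCount > maxGroupCount");
        if (minGroupCount > nodeCount)
            throw new IllegalArgumentException("minGroupCount > number of nodes");
        if (maxGroupCount <= 0)
            throw new IllegalArgumentException("maxGroupCount <= 0");
    }
}
```

    Validating the arguments up front makes it easier to report which of the three conditions failed.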
  - hierarchicalClustering
```
public static final Groups.Dendrogram hierarchicalClustering(Graph graph,
                                                             Groups.INodeDistanceProvider distances,
                                                             Linkage linkage)
```
    Partitions the graph into clusters based on hierarchical clustering.
    The clustering is performed using the agglomerative strategy, i.e., a bottom-up approach in which each node initially belongs to its own cluster. At each step, pairs of clusters are merged while moving up the hierarchy. The dissimilarity between clusters is determined based on the given linkage and the given node distance metric. The algorithm continues until all nodes belong to the same cluster.
    
    The result is returned as a Groups.Dendrogram object which represents the result of the clustering algorithm as a binary tree structure. It can easily be traversed by starting from the root node and moving on to nodes of the next level via method Groups.Dendrogram.getChildren(Node).
    
    Throws:
    
    IllegalArgumentException - if an unknown linkage is given
    
    If the Groups.INodeDistanceProvider object returns a negative distance value for a pair of nodes, zero will be used instead.
    
    Complexity:
    O(graph.N() ^ 3)
    
    Parameters:
    
    graph - the input graph
    
    distances - a given Groups.INodeDistanceProvider object that determines the distance between any two nodes
    
    linkage - one of the predefined linkage values
    
    Returns:
    
    a Groups.Dendrogram which represents the result of the clustering as a binary tree
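
    The agglomerative strategy and the binary tree result can be illustrated with a small sketch. It is not the yFiles implementation and does not use Groups.Dendrogram; the class DendrogramSketch, its nested TreeNode type, the distance matrix input, and the fixed single linkage are assumptions made for this example. At least one item is assumed.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; builds a binary merge tree with single linkage. */
final class DendrogramSketch {
    /** Leaves hold one item; inner nodes hold the dissimilarity at which they were merged. */
    static final class TreeNode {
        final TreeNode left;
        final TreeNode right;
        final double dissimilarity;
        final List<Integer> members = new ArrayList<>();

        TreeNode(int item) {
            this.left = null;
            this.right = null;
            this.dissimilarity = 0;
            members.add(item);
        }

        TreeNode(TreeNode left, TreeNode right, double dissimilarity) {
            this.left = left;
            this.right = right;
            this.dissimilarity = dissimilarity;
            members.addAll(left.members);
            members.addAll(right.members);
        }
    }

    /** Merges the two closest clusters until a single root remains. */
    static TreeNode build(double[][] dist) {
        List<TreeNode> clusters = new ArrayList<>();
        for (int i = 0; i < dist.length; i++) clusters.add(new TreeNode(i));
        while (clusters.size() > 1) {
            int bestA = 0, bestB = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++)
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = singleLinkage(clusters.get(a), clusters.get(b), dist);
                    if (d < best) { best = d; bestA = a; bestB = b; }
                }
            TreeNode merged = new TreeNode(clusters.get(bestA), clusters.get(bestB), best);
            clusters.remove(bestB); // remove the higher index first
            clusters.remove(bestA);
            clusters.add(merged);
        }
        return clusters.get(0);
    }

    /** Smallest pairwise distance between the two clusters; negative values count as zero. */
    static double singleLinkage(TreeNode a, TreeNode b, double[][] dist) {
        double min = Double.POSITIVE_INFINITY;
        for (int i : a.members)
            for (int j : b.members) min = Math.min(min, Math.max(0, dist[i][j]));
        return min;
    }
}
```

    Scanning all cluster pairs and all member pairs in every step is what leads to the cubic running time mentioned above; the tree can then be traversed from the root through left and right.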
  - hierarchicalClustering
```
public static final int hierarchicalClustering(Graph graph,
                                               INodeMap clusterIDs,
                                               Groups.INodeDistanceProvider distances,
                                               Linkage linkage,
                                               double cutOff)
```
    Partitions the graph into clusters based on hierarchical clustering, while the dendrogram is cut based on a given cut-off value.
    The clustering is performed using the agglomerative strategy, i.e., a bottom-up approach in which each node initially belongs to its own cluster. At each step, pairs of clusters are merged while moving up the hierarchy. The dissimilarity between clusters is determined based on the given linkage and the given node distances. The algorithm continues until all nodes belong to the same cluster.
    
    The hierarchical tree is then cut at the given cut-off value such that the dissimilarity values of the nodes that remain in the dendrogram are less than this value. The resulting clusters form the result.
    
    Throws:
    
    IllegalArgumentException - if an unknown linkage is used
    
    The cut-off value should be greater than zero. If a negative value is given, zero will be used instead.
    
    Complexity:
    O(graph.N() ^ 3)
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    distances - a given Groups.INodeDistanceProvider object that determines the distance between any two nodes
    
    linkage - one of the predefined linkage values
    
    cutOff - the cut-off value that determines where to cut the hierarchic tree into clusters
    
    Returns:
    
    the resulting number of clusters
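
    The effect of the cut-off value can be sketched as follows, under the assumption of a linkage whose merge dissimilarities never decrease (complete linkage is used here). In that case, stopping the merge loop once the cheapest remaining merge reaches the cut-off gives the same clusters as cutting the finished tree. The class name CutOffClusteringSketch and the distance matrix input are hypothetical; this is not the yFiles implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; complete linkage with an early stop at the cut-off. */
final class CutOffClusteringSketch {
    /** Returns one cluster ID per item. */
    static int[] cluster(double[][] dist, double cutOff) {
        double limit = Math.max(0, cutOff); // negative cut-off values are treated as zero
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < dist.length; i++) {
            List<Integer> single = new ArrayList<>();
            single.add(i);
            clusters.add(single);
        }
        while (clusters.size() > 1) {
            int bestA = -1, bestB = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++)
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = completeLinkage(clusters.get(a), clusters.get(b), dist);
                    if (d < best) { best = d; bestA = a; bestB = b; }
                }
            if (best >= limit) break; // only merges below the cut-off are kept
            clusters.get(bestA).addAll(clusters.remove(bestB));
        }
        int[] ids = new int[dist.length];
        for (int c = 0; c < clusters.size(); c++)
            for (int item : clusters.get(c)) ids[item] = c;
        return ids;
    }

    /** Largest pairwise distance between the two clusters. */
    static double completeLinkage(List<Integer> a, List<Integer> b, double[][] dist) {
        double max = 0;
        for (int i : a)
            for (int j : b) max = Math.max(max, dist[i][j]);
        return max;
    }
}
```

    A small cut-off leaves many small clusters, while a cut-off above the largest merge dissimilarity collapses everything into one cluster.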
  - hierarchicalClustering
```
public static final int hierarchicalClustering(Graph graph,
                                               int maxCluster,
                                               INodeMap clusterIDs,
                                               Groups.INodeDistanceProvider distances,
                                               Linkage linkage)
```
    Partitions the graph into clusters based on hierarchical clustering, while the dendrogram is cut based on a given maximum number of clusters.
    The clustering is performed using the agglomerative strategy, i.e., a bottom-up approach in which each node initially belongs to its own cluster. At each step, pairs of clusters are merged while moving up the hierarchy. The dissimilarity between clusters is determined based on the given linkage and the given node distances. The algorithm continues until all nodes belong to the same cluster.
    
    The hierarchical tree is then cut at a point such that the number of remaining clusters equals the given maximum number of clusters.
    
    The maximum number of clusters must be greater than zero and must not exceed the number of nodes of the graph.
    
    Throws:
    
    IllegalArgumentException - if an unknown linkage is given or if the maximum number of clusters is less than or equal to zero or greater than the number of nodes of the graph
    
    If the Groups.INodeDistanceProvider object returns a negative distance value for a pair of nodes, zero will be used instead.
    
    Complexity:
    O(graph.N() ^ 3)
    
    Parameters:
    
    graph - the input graph
    
    maxCluster - the maximum number of clusters that determines where to cut the hierarchic tree into clusters
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    distances - a given Groups.INodeDistanceProvider object that determines the distance between any two graph nodes
    
    linkage - one of the predefined linkage values
    
    Returns:
    
    the resulting number of clusters
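
    How the linkage influences the result for a fixed number of clusters can be illustrated with a short sketch. It is not the yFiles implementation: the class MaxClusterSketch, the LinkageKind enum with its three values, and the distance matrix input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; merges clusters until maxCluster clusters remain. */
final class MaxClusterSketch {
    enum LinkageKind { SINGLE, COMPLETE, AVERAGE }

    /** Returns one cluster ID per item. */
    static int[] cluster(double[][] dist, int maxCluster, LinkageKind linkage) {
        int n = dist.length;
        if (maxCluster <= 0 || maxCluster > n)
            throw new IllegalArgumentException("maxCluster must be between 1 and " + n);
        List<List<Integer>> clusters = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> single = new ArrayList<>();
            single.add(i);
            clusters.add(single);
        }
        while (clusters.size() > maxCluster) {
            int bestA = 0, bestB = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < clusters.size(); a++)
                for (int b = a + 1; b < clusters.size(); b++) {
                    double d = dissimilarity(clusters.get(a), clusters.get(b), dist, linkage);
                    if (d < best) { best = d; bestA = a; bestB = b; }
                }
            clusters.get(bestA).addAll(clusters.remove(bestB));
        }
        int[] ids = new int[n];
        for (int c = 0; c < clusters.size(); c++)
            for (int item : clusters.get(c)) ids[item] = c;
        return ids;
    }

    /** Cluster dissimilarity: nearest pair, farthest pair, or mean over all pairs. */
    static double dissimilarity(List<Integer> a, List<Integer> b, double[][] dist, LinkageKind linkage) {
        double min = Double.POSITIVE_INFINITY, max = 0, sum = 0;
        for (int i : a)
            for (int j : b) {
                double d = Math.max(0, dist[i][j]); // negative distances count as zero
                min = Math.min(min, d);
                max = Math.max(max, d);
                sum += d;
            }
        switch (linkage) {
            case SINGLE: return min;
            case COMPLETE: return max;
            default: return sum / (a.size() * b.size());
        }
    }
}
```

    For points on a line at 0, 2, 4.1, 6.3 and 10 with two requested clusters, single linkage chains the first four points together, whereas complete linkage separates the first two points from the remaining three.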
  - kMeansClustering
```
public static final int kMeansClustering(Graph graph,
                                         INodeMap clusterIDs,
                                         IDataProvider nodePositions,
                                         DistanceMetric distanceMetric,
                                         int k)
```
    Partitions the graph into clusters using k-means clustering algorithm.
    The nodes of the graph will be partitioned into k clusters based on their positions such that their distance from the cluster's mean (centroid) is minimized.
    
    The distance can be defined using various metrics, such as Euclidean distance, Euclidean-squared distance, Manhattan distance or Chebychev distance.
    
    Throws:
    
    IllegalArgumentException - if the given distance metric is not supported
    
    If the given number of clusters is greater than the number of nodes of the graph, k will be set to 2.
    
    If the number of given centroids is smaller than k or if no centroids are given, random initial centroids will be assigned for all clusters.
    
    Complexity:
    O(graph.N() * k * d * I) where k is the number of clusters, I the maximum number of iterations and d the dimension of the points
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    nodePositions - the IDataProvider that holds a point representing the current position of each node in the graph
    
    distanceMetric - one of the predefined distance metrics
    
    k - the number of clusters
    
    Returns:
    
    the number of resulting (non-empty) clusters
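
    The four distance metrics named above can be written out for two points in the plane. This is a sketch of their standard definitions; the class DistanceMetricsSketch and its Metric enum are hypothetical stand-ins and not the library's DistanceMetric type.

```java
/** Illustrative sketch only; standard definitions of the four metrics in 2D. */
final class DistanceMetricsSketch {
    enum Metric { EUCLIDEAN, EUCLIDEAN_SQUARED, MANHATTAN, CHEBYCHEV }

    static double distance(Metric metric, double x1, double y1, double x2, double y2) {
        double dx = Math.abs(x1 - x2);
        double dy = Math.abs(y1 - y2);
        switch (metric) {
            case EUCLIDEAN: return Math.sqrt(dx * dx + dy * dy);  // straight-line distance
            case EUCLIDEAN_SQUARED: return dx * dx + dy * dy;     // avoids the square root
            case MANHATTAN: return dx + dy;                       // sum of axis distances
            case CHEBYCHEV: return Math.max(dx, dy);              // largest axis distance
            default: throw new IllegalArgumentException("unsupported metric: " + metric);
        }
    }
}
```

    For the points (0, 0) and (3, 4), the four metrics yield 5, 25, 7 and 4 respectively.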
  - kMeansClustering
```
public static final int kMeansClustering(Graph graph,
                                         INodeMap clusterIDs,
                                         IDataProvider nodePositions,
                                         DistanceMetric distanceMetric,
                                         int k,
                                         int iterations)
```
    Partitions the graph into clusters using k-means clustering algorithm.
    The nodes of the graph will be partitioned into k clusters based on their positions such that their distance from the cluster's mean (centroid) is minimized.
    
    The distance can be defined using various metrics, such as Euclidean distance, Euclidean-squared distance, Manhattan distance or Chebychev distance.
    
    Throws:
    
    IllegalArgumentException - if the given distance metric is not supported
    
    If the given number of clusters is greater than the number of nodes of the graph, k will be set to 2.
    
    If the number of given centroids is smaller than k or if no centroids are given, random initial centroids will be assigned for all clusters.
    
    Complexity:
    O(graph.N() * k * d * I) where k is the number of clusters, I the maximum number of iterations and d the dimension of the points
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    nodePositions - the IDataProvider that holds a point representing the current position of each node in the graph
    
    distanceMetric - one of the predefined distance metrics
    
    k - the number of clusters
    
    iterations - the maximum number of iterations performed by the algorithm for convergence
    
    Returns:
    
    the number of resulting (non-empty) clusters
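
    The role of the iteration limit can be seen in a sketch of the classic assignment/update loop (Lloyd's algorithm) that k-means clustering is commonly based on. Whether the library uses exactly this loop is not stated in this documentation, so treat it as an illustration only. The class KMeansSketch, the point arrays, the explicitly passed initial centroids, and the fixed Euclidean-squared distance are assumptions made for this example.

```java
import java.util.Arrays;

/** Illustrative sketch only; Lloyd-style k-means on 2D points. */
final class KMeansSketch {
    /** Returns the cluster index per point; the centroids array is updated in place. */
    static int[] cluster(double[][] points, double[][] centroids, int maxIterations) {
        int[] ids = new int[points.length];
        Arrays.fill(ids, -1);
        for (int it = 0; it < maxIterations; it++) {
            boolean changed = false;
            // Assignment step: each point joins its nearest centroid
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++)
                    if (squared(points[p], centroids[c]) < squared(points[p], centroids[best])) best = c;
                if (ids[p] != best) { ids[p] = best; changed = true; }
            }
            if (!changed) break; // converged before the iteration limit
            // Update step: each centroid moves to the mean of its points
            for (int c = 0; c < centroids.length; c++) {
                double sumX = 0, sumY = 0;
                int count = 0;
                for (int p = 0; p < points.length; p++)
                    if (ids[p] == c) { sumX += points[p][0]; sumY += points[p][1]; count++; }
                if (count > 0) { centroids[c][0] = sumX / count; centroids[c][1] = sumY / count; }
            }
        }
        return ids;
    }

    static double squared(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }
}
```

    Each iteration visits every point once per centroid, which corresponds to the N * k * d factor in the stated complexity, multiplied by the number of iterations actually performed.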
  - kMeansClustering
```
public static final int kMeansClustering(Graph graph,
                                         INodeMap clusterIDs,
                                         IDataProvider nodePositions,
                                         DistanceMetric distanceMetric,
                                         int k,
                                         int iterations,
                                         YPoint[] centroids)
```
    Partitions the graph into clusters using k-means clustering algorithm.
    The nodes of the graph will be partitioned into k clusters based on their positions such that their distance from the cluster's mean (centroid) is minimized.
    
    The distance can be defined using various metrics, such as Euclidean distance, Euclidean-squared distance, Manhattan distance or Chebychev distance.
    
    Throws:
    
    IllegalArgumentException - if the given distance metric is not supported
    
    If the given number of clusters is greater than the number of nodes of the graph, k will be set to 2.
    
    If the number of given centroids is smaller than k or if no centroids are given, random initial centroids will be assigned for all clusters.
    
    Complexity:
    O(graph.N() * k * d * I) where k is the number of clusters, I the maximum number of iterations and d the dimension of the points
    
    Parameters:
    
    graph - the input graph
    
    clusterIDs - the INodeMap that will be filled during the execution and returns an integer value (cluster ID) for each node
    
    nodePositions - the IDataProvider that holds a point representing the current position of each node in the graph
    
    distanceMetric - one of the predefined distance metrics
    
    k - the number of clusters
    
    iterations - the maximum number of iterations performed by the algorithm for convergence
    
    centroids - the initial centroids
    
    Returns:
    
    the number of resulting (non-empty) clusters
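
    The fallback rules for k and for the initial centroids can be sketched as follows. How the library draws its random centroids is not specified here; sampling them from the given point positions with a caller-supplied java.util.Random is an assumption, as are the class name InitialCentroidsSketch and both method names.

```java
import java.util.Random;

/** Illustrative sketch only; mirrors the documented fallback rules. */
final class InitialCentroidsSketch {
    /** If k exceeds the number of nodes, 2 is used instead. */
    static int effectiveK(int k, int nodeCount) {
        return k > nodeCount ? 2 : k;
    }

    /** Keeps the given centroids if there are at least k, otherwise picks all k at random. */
    static double[][] resolve(double[][] points, double[][] given, int k, Random random) {
        double[][] result = new double[k][];
        if (given != null && given.length >= k) {
            for (int c = 0; c < k; c++) result[c] = given[c].clone();
            return result;
        }
        // Too few centroids, or none: assumption is to sample positions of existing points
        for (int c = 0; c < k; c++) result[c] = points[random.nextInt(points.length)].clone();
        return result;
    }
}
```

    Passing a seeded Random makes the fallback reproducible, which can help when comparing clustering runs.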
