Rand Index & Adjusted Rand Index

The Rand index (RI) is an intuitive metric that reflects the accuracy of a clustering result. Given the original dataset, select any two points p and q and compare their ground-truth classes with their assigned clusters. There are four possible cases:

  • TP: In the original dataset, p and q belong to the same class. In the clustering result, p and q belong to the same cluster.
  • FN: In the original dataset, p and q belong to the same class. In the clustering result, p and q do not belong to the same cluster.
  • FP: In the original dataset, p and q do not belong to the same class. In the clustering result, p and q belong to the same cluster.
  • TN: In the original dataset, p and q do not belong to the same class. In the clustering result, p and q do not belong to the same cluster.
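The four pair cases above can be counted with a brute-force sketch (the `pair_confusion` helper and the toy labels below are illustrative, not part of any library):

```python
from itertools import combinations

def pair_confusion(gt, pred):
    """Count the four pair cases between ground-truth classes and cluster labels."""
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(gt)), 2):
        same_class = gt[i] == gt[j]      # same class in the original dataset?
        same_cluster = pred[i] == pred[j]  # same cluster in the clustering result?
        if same_class and same_cluster:
            tp += 1
        elif same_class and not same_cluster:
            fn += 1
        elif not same_class and same_cluster:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn

# gt: two classes; pred: a clustering that splits the second class in two
gt = [0, 0, 1, 1]
pred = [0, 0, 1, 2]
print(pair_confusion(gt, pred))  # (1, 1, 0, 4)
```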

Similar to the confusion matrix in deep learning, the accuracy of the clustering result can be computed as RI = (TP + TN) / (TP + TN + FP + FN); precision and recall can be defined analogously over the same pair counts.
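For example, with ground-truth classes [0, 0, 1, 1] and clusters [0, 0, 1, 2], the pair counts over the 6 point pairs are TP=1, FN=1, FP=0, TN=4, giving RI = (1 + 4) / 6. `sklearn` computes the same value:

```python
from sklearn import metrics

gt = [0, 0, 1, 1]
pred = [0, 0, 1, 2]
# TP=1, FN=1, FP=0, TN=4 over the 6 point pairs, so RI = (1 + 4) / 6
print(metrics.rand_score(gt, pred))  # 0.8333...
```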

In practice, for most pairs of points p and q, p and q belong to different classes in the original dataset and to different clusters in the clustering result. That is, TN dominates both the numerator and the denominator, so the RI stays high even for poor clustering results. To address this issue, the TN-dominated pair counts must be reweighted, which is what the adjusted Rand index does by correcting the score for the agreement expected from random labeling.
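A quick sketch of this effect: when there are many classes, even a completely random clustering gets a high Rand index, because almost every pair is a TN (the sizes and seed here are arbitrary for illustration):

```python
import random
from sklearn import metrics

random.seed(0)
# 200 points spread evenly over 50 classes: the vast majority of point pairs
# fall in different classes, so TN dominates the pair counts.
gt = [i % 50 for i in range(200)]
# A completely random clustering into 50 clusters.
pred = [random.randrange(50) for _ in range(200)]
# The Rand index is still high, because almost every pair is a TN.
print(metrics.rand_score(gt, pred))
```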

The adjusted Rand index (ARI) solves the problem of TN dominance. Its value ranges from -1 to 1; the higher the value, the better the clustering result. If the clustering result is close to a random assignment, the ARI is close to 0. If the clustering algorithm consistently produces wrong results, that is, feature vectors of the same ground-truth class are split across different clusters and feature vectors of different classes are merged into the same cluster, the ARI approaches -1. For evaluating the clustering result obtained with a given set of parameters on a dataset, the ARI is more reliable than the RI. You are advised to use this metric to evaluate the accuracy of the clustering result.
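These properties can be checked on toy labelings: a clustering identical to the ground truth up to a relabeling of clusters scores 1, while a clustering that splits every class and merges across classes scores below 0:

```python
from sklearn import metrics

# Perfect agreement up to a permutation of cluster labels scores 1.
print(metrics.adjusted_rand_score([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0

# A clustering that splits each class and merges across classes scores below 0.
print(metrics.adjusted_rand_score([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```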

```
from sklearn import metrics

def CalcRI(allClusters, groundTruth, thresh=99999999):
    '''
    Args:
        allClusters: list of lists; each inner list holds the feature IDs of one cluster
        groundTruth: dict mapping each feature ID to its ground-truth class label
        thresh: black-hole cluster threshold; a cluster with at least this many
                features is treated as abnormal and ignored
    '''
    # Drop oversized ("black hole") clusters.
    filteredClusters = [arc for arc in allClusters if len(arc) < thresh]
    # Map each feature to the index of the cluster it belongs to.
    tmpDict = {}
    for label, cluster in enumerate(filteredClusters):
        for feat in cluster:
            tmpDict[feat] = label
    # Build aligned label lists: ground-truth classes and predicted cluster IDs.
    featLis = sorted(tmpDict.keys())
    pd = [tmpDict[feat] for feat in featLis]
    gt = [int(groundTruth[feat]) for feat in featLis]
    print("Start to calculate RI")
    score = metrics.rand_score(gt, pd)
    print("Rand Score {}".format(score))
    score = metrics.adjusted_rand_score(gt, pd)
    print("Adjusted Rand Score {}".format(score))
```