General Description
The accuracy calculation method provided in this section is for reference only. The code is intended only to help you understand the algorithm.
For a clustering model (clustering algorithm), it is difficult to evaluate accuracy and performance on an unknown dataset. A common approach is to select a subset of the dataset for a specific task, label that subset manually, and use the labeled subset to estimate the overall accuracy of the task-specific algorithm. The following metrics are provided for reference.
Suppose that the following information is available:
- GroundTruthDic: a dictionary. Each key is an original ID, and each value is the label of the cluster that the corresponding feature vector belongs to.
- GroundTruthCluster: a list of lists. Each inner list represents one manually labeled cluster.
- PredictedCluster: a list of lists. Each inner list represents one cluster produced by the algorithm.
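To make the three structures above concrete, the following minimal sketch shows one way they could look and how GroundTruthCluster might be derived from GroundTruthDic. The sample IDs, the labels, and the helper name build_clusters are hypothetical, and the sketch assumes that each inner list holds original IDs, which the description above does not state explicitly.

```python
from collections import defaultdict

# Hypothetical sample data: original ID -> ground-truth cluster label.
GroundTruthDic = {"id_1": 0, "id_2": 0, "id_3": 1, "id_4": 1, "id_5": 2}


def build_clusters(label_dic):
    """Group the IDs that share a label into a list of lists.

    Assumption: each inner list contains original IDs.
    """
    groups = defaultdict(list)
    for original_id, label in label_dic.items():
        groups[label].append(original_id)
    return list(groups.values())


# Ground-truth clusters derived from the labeled dictionary.
GroundTruthCluster = build_clusters(GroundTruthDic)

# Hypothetical output of a clustering algorithm over the same IDs.
PredictedCluster = [["id_1", "id_2", "id_3"], ["id_4"], ["id_5"]]
```

Both cluster lists should cover the same set of IDs so that the ground truth and the prediction can be compared item by item.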