DRIM: Deviation Ratio Index Based on Medoids
Advances and Challenges in Science and Technology Vol. 5, 18 October 2023, Page 99-126
https://doi.org/10.9734/bpi/acst/v5/7515A
Abstract
This chapter discusses a technique for estimating the number of clusters in a dataset that is flexible across data types and sizes, adaptable to any clustering method, and easy to calculate. The approach we propose is the Deviation Ratio Index based on Medoids (DRIM). DRIM is calculated from the distances between the objects and the final \(\mathit{k}\)-medoids. These final medoids were obtained with the block-based \(\mathit{k}\)-medoids algorithm (Block-KM) and the \(\mathit{k}\)-medoids algorithm constructed using the variance of distance (VarD-KM). Before running Block-KM and VarD-KM, we apply a specific transformation to some datasets. We use ten real datasets to validate DRIM: Vote, Soybean (small), Primary Tumor, Breast Cancer, Ionosphere, Iris, Wine, Zoo, Heart Disease Case 2, and Credit Approval. The experimental results show that DRIM predicts the number of clusters for the ten real datasets more precisely than other methods. In experiments on three types of artificial data, DRIM predicted the number of clusters correctly in 76.67% of cases. Applying the new approach to group 62 universities in Indonesia based on data on human resources, education, research, organization, infrastructure, and cooperation produces three easily interpreted groups.
- Number of clusters
- Deviation ratio index based on medoids
- Block-KM
- VarD-KM
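
The abstract says only that DRIM is calculated from object distances to the final medoids; it does not give the index formula, so the formula is not reproduced here. As a minimal sketch of that input quantity alone, the Python below assigns each object to its nearest medoid and sums the resulting distances. Euclidean distance, medoids given as row indices, the toy data, and the function names `assign_to_medoids` and `total_deviation` are all assumptions made for illustration, not the chapter's definitions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_to_medoids(points, medoid_indices):
    """Return (labels, distances): for every object, the position of its
    nearest medoid in medoid_indices and the distance to that medoid."""
    labels, distances = [], []
    for p in points:
        d_to_medoids = [dist(p, points[m]) for m in medoid_indices]
        nearest = min(range(len(medoid_indices)), key=d_to_medoids.__getitem__)
        labels.append(nearest)
        distances.append(d_to_medoids[nearest])
    return labels, distances


def total_deviation(points, medoid_indices):
    """Sum of object-to-nearest-medoid distances (hypothetical helper;
    this is an input to a medoid-based index, not the DRIM value itself)."""
    _, distances = assign_to_medoids(points, medoid_indices)
    return sum(distances)


# Toy data (assumed): two well-separated groups of three objects each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

labels, _ = assign_to_medoids(points, [0, 3])
dev_k1 = total_deviation(points, [0])      # one medoid for all objects
dev_k2 = total_deviation(points, [0, 3])   # one medoid per group
```

On this toy data the total distance to the medoids drops sharply when moving from one medoid to two, which is the kind of change an index for choosing the number of clusters has to weigh against the growth in the number of clusters.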