Abstract: Traditional clustering algorithms assume that the objects and attributes of a data set are independent and identically distributed (IID). Real-world data, however, are often non-IID: attributes interact with one another to a greater or lesser degree. The traditional K-means algorithm selects its initial cluster centers at random, which makes it sensitive to that choice, prone to falling into local optima, and low in accuracy. The Min_max method mitigates this shortcoming, but both the original and the improved K-means algorithms ignore the interactions between attributes. This paper therefore uses the Pearson correlation coefficient to compute the interactions between attributes and maps them onto the original data set. In addition, the Min_max method is optimized using the idea of dual domain. Experimental results show that the proposed method achieves higher accuracy, better clustering quality, and relatively fewer iterations.
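The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact mapping formula or the dual-domain refinement, so the sketch makes its own assumptions. The function names `pearson_coupled`, `maximin_init`, and `kmeans` are hypothetical; the coupling step is assumed to be a product of the standardized data with the absolute Pearson correlation matrix; and the Min_max step is assumed to be plain max-min (farthest-first) seeding, started from the point farthest from the data mean. The dual-domain optimization is omitted.

```python
import numpy as np


def pearson_coupled(X):
    """Fold attribute interactions into the data (assumed scheme, not the paper's formula)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # avoid division by zero for constant columns
    Z = (X - X.mean(axis=0)) / std           # z-score each attribute
    R = np.atleast_2d(np.corrcoef(X, rowvar=False))  # attribute-by-attribute Pearson matrix
    R = np.nan_to_num(R)                     # constant columns give NaN; treat as uncorrelated
    np.fill_diagonal(R, 1.0)
    # Each new attribute is the original one plus correlation-weighted contributions
    # from the other attributes.
    return Z @ np.abs(R)


def maximin_init(X, k):
    """Max-min (farthest-first) seeding: each new center maximizes its distance
    to the nearest center chosen so far."""
    X = np.asarray(X, dtype=float)
    # Assumption: start from the point farthest from the overall mean.
    first = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    d_min = np.linalg.norm(X - X[first], axis=1)
    for _ in range(1, k):
        nxt = int(np.argmax(d_min))
        chosen.append(nxt)
        d_min = np.minimum(d_min, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen]


def kmeans(X, centers, max_iter=100):
    """Standard Lloyd iterations from the given initial centers.
    Returns labels, final centers, and the number of iterations used."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for n_iter in range(1, max_iter + 1):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers, n_iter
```

A typical call would be `Xc = pearson_coupled(X)` followed by `labels, centers, n_iter = kmeans(Xc, maximin_init(Xc, k))`. Because the seeding is deterministic, repeated runs on the same data give the same clustering, which is the property that deterministic initialization offers over random selection of initial centers.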
潘品臣,姜合,吕奕锟. 一种非独立同分布下K-means算法的初始中心优化方法[J]. 小型微型计算机系统, 2019, 40(6): 1254-1259.
PAN Pin-chen, JIANG He, LV Yi-kun. Initial Center Optimization Method of K-means Algorithm within Non-independent and Identically Distribution Context[J]. Journal of Chinese Computer Systems, 2019, 40(6): 1254-1259.