Improved Spectral Clustering Based on Inflexion Point Estimate
ZHANG Jia-qi(1,2),ZHANG Hong-yun(1,2)
1(Department of Computer Science and Technology,Tongji University,Shanghai 201804,China)2(Key Laboratory of Embedded Systems and Service Computing,Ministry of Education,Tongji University,Shanghai 201804,China)
摘要 针对现有谱聚类算法不稳定,处理复杂分布数据较困难,需要手动输入聚类个数的问题,利用基于快速搜索和密度峰的聚类算法CFSFDP(Clustering by Fast Search and Find of Density Peaks),提出一种改进的谱聚类算法.本算法首先借鉴基于流形距离核的谱聚类算法计算数据的低维嵌入,将分布复杂或者类内不存在密度极值点的数据转换成类球状的低维嵌入代表点.接着,提出用CFSFDP算法代替基于流形距离核的谱聚类算法中Kmeans算法对低维嵌入进行处理.最后,基于CFSFDP算法的局部密度和距离属性的概念,提出拐点估计方法来自动确定聚类个数,获取聚类结果.实验表明,针对复杂分布的测试数据集,本算法能准确地确定聚类个数,获得很好的聚类效果,同时本算法需要输入的参数较少,且在一定范围内表现出较强的鲁棒性.
Abstract:The existing spectral clustering algorithms are unstable and difficult to deal with complicated data sets.Moreover,the number of clusters needs to be entered manually as a parameter in advance.To solve the problems mentioned above,we propose an improved spectral clustering based on CFSFDP(Clustering by Fast Search and Find of Density Peaks).Firstly,we compute the low-dimensional embedding by using the idea of spectral clustering algorithms based on Manifold Distance Kernel.The data sets which have complicated distribution or have no density extremum in some classes are converted into spheroidal low-dimensional embedding representative point.Instead of using Kmeans in the spectral clustering algorithms based on Manifold Distance Kernel,we then propose to use CFSFDP algorithms to deal with low-dimensional embedding.In the end,based on the concept of the local density and distance attributes in CFSFDP algorithm,we propose the Inflexion point estimate to automatically determine the number of clusters.Experimental results show that the algorithm we proposed can accurately determine the number of clusters and obtain good cluster results on complicated data sets.Meanwhile,the number of parameters need to be entered into the algorithm is less and show strong robust within a certain range.
张嘉琪(1,2),张红云(1,2). 拐点估计的改进谱聚类算法[J]. 小型微型计算机系统, 2017, 38(5): 1049-1053.
ZHANG Jia-qi(1,2),ZHANG Hong-yun(1,2). Improved Spectral Clustering Based on Inflexion Point Estimate. Journal of Chinese Computer Systems, 2017, 38(5): 1049-1053.