Abstract: Position-uncertain data is a new type of uncertain data in research on clustering algorithms for uncertain data streams. Existing uncertain-data processing models cannot describe or process position-uncertain data well. This paper therefore introduces three main concepts: a connection-number-based position-uncertainty model, a connection distance function, and the density accessibility of micro-clusters. Based on these concepts, a connection-number-based algorithm, UCNStream (Uncertain Connection Number Stream), is proposed for position-uncertain data streams. The algorithm adopts an online/offline two-stage processing framework with an initialization strategy based on density peaks, and defines a new micro-cluster feature vector to dynamically maintain arriving data objects. In addition, it accurately reflects the evolution of the data stream by maintaining micro-clusters online with a decay function and a micro-cluster deletion mechanism. Finally, the computational complexity of the algorithm is analyzed, and its performance is evaluated in a series of experiments on real-world data sets against several well-regarded clustering algorithms. The experimental results show that UCNStream achieves high clustering accuracy and processing efficiency.
史玲娟,黄德才. 一种联系数表达的位置不确定数据流聚类算法[J]. 小型微型计算机系统, 2020, 41(2): 361-368.
SHI Ling-juan, HUANG De-cai. Clustering Algorithm for Position Uncertain Data Expressed by Connection Number[J]. Journal of Chinese Computer Systems, 2020, 41(2): 361-368.