Abstract: Community detection is an active multidisciplinary research topic in complex networks. Existing community detection methods focus mainly on network topology, and most of them cannot handle large-scale networks well. Ruan et al. proposed an algorithm, CODICIL, which not only improves the accuracy of community detection by incorporating text content but can also be applied to large-scale network clustering problems. CODICIL represents the text content of the nodes in the network with TF-IDF. However, TF-IDF representations are high-dimensional, so they cannot characterize node content accurately and they incur a large computational cost. We propose a Gaussian mixture model that integrates the content information of complex networks effectively in terms of both accuracy and speed. We evaluated the proposed method on five real-world networks with ground-truth communities. The results show that our method outperforms TF-IDF. In addition, our algorithm achieves better scalability by fitting the data points with a Gaussian mixture model under various parameter settings.
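The dimensionality concern with TF-IDF can be illustrated with a minimal sketch. It assumes three toy node texts, whitespace tokenization, and the common tf * log(N / df) weighting; the exact TF-IDF variant used by CODICIL is not stated here, and the names `tfidf_vectors` and `node_texts` are hypothetical. The point is only that each node vector has one dimension per vocabulary term and is mostly zeros.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting.

    Assumption: whitespace tokenization and this particular weighting
    are illustrative choices, not the scheme used by CODICIL.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # One dimension per distinct term across all node texts.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of texts in which each term appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append([
            (counts[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ])
    return vocab, vectors


# Hypothetical node texts, standing in for the content attached to nodes.
node_texts = [
    "community detection in complex networks",
    "gaussian mixture model for text content",
    "large scale networks clustering",
]

vocab, vectors = tfidf_vectors(node_texts)
dim = len(vocab)
# Fraction of entries that are exactly zero across all node vectors.
sparsity = sum(v == 0.0 for vec in vectors for v in vec) / (dim * len(vectors))
print(dim, round(sparsity, 2))
```

Even with three short texts the vectors have as many dimensions as there are distinct terms, and the dimension grows with the vocabulary of the whole network, which is the cost that motivates a more compact content model.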