Scheduling Strategy of Hierarchical Storage in Heterogeneous Cluster About HDFS
YANG Dong-ju (1,2), LI Qing (1,2), DENG Chong-bin (1,2)
(1) Cloud Computing Research Center, North China University of Technology, Beijing 100144, China
(2) Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing 100144, China
Abstract: A storage cluster is often built from a mix of legacy and newly purchased devices whose storage performance differs considerably. Under the default rack-aware placement policy of the Hadoop Distributed File System (HDFS), frequently accessed data may be stored on low-performance nodes while rarely accessed data is stored on high-performance nodes, which lengthens cluster response time and lowers resource utilization. To address these problems, we propose a hierarchical storage scheduling mechanism built on the HDFS rack-aware scheduling policy. First, nodes are divided into high-configuration and low-configuration nodes according to inherent hardware characteristics such as CPU, memory size, disk size, and disk I/O. Second, a node performance evaluation model is established from dynamic factors such as CPU usage, memory usage, network bandwidth usage, and disk usage, and each node is rated at one of three performance levels, p1, p2, and p3, from high to low. Scheduling decisions then combine node configuration, performance level, network location, and other factors. While the cluster is running, the distribution of data blocks is adjusted dynamically according to data access frequency. Experimental results show that the hierarchical storage scheduling mechanism improves data access efficiency in heterogeneous HDFS clusters and optimizes cluster performance.
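The two-step node evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model: the abstract gives neither the evaluation formula nor the level boundaries, so the weights, the thresholds (0.4 and 0.7), the `Node` fields, and the ranking order used here are all assumptions chosen for illustration. Network location, which the mechanism also considers, is left out.

```python
from dataclasses import dataclass

# Hypothetical weights for the dynamic factors (assumption, not from the paper).
WEIGHTS = {"cpu": 0.3, "memory": 0.3, "network": 0.2, "disk": 0.2}


@dataclass
class Node:
    name: str
    high_config: bool   # result of the static hardware classification
    cpu: float          # CPU usage, 0.0 to 1.0
    memory: float       # memory usage, 0.0 to 1.0
    network: float      # network bandwidth usage, 0.0 to 1.0
    disk: float         # disk usage, 0.0 to 1.0


def load_score(node: Node) -> float:
    """Weighted sum of usage ratios; a lower score means a less loaded node."""
    return (WEIGHTS["cpu"] * node.cpu
            + WEIGHTS["memory"] * node.memory
            + WEIGHTS["network"] * node.network
            + WEIGHTS["disk"] * node.disk)


def performance_level(node: Node, p1_max: float = 0.4, p2_max: float = 0.7) -> str:
    """Map the load score to p1 (best), p2, or p3 using assumed thresholds."""
    score = load_score(node)
    if score < p1_max:
        return "p1"
    if score < p2_max:
        return "p2"
    return "p3"


def rank_for_hot_block(nodes: list[Node]) -> list[Node]:
    """Order candidates for frequently accessed data: best performance level
    first, then high-configuration nodes, then the lowest load score."""
    return sorted(
        nodes,
        key=lambda n: (performance_level(n), not n.high_config, load_score(n)),
    )


if __name__ == "__main__":
    cluster = [
        Node("legacy-1", False, 0.1, 0.1, 0.1, 0.1),
        Node("new-1", True, 0.2, 0.2, 0.2, 0.2),
        Node("new-2", True, 0.9, 0.9, 0.9, 0.9),
    ]
    for node in rank_for_hot_block(cluster):
        print(node.name, performance_level(node))
```

Ranking by performance level before hardware class is one possible design choice: it keeps hot data off a heavily loaded node even when that node has better hardware. A real scheduler could just as reasonably weigh the two factors differently.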
YANG Dong-ju, LI Qing, DENG Chong-bin. Scheduling Strategy of Hierarchical Storage in Heterogeneous Cluster About HDFS[J]. Journal of Chinese Computer Systems (小型微型计算机系统), 2017, 38(1): 29-34.