Abstract: Data center networks carry both elephant flows and mouse flows. Accurately predicting the flow type is key to optimal flow scheduling, but existing prediction methods suffer from limited accuracy, high overhead, and long prediction time. Therefore, drawing on the multi-dimensional feature representation ability of deep learning and the centralized network control provided by software-defined networking (SDN), a two-level elephant flow prediction mechanism of “edge pre-classification + center fine classification” is proposed. The mechanism consists of three steps. First, the random forest algorithm is used to select the features: the time distribution characteristics of a flow, its real-time characteristics, and its packet header characteristics. Then, a pre-classification model deployed on the SDN switches at the network edge uses a residual network with a cost-sensitive Softmax cross-entropy loss function to filter out most mouse flows. Finally, a fine classification model deployed on the SDN controller uses a residual network with the Additive Margin Softmax cross-entropy loss function to accurately identify elephant flows. Experiments on a public data set show that, when the fifth packet of a flow arrives, the proposed method reaches a recall of 91% and an accuracy of 93%, with an overhead of 0.1 kbps and a prediction time of 7 ms. It also outperforms existing schemes such as FlowSeer, ESCA, and NELLY: its Matthews correlation coefficient is 2.52 times that of NELLY, its prediction time is reduced to 0.35% of FlowSeer's, and its overhead is reduced to 0.046% of ESCA's.
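The abstract names the two loss functions but does not define them. The sketch below shows one common reading of each, for illustration only. It assumes that "cost-sensitive" means scaling the cross-entropy by a per-class misclassification cost, and it uses the usual Additive Margin Softmax form, in which a margin is subtracted from the true-class cosine before scaling. The function names, the cost values, and the defaults `scale=30.0` and `margin=0.35` are hypothetical and are not taken from the paper.

```python
import math

MOUSE, ELEPHANT = 0, 1  # class indices, chosen here for illustration


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cost_sensitive_ce(logits, label, class_costs):
    """Softmax cross-entropy scaled by the cost of the true class.

    Assumption: a larger cost for ELEPHANT penalizes missed elephant
    flows more heavily, which suits an edge filter tuned for recall.
    """
    return -class_costs[label] * math.log(softmax(logits)[label])


def am_softmax_ce(cosines, label, scale=30.0, margin=0.35):
    """Additive Margin Softmax cross-entropy.

    `cosines` holds the cosine similarity between the normalized feature
    vector and each normalized class weight. The margin is subtracted
    only from the true-class cosine, then everything is scaled.
    """
    adjusted = [
        scale * (c - margin if j == label else c)
        for j, c in enumerate(cosines)
    ]
    return -math.log(softmax(adjusted)[label])


# Edge stage: hypothetical costs that weight elephant flows 5x.
edge_loss = cost_sensitive_ce([1.2, 0.4], ELEPHANT, class_costs=[1.0, 5.0])

# Center stage: the margin forces a wider gap between the two classes.
center_loss = am_softmax_ce([0.3, 0.8], ELEPHANT)

print(f"edge loss: {edge_loss:.4f}, center loss: {center_loss:.4f}")
```

With a zero margin and unit scale, `am_softmax_ce` reduces to ordinary softmax cross-entropy on the cosines, so the margin is what makes the center-stage decision boundary stricter.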
曾嘉麒,刘外喜,卢锦杰. 软件定义数据中心基于残差网络的大象流预测机制[J]. 小型微型计算机系统, 2021, 42(9): 1938-1943.
ZENG Jia-qi, LIU Wai-xi, LU Jin-jie. Elephant Flow Prediction Scheme Based on Residual Network for Software-defined Data Centers[J]. Journal of Chinese Computer Systems, 2021, 42(9): 1938-1943.