Abstract: Target tracking is an important research direction in computer vision. To locate a single target in a video sequence accurately and track it in a timely manner, this paper adopts a Siamese convolutional neural network (CNN), which addresses two problems: deep neural networks cannot be updated in time, and training data are insufficient. In addition, SE-Net is added to the feature extraction submodule of the Siamese CNN. Convolution layers extract the spatial feature information of the image, and the interdependencies between feature channels are modeled to strengthen the effective channels, which further improves the network's feature representation ability and thus the quality of feature extraction. Finally, a Region Proposal Network is used for target localization and bounding-box refinement. Experiments are conducted on the OTB2015 dataset, with average coverage and the one-pass evaluation (OPE) method as evaluation criteria. The results show an average coverage of 66.6%, and both the success-rate and accuracy-rate plots show that our approach outperforms the other algorithms compared.
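
The channel modeling step mentioned above can be illustrated with a minimal NumPy sketch of a squeeze-and-excitation style block. This is an illustration under stated assumptions, not the network used in the paper: the function name `se_block`, the reduction ratio `r = 4`, the feature-map size, the random weights, and the omission of bias terms are all hypothetical choices made for brevity.

```python
import numpy as np

def se_block(features, w1, w2):
    """Recalibrate channels of a feature map (hypothetical helper).

    features: array of shape (C, H, W), e.g. the output of a convolution layer
    w1:       array of shape (C // r, C), first fully connected layer (no bias)
    w2:       array of shape (C, C // r), second fully connected layer (no bias)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = features.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: FC -> ReLU -> FC -> sigmoid models channel interdependencies.
    hidden = np.maximum(w1 @ z, 0.0)                    # shape (C // r,)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))        # shape (C,), values in (0, 1)
    # Scale: reweight each channel of the original feature map.
    return features * scale[:, None, None], scale

# Assumed toy sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

out, scale = se_block(features, w1, w2)
print(out.shape, scale.shape)
```

In a trained network the weights `w1` and `w2` would be learned, so that channels useful for tracking receive scales closer to 1 and less useful channels are suppressed.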