Text Detection Based on Multi-level Feature Fusion and Attention Mechanism
LUO Wen-li, WU Qin
(School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)
(Jiangsu Provincial Engineering Laboratory for Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China)
Abstract: Convolutional neural networks have greatly improved the accuracy of natural scene text detection. However, the scale variability caused by camera perspective and text size, together with the diversity of text distribution, still poses challenges. To alleviate the problem of text scale variability, we propose a new multi-level feature fusion module. In addition to using a feature pyramid to fuse features from different levels, we add a dilated convolution and pooling module that provides different receptive fields without reducing the feature scale and therefore yields richer features. To address the detection problems caused by the diversity of text distribution, we propose a channel attention mechanism that effectively extracts cross-channel interaction information and thereby selects features better suited to text, which further improves the accuracy of the text detector. Experimental results on four public datasets (ICDAR2015, CTW1500, Total-Text, MSRA-TD500) demonstrate the effectiveness of the proposed method.
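The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, kernel values, or fusion rule, so everything below is an assumption chosen for illustration. The helper names `dilated_conv1d` and `channel_attention` are hypothetical, the example works on 1D lists instead of 2D feature maps, and the attention step assumes a simple design (global average pooling, a small 1D convolution across neighbouring channels, then a sigmoid gate) as one plausible way to capture cross-channel interaction.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 1D dilated convolution (cross-correlation form).

    Assumes an odd kernel length, so the output has the same length as
    the input: the receptive field grows to dilation * (k - 1) + 1
    without shrinking the feature scale.
    """
    k = len(kernel)
    pad = dilation * (k - 1) // 2  # zeros added on each side
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def channel_attention(features, kernel):
    """Reweight channels with a gate built from neighbouring channels.

    features: list of C channels, each a list of values.
    kernel:   odd-length weights mixing adjacent channel descriptors
              (illustrative values; a real model would learn them).
    Returns (reweighted_features, per_channel_weights).
    """
    # 1. Global average pooling: one descriptor per channel.
    pooled = [sum(ch) / len(ch) for ch in features]
    # 2. Local cross-channel interaction: 1D convolution over descriptors.
    mixed = dilated_conv1d(pooled, kernel, dilation=1)
    # 3. Sigmoid gate maps each mixed descriptor into (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # 4. Rescale every channel by its weight.
    scaled = [[w * v for v in ch] for w, ch in zip(weights, features)]
    return scaled, weights


# A unit impulse shows how each dilation rate widens the receptive
# field while the output length stays equal to the input length.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
branches = {d: dilated_conv1d(impulse, [1.0, 1.0, 1.0], d) for d in (1, 2, 3)}

# Three toy channels; the kernel [0, 1, 0] leaves descriptors unmixed.
toy_features = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]]
scaled, weights = channel_attention(toy_features, [0.0, 1.0, 0.0])
```

In this sketch each dilated branch responds to the same impulse at a wider spacing while keeping seven output positions, which mirrors the abstract's claim of different receptive fields without reducing the feature scale. How the branches are fused with the feature pyramid, and the exact form of the attention gate, are left open here because the abstract does not specify them.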
LUO Wen-li, WU Qin. Text Detection Based on Multi-level Feature Fusion and Attention Mechanism [J]. Journal of Chinese Computer Systems (小型微型计算机系统), 2022, 43(4): 815-821.