Recognizing Chinese Organization Names Based on Deep Learning: A Recurrent Network Model
Danhao Zhu1,2, Lei Yang3, Dongbo Wang4
1 Library of Jiangsu Police Institute, Nanjing 210031, China
2 Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China
3 Department of High Education, College of Nanjing Traffic Technician, Nanjing 210049, China
4 College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
[Objective] Chinese organization names are difficult for computers to recognize because of their complex structures and their use of rare words. Recognizing them accurately is important for information extraction and retrieval, knowledge mining, and the evaluation of institutional research. [Methods] First, we redefined the input and output of the organization name recognition task to suit recurrent neural networks and the characteristics of Chinese characters and words. Second, we proposed a new model that operates at the Chinese character level. [Results] Compared with recurrent network models at the word level, the proposed method significantly improved precision, recall, and F value. The F value increased by 1.54% overall and by 11.05% for organization names containing rare words. [Limitations] We adopted a greedy strategy that finds only locally optimal outputs; a conditional random field, which optimizes over the whole sequence, would be expected to yield better results. [Conclusions] The proposed method, which uses Chinese character-level features, is easy to implement and produces better results than its word-level counterparts.
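The limitation noted in the abstract, greedy decoding versus a globally optimal decoder, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the B/I/O tag set, the toy per-character scores, the transition penalty, and the function names `greedy_decode` and `viterbi_decode`. Viterbi decoding stands in here for the sequence-level inference that a linear-chain conditional random field performs.

```python
# Hypothetical tag set: B = begins an organization name, I = inside one, O = outside.
TAGS = ["B", "I", "O"]


def greedy_decode(emissions):
    """Pick the highest-scoring tag independently for each character."""
    return [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in emissions]


def viterbi_decode(emissions, transitions):
    """Find the tag sequence with the highest total emission + transition score."""
    n_tags = len(TAGS)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []               # backptrs[t][j] = best previous tag for tag j
    for row in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + row[j])
        score = new_score
        backptrs.append(ptrs)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return [TAGS[j] for j in reversed(path)]


# Toy scores (invented) for a four-character input; columns follow TAGS order.
emissions = [
    [2.0, 0.5, 1.0],
    [0.1, 1.5, 0.2],
    [0.3, 0.9, 1.0],
    [0.2, 1.2, 0.4],
]

# transitions[i][j] scores moving from tag i to tag j; O -> I is penalized
# because an "inside" tag cannot follow an "outside" tag.
transitions = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, -10.0, 0.0],
]

print(greedy_decode(emissions))                # ['B', 'I', 'O', 'I']
print(viterbi_decode(emissions, transitions))  # ['B', 'I', 'I', 'I']
```

With these invented scores, the greedy decoder emits an inconsistent sequence (an I tag directly after an O tag), while the sequence-level decoder returns a single well-formed span. This is the kind of error a global method is expected to avoid.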
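Zhu Danhao, Yang Lei, Wang Dongbo. Research on Chinese Organization Name Recognition Based on Deep Learning: A Chinese Character-Level Recurrent Neural Network Method [J]. Data Analysis and Knowledge Discovery, 2016, 32(12): 36-43.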
Danhao Zhu, Lei Yang, Dongbo Wang. Recognizing Chinese Organization Names Based on Deep Learning: A Recurrent Network Model. Data Analysis and Knowledge Discovery, DOI: 10.11925/infotech.1003-3513.2016.12.05.