[Objective] This paper aims to automatically identify produce aliases, related persons, places of origin, and cited books in ancient local chronicles, in order to build a knowledge base of traditional products. [Methods] First, we chose Local Chronicle of Yunnan: Produce as the base corpus, preprocessed its texts, and annotated the corpus. Then, we applied four deep learning models (Bi-RNN, Bi-LSTM, Bi-LSTM-CRF, and BERT) to recognize the target entities. Finally, we compared the outputs of these models. [Results] The precision (P) and F-value of the Bi-LSTM model were 5.54% and 3.51% higher, respectively, than those of the Bi-LSTM-CRF model. The recall (R) of the BERT model reached 83.36%, the best among all models. The Bi-LSTM-CRF model performed best on cited-book entities (F-value = 89.71%), and the BERT model performed best on person entities, with an F-value of 87.90%. [Limitations] Because of the linguistic characteristics of ancient local chronicles and the domain knowledge required to identify the related entities, the corpus annotation may contain errors. [Conclusions] Deep learning models can effectively identify the target entities in ancient local chronicles.
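The P, R, and F values reported above can be illustrated with a minimal sketch of entity-level scoring. It assumes a BIO tagging scheme and the hypothetical labels `BOOK` and `PER`; the abstract does not state the annotation scheme or the exact scoring procedure used, so the sketch shows only one common way to compute such metrics, in which a predicted entity counts as correct only if both its type and its boundaries match the gold annotation.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); labels are hypothetical.
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((etype, start, i))  # current span ends before position i
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]  # a B- tag always opens a new span
    return spans


def precision_recall_f1(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Entity-level P, R, and F over a list of tagged sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g_tags, p_tags) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx, *s) for s in extract_spans(g_tags)}
        pred_spans |= {(idx, *s) for s in extract_spans(p_tags)}
    correct = len(gold_spans & pred_spans)
    p = correct / len(pred_spans) if pred_spans else 0.0
    r = correct / len(gold_spans) if gold_spans else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Toy example: the book title is matched, the person span is cut short.
gold = [["B-BOOK", "I-BOOK", "O", "B-PER", "I-PER"]]
pred = [["B-BOOK", "I-BOOK", "O", "B-PER", "O"]]
print(precision_recall_f1(gold, pred))
```

In the toy example, one of the two predicted entities matches a gold entity exactly, so precision, recall, and F are all 0.5. A more lenient scheme that rewards partially overlapping spans would score the truncated person entity differently.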
徐晨飞, 叶海影, 包平. 基于深度学习的方志物产资料实体自动识别模型构建研究[J]. 数据分析与知识发现, 2020, 4(8): 86-97.
Xu Chenfei, Ye Haiying, Bao Ping. Automatic Recognition of Produce Entities from Local Chronicles with Deep Learning. Data Analysis and Knowledge Discovery, 2020, 4(8): 86-97.