Extracting Relationship Among Characters from Local Chronicles with Text Structures and Contents
Wang Yongsheng, Wang Hao, Yu Wei, Zhou Zeyu
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
[Objective] This study proposes a new method for extracting relationships among characters from local chronicles, with the aim of exploring the cultural and historical information embedded in the Chapter of Persons of the Yiwu Local Chronicles. [Methods] We constructed a relationship extraction model based on text structure and content. For text structure, we used rule templates and word features to extract relationships from the original texts and categorized them at different levels of granularity. For text content, we introduced a distant supervision approach to extract relationships. We then combined the BERT+Bi-GRU+ATT and BERT+FC deep learning models and recast relationship extraction as a multi-label classification task. Finally, we corrected relationship labels to reduce the impact of distant-supervision noise on model accuracy. [Results] The proposed method achieved a high degree of automation and improved the quality of the extracted information. The BERT+FC models improved F1 values by up to 27%, and different relationship categories showed some affinity with one another. After label correction, the F1 value for the “strong co-occurrence relationship” increased by 3%. [Limitations] We investigated only relationships among characters in local chronicles. [Conclusions] The new method can effectively extract relationships among entities of the same type in historical Chinese documents.
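As a rough illustration of the distant (remote) supervision and multi-label framing described in the abstract, the following minimal sketch labels a sentence with every known relation for each pair of persons it mentions and encodes the result as a multi-hot vector. The label names, the seed knowledge base `KB`, the placeholder person names, and the function `distant_labels` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical label inventory; the abstract does not list the actual categories.
LABELS: List[str] = ["kinship", "teacher_student", "strong_cooccurrence"]

# Hypothetical seed knowledge base: unordered person pair -> known relations.
KB: Dict[FrozenSet[str], Set[str]] = {
    frozenset({"Person A", "Person B"}): {"kinship"},
    frozenset({"Person A", "Person C"}): {"teacher_student", "strong_cooccurrence"},
}


def distant_labels(sentence: str, persons: List[str]) -> List[Tuple[str, str, List[int]]]:
    """Return (person_1, person_2, multi-hot vector) for each KB pair found in the sentence."""
    # Keep only the persons whose names literally occur in the sentence.
    present = sorted(p for p in persons if p in sentence)
    examples = []
    for i, a in enumerate(present):
        for b in present[i + 1:]:
            relations = KB.get(frozenset({a, b}))
            if relations:
                # One position per label; several positions may be 1 (multi-label).
                vector = [1 if label in relations else 0 for label in LABELS]
                examples.append((a, b, vector))
    return examples


examples = distant_labels(
    "Person A studied under Person C in the county school.",
    ["Person A", "Person B", "Person C"],
)
print(examples)
```

Labels produced this way are noisy, because a sentence that mentions both persons does not necessarily express the relation recorded for them; this is the kind of noise that the label correction step in the abstract is meant to reduce.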
Wang Yongsheng, Wang Hao, Yu Wei, Zhou Zeyu. Extracting Relationship Among Characters from Local Chronicles with Text Structures and Contents. Data Analysis and Knowledge Discovery, 2022, 6(2/3): 318-328.