Constructing Knowledge Base for Chinese Geographical Name
Li Xiaomin, Wang Hao, Li Yueyan, Zhao Meng
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210093, China
[Objective] This paper applies linked data technology to study the evolution of geographical names in China, in support of digital humanities research. [Methods] First, we constructed CGNE_Onto, a knowledge base for the evolution of Chinese geographical names. Second, we defined strong and weak marker words to identify sentences describing name evolution in the historical data. Third, we used the BERT-BiLSTM-CRF model to recognize time and place-name entities in those sentences. Fourth, we used the newly identified entities as classes to build the ontology knowledge base, which we visualized in terms of direct and indirect path relationships. Finally, we analyzed the number of, and reasons for, each evolution type in each dynasty. [Results] The proposed model intuitively demonstrated the evolution of geographical names and suggested new directions for analyzing geographical name data. [Limitations] The experimental data set needs to be expanded to improve the quality of the evolution feature words. [Conclusions] The knowledge base clearly shows the historical evolution of place names and the evolution types in different dynasties.
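As a rough illustration of the marker-word step in [Methods], the sketch below filters candidate evolution sentences from raw text. It is not the authors' implementation: the marker lists, the time-cue pattern, and the rule that a weak marker only counts when a time cue is also present are all assumptions made for this example, since the abstract does not give those details.

```python
import re

# Hypothetical marker words; the paper's actual lists are not given in the abstract.
STRONG_MARKERS = ("改名", "更名", "改称")  # e.g. "renamed"
WEAK_MARKERS = ("置", "废", "并入")        # e.g. "established", "abolished", "merged into"

# Assumed time cue: a Western-style year or a Chinese-numeral regnal year.
TIME_CUE = re.compile(r"\d{3,4}年|[元一二三四五六七八九十]+年")


def split_sentences(text):
    """Split text on Chinese sentence-ending punctuation, dropping empty pieces."""
    return [s.strip() for s in re.split(r"[。！？；]", text) if s.strip()]


def is_evolution_sentence(sentence):
    """Assumed rule: a strong marker suffices; a weak marker also needs a time cue."""
    if any(marker in sentence for marker in STRONG_MARKERS):
        return True
    has_weak = any(marker in sentence for marker in WEAK_MARKERS)
    return has_weak and TIME_CUE.search(sentence) is not None


def filter_evolution_sentences(text):
    """Return only the sentences that look like name-evolution statements."""
    return [s for s in split_sentences(text) if is_evolution_sentence(s)]
```

Under these assumptions, a sentence such as "城中设置驿站" contains the weak marker "置" but no time cue, so it is rejected. The sentences that pass the filter would then go to an entity recognizer such as the BERT-BiLSTM-CRF model named in the abstract.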
Li Xiaomin, Wang Hao, Li Yueyan, Zhao Meng. Constructing Knowledge Base for Chinese Geographical Name[J]. Data Analysis and Knowledge Discovery, 2022, 6(11): 139-153.