Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (4): 82-96     https://doi.org/10.11925/infotech.2096-3467.2021.0886
Research Article
Constructing Knowledge Graph for Business Environment*
Liu Kan, Xu Qinya, Yu Lu
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
Abstract

[Objective] This paper constructs a knowledge graph for the business environment in order to improve the utilization of business environment information resources, uncover the entity relationships among the factors that drive business environment development, and support decision analysis. [Methods] Using Beijing's business environment policy texts as the dataset, we propose a knowledge extraction method that integrates dependency parsing and semantic role labeling. We then build a combined-model classifier to filter entity-relation triples, and compute semantic similarity to fuse and align relation names. Finally, we design experiments to identify the main factors that affect the TransR model's link prediction performance in the business environment domain, together with strategies for adjusting them, and use the model to complete knowledge reasoning. [Results] The constructed knowledge graph contains 31,955 distinct entities, 1,847 relation types, and 45,682 triples. It is stored and visualized with Neo4j and Gephi, and supports knowledge queries written in Cypher. [Limitations] Because the context of business environment texts is complex, further research is needed on modeling entities with ambiguous references, so as to improve knowledge extraction from policy texts and, in turn, the quality of the graph's triples. [Conclusions] The knowledge graph reveals the associations among knowledge in the business environment domain and provides a scientific basis for building business environment question-answering systems, integrating and reshaping government business processes, and making decisions to optimize the business environment.

Keywords: Business Environment; Knowledge Graph; Knowledge Extraction; Relationship Alignment; Link Prediction
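Received: 2021-08-23      Published: 2022-05-12
Chinese Library Classification (ZTFLH):  TP391
Funding: *Interdisciplinary Innovation Research Project under the Fundamental Research Funds for the Central Universities (2722020JX007); Master's Student Practice and Innovation Project of Zhongnan University of Economics and Law (202151420)
Corresponding author: Liu Kan, ORCID: 0000-0002-9686-9768     E-mail: liukan@zuel.edu.cn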
Cite this article:
Liu Kan, Xu Qinya, Yu Lu. Constructing Knowledge Graph for Business Environment[J]. Data Analysis and Knowledge Discovery, 2022, 6(4): 82-96.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.0886      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I4/82
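Fig.1  Business environment data collection
Fig.2  Construction workflow of the business environment knowledge graph
Label | Meaning
A0 | Agent
A1 | Patient
A2 | Scope of influence
A3 | Start of the action
A4 | End of the action
A5 | Other verb-related argument
Table 1  Core semantic roles
Fig.3  Example of semantic role labeling
Fig.4  Example of dependency parsing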
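As a rough illustration of how the core semantic roles in Table 1 can yield a triple, the sketch below pairs the agent (A0) and the patient (A1) of a predicate. The `SrlFrame` structure, the `frame_to_triple` helper, and the example annotation are hypothetical and not taken from the paper. The paper's method also draws on dependency parsing, which is not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class SrlFrame:
    """One predicate with its labeled arguments (hypothetical structure)."""
    predicate: str
    roles: Dict[str, str] = field(default_factory=dict)  # e.g. {"A0": ..., "A1": ...}


def frame_to_triple(frame: SrlFrame) -> Optional[Tuple[str, str, str]]:
    """Return (agent, predicate, patient) when both A0 and A1 are present."""
    agent = frame.roles.get("A0")
    patient = frame.roles.get("A1")
    if not agent or not patient:
        return None  # incomplete frames are skipped in this sketch
    return (agent, frame.predicate, patient)


# Hypothetical annotation modeled on row 3 of Table 3:
# construction contractor / submits / self-assessment materials.
example = SrlFrame(
    predicate="提交",
    roles={"A0": "施工单位", "A1": "工程项目安全生产标准化自评材料"},
)
print(frame_to_triple(example))
```

Frames that lack either role are dropped rather than guessed at, which keeps the sketch conservative; a fuller pipeline would presumably fall back on syntactic relations in those cases.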
Model | Precision | Recall | F-score
MLP | 0.841 | 0.854 | 0.848
SVM | 0.787 | 0.906 | 0.843
FastText | 0.796 | 0.867 | 0.830
Combined model | 0.817 | 0.901 | 0.857
Table 2  Experimental results of the classification models
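The F-score column of Table 2 can be cross-checked against the precision and recall columns. The sketch below assumes the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the page does not state explicitly; the function name `f_score` is ours.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from Table 2
table2 = {
    "MLP": (0.841, 0.854, 0.848),
    "SVM": (0.787, 0.906, 0.843),
    "FastText": (0.796, 0.867, 0.830),
    "Combined model": (0.817, 0.901, 0.857),
}

for model, (p, r, reported) in table2.items():
    print(f"{model}: computed={f_score(p, r):.3f} reported={reported:.3f}")
```

Under this assumption the recomputed values agree with the reported ones to within about 0.001, a gap that is plausibly due to the precision and recall values themselves being rounded.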
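Fig.5  Combined-model classifier
No. | Triple
1 | (civil air defense department, proposes, civil air defense engineering design conditions)
2 | (municipal and district housing and urban-rural development departments, establish, ledger recording enterprises' breach-of-commitment dishonesty information)
3 | (construction contractor, submits, self-assessment materials for work safety standardization of the engineering project)
4 | (Municipal Commerce Bureau, applies for, investment subsidy)
5 | (party concerned, handles, movable property security business)
6 | (Municipal Commission of Housing and Urban-Rural Development, publishes, document checklist)
Table 3  Knowledge filtering results (partial)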
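Since the graph is stored in Neo4j and queried with Cypher, filtered triples such as those in Table 3 could be loaded with parameterized `MERGE` statements. The sketch below only builds the query text and parameter maps. The `Entity` label, the `REL` relationship type, and the `name` property are hypothetical schema choices, not the paper's, and executing the statements would require a Neo4j driver session that is not shown. The triples are the original Chinese strings of rows 4 to 6.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]

# Parameterized Cypher: MERGE avoids duplicating nodes or edges on re-import.
# The `Entity` label and `REL` relationship type are hypothetical choices.
MERGE_TRIPLE = (
    "MERGE (h:Entity {name: $head}) "
    "MERGE (t:Entity {name: $tail}) "
    "MERGE (h)-[:REL {name: $relation}]->(t)"
)

# Look up everything a given head entity is linked to.
QUERY_BY_HEAD = (
    "MATCH (h:Entity {name: $head})-[r:REL]->(t:Entity) "
    "RETURN r.name AS relation, t.name AS tail"
)


def triple_params(triple: Triple) -> Dict[str, str]:
    """Turn a (head, relation, tail) tuple into a Cypher parameter map."""
    head, relation, tail = triple
    return {"head": head, "relation": relation, "tail": tail}


triples = [
    ("市商务局", "申请", "投资补助"),          # Municipal Commerce Bureau / applies for / investment subsidy
    ("当事人", "办理", "动产担保业务"),        # party concerned / handles / movable property security business
    ("市住房城乡建设委", "发布", "资料清单"),  # Municipal Commission / publishes / document checklist
]

for t in triples:
    print(MERGE_TRIPLE, triple_params(t))
```

Storing the relation name as a property rather than as the relationship type sidesteps Cypher's restriction that relationship types cannot be passed as query parameters, at the cost of less type-specific queries.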
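Before fusion | After fusion
反馈, 反馈向, 反馈至, 反馈给 | 反馈 (feed back)
提升, 提高, 增强, 提高对, 提高到, 提升至 | 提升 (improve)
实施, 实施根据, 实施对, 实施依照, 实施将, 实施在, 实施向, 实施至 | 实施 (implement)
……
Table 4  Examples of relation names before and after alignment (names kept in the original Chinese)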
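The grouping shown in Table 4 can be sketched as greedy clustering over a pairwise similarity function. The paper uses semantic similarity; the stand-in below uses character-level similarity from Python's `difflib` purely so the example runs without a trained embedding model. The function names and the 0.6 threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List


def char_similarity(a: str, b: str) -> float:
    """Surface similarity in [0, 1]; a stand-in for a semantic measure."""
    return SequenceMatcher(None, a, b).ratio()


def fuse_relations(
    names: List[str],
    similarity: Callable[[str, str], float] = char_similarity,
    threshold: float = 0.6,  # assumed value, not from the paper
) -> Dict[str, List[str]]:
    """Greedily group names; each group is keyed by its canonical name."""
    groups: Dict[str, List[str]] = {}
    # Visit shorter names first so the bare verb becomes the canonical form.
    for name in sorted(names, key=len):
        for canonical in groups:
            if similarity(name, canonical) >= threshold:
                groups[canonical].append(name)
                break
        else:
            groups[name] = [name]  # no close match: start a new group
    return groups


names = ["反馈向", "反馈", "反馈至", "反馈给", "实施对", "实施", "实施将"]
print(fuse_relations(names))
```

Character overlap is enough to attach variants such as 反馈向 to 反馈, but it cannot merge true synonyms: 提升 and 增强 share no characters, so a surface measure scores them at zero. Grouping those, as Table 4 does, requires passing a genuinely semantic `similarity` function in place of `char_similarity`.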
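Fig.6  Knowledge embedding diagram
Model | Parameter | Value
TransR | Embedding dimension | 500
TransR | Number of iterations | 200
TransR | Margin hyperparameter | 4.0
TransR | Negative sampling method | Probability sampling
TransR | Number of negative samples | 25
TransR | Learning rate | 0.001
Table 5  Model parameter settings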
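For orientation, the standard TransR scoring function projects the head and tail entity vectors into a relation-specific space with a matrix M_r and scores a triple by the squared L2 norm of h_r + r - t_r, trained with a margin-based ranking loss. The pure-Python sketch below follows that standard formulation rather than the authors' code. The toy vectors are made up, and only the margin of 4.0 comes from Table 5.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]  # shape: entity_dim x relation_dim


def project(entity: Vector, m_r: Matrix) -> List[float]:
    """Map an entity vector into the relation space: e_r = e * M_r."""
    cols = len(m_r[0])
    return [sum(entity[i] * m_r[i][j] for i in range(len(entity))) for j in range(cols)]


def transr_score(h: Vector, r: Vector, t: Vector, m_r: Matrix) -> float:
    """Squared L2 norm of h_r + r - t_r; lower means more plausible."""
    h_r, t_r = project(h, m_r), project(t, m_r)
    return sum((hv + rv - tv) ** 2 for hv, rv, tv in zip(h_r, r, t_r))


def margin_loss(pos: float, neg: float, margin: float = 4.0) -> float:
    """Hinge loss for one positive/negative pair (margin 4.0 as in Table 5)."""
    return max(0.0, margin + pos - neg)


# Toy 3-d entities and a 2-d relation space; all numbers are made up.
m_r = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
h, r = [1.0, 2.0, 5.0], [1.0, -1.0]
t_true, t_corrupt = [2.0, 1.0, 9.0], [5.0, 5.0, 0.0]

pos = transr_score(h, r, t_true, m_r)
neg = transr_score(h, r, t_corrupt, m_r)
print(pos, neg, margin_loss(pos, neg))
```

Link prediction then amounts to scoring every candidate tail for a given head and relation and ranking the candidates by ascending score. The negative samples used in training are corrupted triples like `t_corrupt` above, drawn 25 per positive triple under the settings in Table 5.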
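Fig.7  Effects of different influencing factors on business environment link prediction
Fig.8  Effects of negative sampling method and number of negative samples on business environment link prediction
Head entity | Relation | Predicted results
Quality and safety supervision agency | related | Municipal Commission of Housing and Urban-Rural Development; district housing and construction departments; application materials
Municipal Commission of Housing and Urban-Rural Development | announces | the city's checklist of documents for ensuring safe construction; document checklist; telephone number
Beijing Public Resources Trading construction engineering sub-platform | related | operation status report; construction content; archiving of administrative approval electronic documents
Table 6  Link prediction results
Fig.9  Business environment knowledge graph
Fig.10  Business environment knowledge graph (partial)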