Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (4): 123-133    DOI: 10.11925/infotech.2096-3467.2020.0794
Extraction and Representation of Domain Knowledge with Semantic Description Model and Knowledge Elements——Case Study of Information Retrieval
Shi Xiang, Liu Ping
School of Information Management, Wuhan University, Wuhan 430072, China
Abstract  

[Objective] This paper tries to extract and integrate domain knowledge from heterogeneous data based on knowledge elements, aiming to enrich the semantic information of knowledge representation. [Methods] We proposed a new method to extract and represent knowledge based on the semantic description model with knowledge elements. Then, we examined our model in the field of information retrieval. [Results] We extracted 4,200 knowledge elements and 3,020 entities on information retrieval from Wikipedia and two classic textbooks. We could query the relationship between knowledge elements and their entities. [Limitations] The semantic relations among knowledge elements were not adequately explored, and the process of knowledge extraction was not fully automated. [Conclusions] This paper improves the semantics of knowledge representation, and provides new perspectives for domain knowledge service.

Key words: Knowledge Element; Semantic Description Model; Knowledge Extraction; Knowledge Representation
Received: 17 August 2020      Published: 24 November 2020
CLC number (ZTFLH): TP182
Fund: National Natural Science Foundation of China (71573196)
Corresponding Authors: Liu Ping     E-mail: pliuleeds@126.com

Cite this article:

Shi Xiang,Liu Ping. Extraction and Representation of Domain Knowledge with Semantic Description Model and Knowledge Elements——Case Study of Information Retrieval. Data Analysis and Knowledge Discovery, 2021, 5(4): 123-133.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0794     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I4/123

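Knowledge dimension | Category | Description
Knowledge connotation | Definition | Knowledge elements that state the definition of a piece of knowledge
Knowledge connotation | Idea | Knowledge elements that describe the theory and ideas behind the knowledge
Knowledge connotation | Background | Knowledge elements that describe the historical and research background of the knowledge
Knowledge connotation | Method | Knowledge elements that describe the application steps and implementation methods of the knowledge
Knowledge connotation | Case | Knowledge elements that describe applications of the knowledge
Knowledge connotation | Evaluation | Knowledge elements that describe how well the knowledge performs in application
Knowledge connotation | Resource | Knowledge elements that describe resources related to the knowledge
Knowledge presentation form | Text | Knowledge elements expressed in written language
Knowledge presentation form | Text and graphics | Knowledge elements expressed as a mixture of text and figures or tables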
Domain Knowledge Element Division
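Category | Knowledge element
Definition | A feedback loop that uses documents known to be relevant to the current query to transform the query q into an improved query q_m, in the expectation that q_m will return more documents relevant to q.
Idea | Improve the final retrieval results through user interaction during the retrieval process. The basic process is: (1) the user submits a short query; (2) the system returns an initial set of results; (3) the user marks some of the results as relevant or non-relevant; (4) the system uses this feedback to compute a better representation of the information need; (5) the system returns a new set of results for the revised query.
Background | Users' information needs are often unclear and users are often unfamiliar with the retrieval environment, so the queries they construct are short and vague and do not fully express what they actually need. As a result, the precision and recall of the retrieval system are not high enough. To address this problem, researchers proposed relevance feedback as a way to construct higher-quality queries, narrowing the gap between the query and the user's actual need so that the results satisfy that need as closely as possible.
Method | The SMART system introduced by Salton in the 1970s incorporated a relevance feedback algorithm, the Rocchio algorithm, which became widely used. In a real retrieval scenario, given a user query and partial knowledge of which documents are relevant and non-relevant, the modified query vector q_m is obtained as $\vec{q}_m = \alpha \vec{q}_0 + \beta \frac{1}{\lvert D_r \rvert} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \frac{1}{\lvert D_{nr} \rvert} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j$, where $D_r$ and $D_{nr}$ are the sets of known relevant and non-relevant documents.
Case | Image search is an example of relevance feedback: the returned results are easy to judge visually, and although users find it hard to express their need in words, they can easily mark image results as relevant or non-relevant. See the demonstration at https://nlp.stanford.edu/IR-book/html/htmledition/relevance-Feedback-and-pseudo-relevance-feedback-1.html
Evaluation | First compute the precision-recall curve for the original query q; after one round of relevance feedback, compute the modified query q_m and then compute a new precision-recall curve.
Resource | Spink A, Losee R M. Feedback in Information Retrieval[J]. Annual Review of Information Science & Technology, 1996, 31: 33-78.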
Knowledge Content Classification of Relevance Feedback Knowledge Element
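The Rocchio update in the Method row can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the function names `centroid` and `rocchio`, the dense-list vector representation, and the default weights (alpha = 1.0, beta = 0.75, gamma = 0.15) are assumptions made for the example. Negative term weights are left as computed; some systems clip them to zero.

```python
from typing import List, Sequence


def centroid(docs: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Mean of the document vectors; the zero vector if the set is empty."""
    if not docs:
        return [0.0] * dim
    return [sum(d[i] for d in docs) / len(docs) for i in range(dim)]


def rocchio(
    q0: Sequence[float],
    relevant: Sequence[Sequence[float]],
    nonrelevant: Sequence[Sequence[float]],
    alpha: float = 1.0,   # weight of the original query (assumed default)
    beta: float = 0.75,   # weight of the relevant centroid (assumed default)
    gamma: float = 0.15,  # weight of the non-relevant centroid (assumed default)
) -> List[float]:
    """Return q_m = alpha*q0 + beta*centroid(D_r) - gamma*centroid(D_nr)."""
    dim = len(q0)
    rel = centroid(relevant, dim)
    non = centroid(nonrelevant, dim)
    return [alpha * q0[i] + beta * rel[i] - gamma * non[i] for i in range(dim)]


# Usage: one query, two documents marked relevant, one marked non-relevant.
q_m = rocchio(
    [1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    nonrelevant=[[0.0, 0.0, 1.0]],
)
print(q_m)
```

The modified query moves toward the centroid of the documents marked relevant and away from the centroid of those marked non-relevant, which corresponds to steps (3) to (5) of the Idea row.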
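Description dimension | Content | Example
Property features | Knowledge element statement | Cosine similarity is a measure of similarity between two non-zero vectors in an inner product space, defined as the cosine of the angle between the two vectors
Property features | Knowledge element identifier | /definition/text/10
Property features | Knowledge connotation | Definition
Property features | Knowledge presentation form | Text
Semantic structure | Concept set | 余弦相似度 (cosine similarity)
Semantic structure | Knowledge triple | (相似度计算, 子概念, 余弦相似度), i.e. (similarity computation, sub-concept, cosine similarity)
Semantic structure | Term-to-concept mapping | 余弦相似性 → 余弦相似度 (a variant term for cosine similarity mapped to the canonical concept name)
Resource attributes | Knowledge carrier | Introduction to Information Retrieval (《信息检索导论》)
Resource attributes | Resource identifier | /textbook/pdf/15
Resource attributes | Type | PDF
Resource attributes | Source | https://nlp.stanford.edu/IR-book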
Example of Cosine Similarity Definition Knowledge Element
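The three description dimensions in this example map naturally onto a record type. The sketch below is one possible encoding, not the schema used by the authors: the class name `KnowledgeElement`, its field names, and the helper `elements_for_concept` are hypothetical, and only the field values are taken from the table.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class KnowledgeElement:
    # Property features
    statement: str
    identifier: str
    connotation: str           # e.g. definition, idea, background, method
    form: str                  # text, or text and graphics
    # Semantic structure
    concepts: List[str]
    triples: List[Tuple[str, str, str]]
    term_to_concept: Dict[str, str]
    # Resource attributes
    carrier: str
    resource_id: str
    resource_type: str
    source: str


cosine = KnowledgeElement(
    statement=(
        "Cosine similarity is a measure of similarity between two non-zero "
        "vectors in an inner product space, defined as the cosine of the "
        "angle between the two vectors"
    ),
    identifier="/definition/text/10",
    connotation="definition",
    form="text",
    concepts=["余弦相似度"],
    triples=[("相似度计算", "子概念", "余弦相似度")],
    term_to_concept={"余弦相似性": "余弦相似度"},
    carrier="Introduction to Information Retrieval",
    resource_id="/textbook/pdf/15",
    resource_type="PDF",
    source="https://nlp.stanford.edu/IR-book",
)


def elements_for_concept(
    elements: List[KnowledgeElement], concept: str
) -> List[KnowledgeElement]:
    """Return the knowledge elements whose concept set contains `concept`."""
    return [e for e in elements if concept in e.concepts]


# Usage: resolve a variant term to its concept, then look up its elements.
canonical = cosine.term_to_concept["余弦相似性"]
print([e.identifier for e in elements_for_concept([cosine], canonical)])
```

Keeping the term-to-concept mapping on the record lets a query phrased with a variant term be resolved to the canonical concept before the lookup.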
Example of Knowledge Element Model
Knowledge Extraction and Representation Based on Knowledge Element Semantic Description Model
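Knowledge element type | Rule template | English gloss of the trigger words
Definition | *【称为|称之为|定义为|叫做|称|叫|定义是】【::】?<概念>* | "is called", "is referred to as", "is defined as", "is named", "the definition is", then an optional colon, then <concept>
Idea | <概念>【的思想|的主要思想|基于】* | <concept>, then "the idea of", "the main idea of", "based on"
Background | *【年】?【提出|应用|研究】<概念>* | optional "year", then "proposed", "applied", "studied", then <concept>
Method | <概念>*【步骤|方法|过程|公式|代码|算法】 | <concept>, then "steps", "method", "process", "formula", "code", "algorithm"
Evaluation | <概念>*【相较于|相比|优点|缺点|问题】 | <concept>, then "compared with", "in comparison", "advantage", "disadvantage", "problem"
Resource | <概念>*【文献|工具|会议|课程】 | <concept>, then "literature", "tool", "conference", "course"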
Knowledge Element Matching Rules (Partial)
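As an illustration of how such templates could be applied, the sketch below translates them into Python regular expressions. The reading of the notation is an assumption: `*` is taken as arbitrary text, 【a|b】 as a choice among trigger words, `?` as optional, and <概念> as the slot for the concept name. The names `RULES` and `classify` are hypothetical, and `re.search` is used so that the leading and trailing `*` need not be spelled out.

```python
import re
from typing import List

# Assumed regex translations of the partial rule templates.
# "<C>" is a placeholder for the (escaped) concept name.
RULES = {
    "definition": r"(?:称为|称之为|定义为|叫做|称|叫|定义是)[::]?<C>",
    "idea": r"<C>(?:的思想|的主要思想|基于)",
    "background": r"年?(?:提出|应用|研究)<C>",
    "method": r"<C>.*(?:步骤|方法|过程|公式|代码|算法)",
    "evaluation": r"<C>.*(?:相较于|相比|优点|缺点|问题)",
    "resource": r"<C>.*(?:文献|工具|会议|课程)",
}


def classify(sentence: str, concept: str) -> List[str]:
    """Return every knowledge element type whose rule matches the sentence."""
    escaped = re.escape(concept)
    return [
        kind
        for kind, template in RULES.items()
        if re.search(template.replace("<C>", escaped), sentence)
    ]


# Usage: candidate sentences about the concept 相关反馈 (relevance feedback).
for s in (
    "这种反馈循环称为相关反馈。",
    "相关反馈的主要思想是通过用户交互提高检索效果。",
    "相关反馈常用Rocchio算法实现。",
):
    print(s, classify(s, "相关反馈"))
```

A sentence can match more than one rule, so the function returns a list; deciding between competing types would need an extra step that the partial rule table does not specify.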
Sketch Map of Knowledge Representation Method
Examples of Domain Knowledge Representation in Information Retrieval
Search Interface of Knowledge Element Retrieval System
Search Result of Knowledge Element Retrieval System