New Technology of Library and Information Service (现代图书情报技术), 2016, Vol. 32, Issue 7-8: 87-93. DOI: 10.11925/infotech.1003-3513.2016.07.11
Finding Semantic Relations Among Subject Indexed Papers*
Li Xiaoying (李晓瑛), Xia Guanghui (夏光辉), Li Danya (李丹亚)
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract

[Objective] To discover important implicit semantic relations using the subject indexing results of papers. [Methods] Based on subject-indexed biomedical papers in the MEDLINE database, we propose a semantic relation discovery algorithm that combines subject heading coordination principles, subject indexing rules, and optimization based on weighted indexing terms and relation occurrence frequency. [Results] The algorithm was validated on experimental data on diseases and symptoms and reviewed by domain experts; the precision of the discovered semantic relations exceeded 95%. [Limitations] The algorithm applies only to papers that have subject indexing results. [Conclusions] Discovering semantic relations in both Chinese and English from large-scale subject-indexed biomedical literature is effective and feasible, and the approach offers a useful reference for semantic relation discovery in other fields.

Key words: Finding semantic relations; Indexed papers; Coordinating rules; Threshold
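The abstract mentions filtering candidate relations by weighted indexing terms and relation occurrence frequency, but gives no formulas. The following is only a minimal sketch of that general idea, not the authors' algorithm: the toy records, the `disease`/`symptom` categories, the weights `MAJOR_WEIGHT` and `MINOR_WEIGHT`, the function `discover_relations`, and both thresholds are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: each paper is a list of
# (heading, category, is_major) tuples. Not data from the paper.
RECORDS = [
    [("Influenza, Human", "disease", True),
     ("Fever", "symptom", True),
     ("Cough", "symptom", False)],
    [("Influenza, Human", "disease", True),
     ("Fever", "symptom", False)],
    [("Influenza, Human", "disease", False),
     ("Headache", "symptom", False)],
    [("Asthma", "disease", True),
     ("Cough", "symptom", True)],
]

MAJOR_WEIGHT = 2.0  # assumed weight for a weighted (major-topic) indexing term
MINOR_WEIGHT = 1.0  # assumed weight for any other indexing term


def term_weight(is_major):
    """Return the assumed weight of one indexing term."""
    return MAJOR_WEIGHT if is_major else MINOR_WEIGHT


def discover_relations(records, min_count=2, min_score=3.0):
    """Return disease-symptom pairs that pass both thresholds.

    counts: number of papers in which the pair is co-indexed.
    scores: sum over those papers of the product of the two term weights.
    """
    counts = Counter()
    scores = Counter()
    for headings in records:
        diseases = [(h, m) for h, c, m in headings if c == "disease"]
        symptoms = [(h, m) for h, c, m in headings if c == "symptom"]
        for (d, d_major), (s, s_major) in product(diseases, symptoms):
            counts[(d, s)] += 1
            scores[(d, s)] += term_weight(d_major) * term_weight(s_major)
    return sorted(
        pair for pair in counts
        if counts[pair] >= min_count and scores[pair] >= min_score
    )


if __name__ == "__main__":
    for disease, symptom in discover_relations(RECORDS):
        print(f"{disease} -> {symptom}")
```

Requiring both a minimum co-occurrence count and a minimum weighted score is one plausible way to trade recall for precision; lowering either threshold would admit pairs that are co-indexed only once or only as minor topics.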
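Received: 2016-03-09
Funding: *This work is one of the research outputs of the National Social Science Foundation of China project "Research on the Construction of a Public Health Knowledge Network Based on Complex Networks" (Grant No. 15CTQ020) and the Fundamental Research Funds for Central Public Welfare Research Institutes project "Research on Key Issues in Building a Biomedical Terminology Service System" (Grant No. 15R0109).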
Cite this article:
Li Xiaoying, Xia Guanghui, Li Danya. Finding Semantic Relations Among Subject Indexed Papers[J]. New Technology of Library and Information Service, 2016, 32(7-8): 87-93. DOI: 10.11925/infotech.1003-3513.2016.07.11.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.07.11