New Technology of Library and Information Service  2016, Vol. 32 Issue (4): 81-90    DOI: 10.11925/infotech.1003-3513.2016.04.10
  Research Paper
Similarity Measurement of Research Interests in Semantic Network*
Ba Zhichao1,2, Li Gang1, Zhu Shiwei2
1School of Information Management, Wuhan University, Wuhan 430072, China
2Information Research Institute of Shandong Academy of Sciences, Ji’nan 250014, China
Abstract

[Objective] This study aims to identify relationships among authors whose research content is similar but who use different keywords, and to address the lack of semantic association in traditional co-occurrence analysis. [Methods] We propose a method for measuring the similarity of authors' research interests based on a keyword semantic network. First, author keywords are represented as word vectors using word2vec, a neural network language model, which maps each keyword to a low-dimensional, real-valued distribution at the semantic level. Second, we calculate the semantic relatedness between keywords and construct a keyword semantic network. Finally, we use the Jensen-Shannon (JS) distance to measure the similarity of the resulting author research-interest matrices. [Results] The proposed approach computes the relatedness of both co-occurring and non-co-occurring keyword pairs and effectively uncovers potential collaboration among authors. [Limitations] The size and accuracy of the training corpus need to be improved. The proposed measure considers potential collaboration between only two authors at a time; more research is needed to explore collaboration among multiple authors. [Conclusions] The proposed method offers a useful reference for improving measures of author collaboration based on traditional co-occurrence analysis.
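
The pipeline outlined in the abstract (keyword vectors, a keyword semantic network, then JS distance between authors) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the function names `cosine`, `build_semantic_network`, and `js_distance` are hypothetical, the three-dimensional vectors are hand-written stand-ins for vectors that a trained word2vec model would supply, the 0.5 edge threshold is arbitrary, and each author's interests are taken to be a simple probability distribution over a shared keyword list, since the abstract does not specify how the research-interest matrix is built.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_semantic_network(vectors, threshold=0.5):
    """Return {(kw_a, kw_b): weight} for keyword pairs with cosine similarity >= threshold.

    `vectors` maps each keyword to its embedding; the threshold is an assumed cutoff.
    """
    keywords = sorted(vectors)
    edges = {}
    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            weight = cosine(vectors[a], vectors[b])
            if weight >= threshold:
                edges[(a, b)] = weight
    return edges


def _kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base-2 logs, range 0 to 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding error


if __name__ == "__main__":
    # Hypothetical toy embeddings; real ones would come from a trained word2vec model.
    toy_vectors = {
        "co-word analysis": [0.9, 0.1, 0.0],
        "co-occurrence analysis": [0.8, 0.2, 0.1],
        "word2vec": [0.1, 0.9, 0.2],
    }
    print(build_semantic_network(toy_vectors))

    # Hypothetical interest distributions of two authors over the same three keywords.
    author_a = [0.5, 0.5, 0.0]
    author_b = [0.0, 0.5, 0.5]
    print(js_distance(author_a, author_b))
```

A smaller JS distance indicates more similar interest distributions. Because the keyword network links terms that are close in vector space even when they never co-occur, authors who use different but related keywords can still be compared, which is the gap in co-occurrence analysis that the abstract describes.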

Key words: Author network    Neural network language model    Semantic similarity    Matrix of research interests
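Received: 2015-12-02     
Funding: *This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Dynamic Evolution of Scientific Research Teams" (Grant No. 71273196), the Shandong Province Key Research and Development Program project "Research and Application of Key Technologies for a Customizable Big Data Knowledge Service Platform" (Grant No. 2015GGX101037), and the Shandong Academy of Sciences Youth Fund project "Research on Key Technologies of Ontology-Annotation-Based Mining Methods for Scientific and Technical Documents" (Grant No. 2013QN036).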
Cite this article:   
Ba Zhichao, Li Gang, Zhu Shiwei. Similarity Measurement of Research Interests in Semantic Network[J]. New Technology of Library and Information Service, 2016, 32(4): 81-90. DOI: 10.11925/infotech.1003-3513.2016.04.10.
Link to this article:  
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.04.10