New Technology of Library and Information Service  2013, Vol. 29 Issue (11): 60-67    DOI: 10.11925/infotech.1003-3513.2013.11.09
Research of Co-word Analysis Method of Combining Keywords Extension and Domain Ontology
Tang Xiaobo, Xiao Lu
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
Abstract  This paper proposes a new co-word analysis process model that addresses deficiencies in traditional co-word analysis. The model improves on traditional methods in two respects. First, because indexing keywords cannot fully describe the topic of a paper, they are supplemented: high-frequency words among the indexing keywords are chosen to build a supplementary dictionary, candidate keywords are extracted from each paper's title by word segmentation based on that dictionary, and the indexing keywords are then supplemented with these candidates according to certain rules. Second, because co-occurrence frequency alone cannot accurately describe the similarity between two words, a domain ontology is introduced to calculate the semantic similarity of the high-frequency keywords. Co-occurrence frequency and semantic similarity are then combined into a correlation measure that describes word similarity and serves as the basis for building the co-word matrix. Finally, experiments demonstrate the effectiveness of the improved method.
Key words: Co-word analysis; Extension dictionary; Domain ontology
Received: 29 July 2013      Published: 29 November 2013
CLC Number: TP391
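
The second improvement described in the abstract, combining co-occurrence frequency with ontology-based semantic similarity, can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the co-occurrence part uses the equivalence index commonly used in co-word analysis, the semantic part uses a simple path-length similarity over a toy child-to-parent ontology, and the two are mixed linearly with a hypothetical weight `alpha`. The function names, the sample keyword sets, and the toy ontology are all invented for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_similarity(docs, terms):
    """Equivalence index E_ij = c_ij^2 / (c_i * c_j) over keyword sets."""
    term_set = set(terms)
    freq, pair_count = Counter(), Counter()
    for keywords in docs:
        present = sorted(set(keywords) & term_set)
        freq.update(present)
        pair_count.update(combinations(present, 2))
    sim = {}
    for a, b in combinations(sorted(term_set), 2):
        c = pair_count[(a, b)]
        sim[(a, b)] = c * c / (freq[a] * freq[b]) if c else 0.0
    return sim


def ancestors(term, parent):
    """Chain from a term up to the ontology root (term itself first)."""
    chain = [term]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def semantic_similarity(a, b, parent):
    """Assumed path-based similarity: 1 / (1 + path length via the
    lowest common ancestor). Returns 0.0 if no common ancestor exists."""
    chain_a, chain_b = ancestors(a, parent), ancestors(b, parent)
    common = [t for t in chain_a if t in chain_b]
    if not common:
        return 0.0
    lca = common[0]
    distance = chain_a.index(lca) + chain_b.index(lca)
    return 1.0 / (1.0 + distance)


def correlation(docs, terms, parent, alpha=0.5):
    """Assumed linear mix of co-occurrence and semantic similarity."""
    co = cooccurrence_similarity(docs, terms)
    return {
        pair: alpha * s + (1 - alpha) * semantic_similarity(*pair, parent)
        for pair, s in co.items()
    }


# Hypothetical data: keyword sets of three papers and a toy ontology.
docs = [
    ["co-word analysis", "ontology"],
    ["co-word analysis", "clustering"],
    ["co-word analysis", "clustering"],
]
terms = ["clustering", "co-word analysis", "ontology"]
parent = {
    "co-word analysis": "bibliometrics",
    "clustering": "data mining",
    "ontology": "knowledge organization",
    "bibliometrics": "information science",
    "data mining": "information science",
    "knowledge organization": "information science",
}

corr = correlation(docs, terms, parent, alpha=0.5)
for pair, value in sorted(corr.items()):
    print(pair, round(value, 4))
```

In this toy data, "clustering" and "ontology" never co-occur, so a purely frequency-based co-word matrix would record no relation between them. With the semantic term added, the pair still receives a small nonzero correlation, which is the kind of gap the abstract says the domain ontology is meant to fill.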

Cite this article:

Tang Xiaobo, Xiao Lu. Research of Co-word Analysis Method of Combining Keywords Extension and Domain Ontology. New Technology of Library and Information Service, 2013, 29(11): 60-67.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.11.09     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I11/60
