Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (6): 57-65    DOI: 10.11925/infotech.2096-3467.2018.1159
Discovering Domain Vocabularies Based on Citation Co-word Network
Qikai Cheng, Jiamin Wang, Wei Lu
(School of Information Management, Wuhan University, Wuhan 430072, China); (Information Retrieval and Knowledge Mining Laboratory, Wuhan University, Wuhan 430072, China)

[Objective] This paper identifies the basic vocabulary of a specific domain from academic papers, to help grasp the domain's knowledge structure and development context. [Methods] We combined the citation network with co-word analysis to construct a citation co-word network, then used the PageRank algorithm to evaluate the importance of the candidate words. We tested the proposed method on 110,360 computer science articles. [Results] We compared the proposed method with the word frequency method and co-word analysis, both qualitatively and quantitatively. The proposed method performed well, reaching an average precision of 72.6% in a blind selection experiment. [Limitations] The proposed method was only examined with computer science articles. [Conclusions] The proposed method could improve basic vocabulary discovery in a specific domain.
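
The abstract does not spell out how the citation co-word network is built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that every keyword of a citing paper points to every keyword of the cited paper, that edge weights are simple counts, and that candidate words are ranked with a standard weighted PageRank using a damping factor of 0.85. The paper IDs, keywords, and function names are hypothetical toy values chosen for illustration.

```python
from collections import defaultdict
from itertools import product


def build_citation_coword_edges(paper_keywords, citations):
    """Count directed keyword-to-keyword links induced by citations.

    Assumption (not taken from the paper): each keyword of a citing
    paper points to each keyword of the paper it cites.
    """
    weights = defaultdict(float)
    for citing, cited in citations:
        for src, dst in product(paper_keywords[citing], paper_keywords[cited]):
            if src != dst:  # ignore self-loops
                weights[(src, dst)] += 1.0
    return dict(weights)


def pagerank(weights, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration.

    Rank held by nodes without outgoing edges (dangling nodes) is
    redistributed uniformly so that the scores keep summing to 1.
    """
    nodes = sorted({node for edge in weights for node in edge})
    out_weight = defaultdict(float)
    for (src, _), w in weights.items():
        out_weight[src] += w

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), w in weights.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical toy corpus: three papers and their keywords.
paper_keywords = {
    "p1": ["neural network", "image classification"],
    "p2": ["support vector machine", "image classification"],
    "p3": ["neural network", "backpropagation"],
}
# (citing paper, cited paper) pairs.
citations = [("p1", "p3"), ("p2", "p3"), ("p1", "p2")]

edges = build_citation_coword_edges(paper_keywords, citations)
ranks = pagerank(edges)

# List candidate words from most to least important.
for word, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.4f}")
```

In this toy setting, words that appear in frequently cited papers accumulate rank from the words of the papers citing them, which is the intuition behind scoring candidate words on a citation-based network rather than by raw frequency alone.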

Key words: Basic Vocabulary; Citation Co-word Network; PageRank; Word Frequency; Co-word Analysis
Received: 19 October 2018      Published: 15 August 2019

Cite this article:

Qikai Cheng, Jiamin Wang, Wei Lu. Discovering Domain Vocabularies Based on Citation Co-word Network. Data Analysis and Knowledge Discovery, 2019, 3(6): 57-65.

