Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (3): 45-59    DOI: 10.11925/infotech.2096-3467.2020.1103
Review of Keyword Extraction Studies
Hu Shaohu, Zhang Yingyi, Zhang Chengzhi
School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094, China
Abstract  

[Objective] This paper reviews the methods, features and evaluation procedures used in keyword extraction research, aiming to provide a reference for future studies. [Coverage] We searched the Web of Science, DBLP, Engineering Index, Google Scholar, CNKI and Wanfang Data with queries such as “Keyword Extraction”, “Keyword Generation”, “Keyphrase Extraction” and “Keyphrase Generation”, and retrieved a total of 89 representative papers. [Methods] First, we analyzed the development of keyword extraction techniques. Then, we summarized related studies from the perspectives of research methods, features and evaluation process. [Results] With the development of machine learning, keyword extraction methods have gradually shifted from feature-driven models to data-driven models, but they still face problems such as data labeling and evaluation criteria. [Limitations] This review focuses mainly on mainstream keyword extraction methods. [Conclusions] This paper summarizes the development trends of keyword extraction methods, as well as the disadvantages of existing evaluation mechanisms.

Key words: Keyword Extraction; Extractive Keyword Extraction; Abstractive Keyword Generation
Received: 10 October 2020      Published: 24 November 2020
ZTFLH: TP393; G250
Fund: National Natural Science Foundation of China (72074113)
Corresponding Authors: Zhang Chengzhi     E-mail: zhangcz@njust.edu.cn

Cite this article:

Hu Shaohu, Zhang Yingyi, Zhang Chengzhi. Review of Keyword Extraction Studies. Data Analysis and Knowledge Discovery, 2021, 5(3): 45-59.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1103     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I3/45

General Pattern of Unsupervised Methods
General Pattern of Keyword Classification
Sequential Tagging Pattern for Keyword Extraction
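Method type | Description | Advantages | Disadvantages
Unsupervised / Simple statistics | Extracts keywords using statistical indicators such as N-grams[11], TF-IDF[12], word frequency[6] and word co-occurrence[13] | Simple and easy to apply | Low accuracy; performance is unstable across datasets
Unsupervised / Graph structure | Ranks candidate words on a graph structure, e.g. TextRank[17], SingleRank[18], SGRank[19] | Captures the relations between candidate words | Limited accuracy; not suitable for short texts
Unsupervised / Language model | Computes the informativeness of candidate words with a language model and uses it as the measure of word importance[33] | Simple and easy to apply | Fairly subjective; lacks rigorous evaluation metrics
Supervised / Classification model / Traditional machine learning | Represents words with selected features and separates them with a model; common models include NB[34,39], SVM[35] and decision trees[47,48,49] | High extraction accuracy | Ignores the influence of context on candidate words
Supervised / Classification model / Deep learning | Uses deep learning models to separate keywords from non-keywords, e.g. MLP[50] | High extraction accuracy | Ignores the influence of context on candidate words
Supervised / Sequence labeling model / Traditional machine learning | Takes contextual information into account when deciding the label of the current word; the common model is CRF[36,55-56] | High extraction accuracy | Requires large-scale annotated corpora
Supervised / Sequence labeling model / Deep learning | Uses recurrent neural networks to label the sequence[52,53,54] | High extraction accuracy | Requires large-scale annotated corpora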
List of Keyword Extraction Methods
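As an illustration of the simple-statistics family listed in the table above, the following is a minimal sketch of TF-IDF keyword ranking. It is not the implementation used in any of the cited studies: the function names `tokenize` and `tfidf_keywords`, the three-sentence corpus, and the plain `tf * log(N / df)` weighting are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the words of one document by TF-IDF against a small corpus."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Term frequency within the target document.
    tf = Counter(tokenized[doc_index])
    total = sum(tf.values())
    scores = {
        word: (count / total) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }
    # Sort by descending score; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Hypothetical toy corpus, used only to exercise the function.
corpus = [
    "keyword extraction ranks candidate keyword phrases in a document",
    "graph ranking methods build a word graph from the document",
    "neural models generate keyphrases from the document",
]
print(tfidf_keywords(corpus, 0))
```

Words that occur in every document (here "document") receive a score of zero, which is the usual reason TF-IDF is preferred over raw word frequency. With such a small corpus, function words like "in" are not filtered out, which reflects the accuracy limitation noted in the table.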
“Encoding-Decoding” Pattern for Keyword Generation
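Feature category | Description | Examples
Statistical features | Use a statistical property of a word as a feature; frequency and length features are the most common | TF-IDF[1,34,37,40-41,70-71], logarithmic TF-IDF[42], Boolean TF-IDF[43]
Positional features | Use the position at which a word occurs in the document (or sentence); especially effective for documents with a clear hierarchical structure | First occurrence position[38,42,51,72], average occurrence position[42], last occurrence position[41,51,73]
Linguistic features | Linguistic properties of a word, usually its part of speech but also its morphological features | Part of speech[40,46], suffix[40], capitalization[16,73], bold type[16,56,73]
Other features | Method-specific features, e.g. the structure of a graph model yields features of its own | Degree centrality, closeness centrality, betweenness centrality, eigenvector centrality[75]
Other features | Features based on external resources, which supplement the original features or use indicators from external resources as features | Wikipedia entries[76], Wikipedia keyphraseness[77], search engine scores[76], eye-tracking features[78,79]
Other features | Features based on word embeddings, which represent the semantic relations between words | Word2Vec[80], GloVe[81]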
Related Feature Classification of Keyword Extraction
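To make the statistical, positional and linguistic feature categories in the table above more concrete, here is a small sketch that computes a few of them for a single candidate word. The function name `candidate_features`, the feature names, and the normalization by document length are illustrative assumptions, not definitions taken from the cited papers.

```python
import re


def candidate_features(text, candidate):
    """Compute a few simple features for one candidate word in a text.

    Returns None when the candidate does not occur in the text.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    lowered = [t.lower() for t in tokens]
    positions = [i for i, t in enumerate(lowered) if t == candidate.lower()]
    if not positions:
        return None
    n = len(tokens)
    return {
        # Statistical features
        "term_frequency": len(positions) / n,
        "length": len(candidate),
        # Positional features, normalized by the number of tokens
        "first_position": positions[0] / n,
        "mean_position": sum(positions) / len(positions) / n,
        "last_position": positions[-1] / n,
        # Linguistic (surface-form) feature: any capitalized occurrence
        "capitalized": any(tokens[i][0].isupper() for i in positions),
    }


# Hypothetical example text, used only to exercise the function.
text = "TextRank builds a graph of words. The graph ranks each word."
features = candidate_features(text, "graph")
print(features)
```

In a supervised setting, a vector of such features would be computed for every candidate and passed to a classifier; features that depend on external resources or word embeddings would need additional data and are left out of this sketch.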
[1] Turney P D. Learning Algorithms for Keyphrase Extraction[J]. Information Retrieval, 2000,2(4):303-336.
[2] Zhang Chengzhi. Review and Prospect of Automatic Indexing Research[J]. New Technology of Library and Information Service, 2007(11):33-39. (in Chinese)
[3] Zhao Jingsheng, Zhu Qiaoming, Zhou Guodong, et al. Review of Research in Automatic Keyword Extraction[J]. Journal of Software, 2017,28(9):2431-2449. (in Chinese)
[4] Liu Z, Huang W, Zheng Y, et al. Automatic Keyphrase Extraction via Topic Decomposition[C]// Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Massachusetts, USA. Association for Computational Linguistics, 2010: 366-376.
[5] Hassaïne A, Mecheter S, Jaoua A. Text Categorization Using Hyper Rectangular Keyword Extraction: Application to News Articles Classification[C]// Proceedings of the 15th International Conference on Relational and Algebraic Methods in Computer Science, Braga, Portugal. Springer, 2015,9348:312-325.
[6] Luhn H P. A Statistical Approach to Mechanized Encoding and Searching of Literary Information[J]. IBM Journal of Research and Development, 1957,1(4):309-317.
[7] Merrouni Z A, Frikh B, Ouhbi B. Automatic Keyphrase Extraction: A Survey and Trends[J]. Journal of Intelligent Information Systems, 2020,54(2):391-424.
[8] Chang Yaocheng, Zhang Yuxiang, Wan Huaiyu, et al. Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms[J]. Journal of Software, 2018,29(7):2046-2070. (in Chinese)
[9] Papagiannopoulou E, Tsoumakas G. A Review of Keyphrase Extraction[J]. Wiley Interdisciplinary Reviews Data Mining & Knowledge Discovery, 2020,10(2):e1339.
[10] Meng R, Zhao S, Han S, et al. Deep Keyphrase Generation[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada. Association for Computational Linguistics, 2017: 582-592.
[11] Cohen J D. Highlights: Language- and Domain-Independent Automatic Indexing Terms for Abstracting[J]. Journal of the American Society for Information Science, 1995,46(3):162-174.
[12] Salton G, Yang C S, Yu C T. A Theory of Term Importance in Automatic Text Analysis[J]. Journal of the American Society for Information Science, 1975,26(1):33-44.
[13] Matsuo Y, Ishizuka M. Keyword Extraction from a Single Document Using Word Co-occurrence Statistical Information[J]. International Journal on Artificial Intelligence Tools, 2004,13(1):157-169.
[14] Barker K, Cornacchia N. Using Noun Phrase Heads to Extract Document Keyphrases[C]// Proceedings of the 13th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence, Quebec, Canada. Springer, 2000:40-52.
[15] Edmundson H P. New Method in Automatic Abstracting[J]. Journal of the ACM, 1969,16(2):264-285.
[16] Campos R, Mangaravite V, Pasquali A, et al. YAKE! Collection-independent Automatic Keyword Extractor[C]// Proceedings of the 40th European Conference on IR Research, Grenoble, France. Springer, 2018:806-810.
[17] Mihalcea R, Tarau P. TextRank: Bringing Order into Text[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain. Association for Computational Linguistics, 2004: 404-411.
[18] Wan X, Xiao J. Single Document Keyphrase Extraction Using Neighborhood Knowledge[C]// Proceedings of the 23rd AAAI Conference on Artificial Intelligence, Illinois, USA. AAAI Press, 2008: 855-860.
[19] Danesh S, Sumner T, Martin J H. SGRank: Combining Statistical and Graphical Methods to Improve the State of the Art in Unsupervised Keyphrase Extraction[C]// Proceedings of the 4th Joint Conference on Lexical and Computational Semantics, Colorado, USA. 2015: 117-126.
[20] Florescu C, Caragea C. PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada. Association for Computational Linguistics, 2017: 1105-1115.
[21] Gollapalli S D, Caragea C. Extracting Keyphrases from Research Papers Using Citation Networks[C]// Proceedings of the 28th AAAI Conference on Artificial Intelligence, Quebec, Canada. AAAI Press, 2014: 1629-1635.
[22] Liu Z, Li P, Zheng Y, et al. Clustering to Find Exemplar Terms for Keyphrase Extraction[C]// Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Suntec, Singapore. ACL, 2009: 257-266.
[23] Bougouin A, Boudin F, Daille B. TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction[C]// Proceedings of the 6th International Joint Conference on Natural Language Processing, Nagoya, Japan. ACL, 2013: 543-551.
[24] Boudin F. Unsupervised Keyphrase Extraction with Multipartite Graphs[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Louisiana, USA. Association for Computational Linguistics, 2018: 667-672.
[25] Sterckx L, Demeester T, Deleu J, et al. Topical Word Importance for Fast Keyphrase Extraction[C]// Proceedings of the 24th International Conference on World Wide Web, Florence, Italy. ACM, 2015: 121-122.
[26] Teneva N, Cheng W. Salience Rank: Efficient Keyphrase Extraction with Topic Modeling[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada. Association for Computational Linguistics, 2017: 530-535.
[27] Collobert R, Weston J, Bottou L, et al. Natural Language Processing (Almost) from Scratch[J]. Journal of Machine Learning Research, 2011,12:2493-2537.
[28] Wang R, Liu W, McDonald C. Corpus-independent Generic Keyphrase Extraction Using Word Embedding Vectors[C]. Software Engineering Research Conference, 2014,39:1-8.
[29] Wang R, Liu W, McDonald C. Using Word Embeddings to Enhance Keyword Identification for Scientific Publications[C]// Proceedings of the 26th Australasian Database Conference, Melbourne, Australia. Springer, 2015: 257-268.
[30] Mahata D, Kuriakose J, Shah R R, et al. Key2Vec: Automatic Ranked Keyphrase Extraction from Scientific Articles Using Phrase Embeddings[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, USA. Association for Computational Linguistics, 2018: 634-639.
[31] Shi W, Zheng W, Yu J X, et al. Keyphrase Extraction Using Knowledge Graphs[J]. Data Science and Engineering, 2017,2(4):275-288.
[32] Yu Y, Ng V. WikiRank: Improving Keyphrase Extraction Based on Background Knowledge[C]// Proceedings of the 11th Edition of the Language Resources and Evaluation Conference, Miyazaki, Japan. European Language Resources Association, 2018: 3723-3727.
[33] Tomokiyo T, Hurst M. A Language Model Approach to Keyphrase Extraction[C]// Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan. Association for Computational Linguistics, 2003,18:33-40.
[34] Frank E, Paynter G W, Witten I H, et al. Domain-Specific Keyphrase Extraction[C]// Proceedings of the 16th International Joint Conference on Artificial Intelligence, Stockholm, Sweden. Morgan Kaufmann, 1999: 668-673.
[35] Wang J, Peng H. Keyphrases Extraction from Web Document by the Least Squares Support Vector Machine[C]// Proceedings of the 2005 IEEE / WIC / ACM International Conference on Web Intelligence, Compiegne, France. IEEE Computer Society, 2005: 293-296.
[36] Zhang C, Wang H, Liu Y, et al. Automatic Keyword Extraction from Documents Using Conditional Random Fields[J]. Journal of Computer Information Systems, 2008,4(3):1169-1180.
[37] Ding Z, Zhang Q, Huang X. Keyphrase Extraction from Online News Using Binary Integer Programming[C]// Proceedings of the 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand. Association for Computational Linguistics, 2011: 165-173.
[38] Haddoud M, Mokhtari A, Lecroq T, et al. Accurate Keyphrase Extraction from Scientific Papers by Mining Linguistic Information[C]// Proceedings of the 1st Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics Co-located with 15th International Society of Scientometrics and Informetrics Conference, Istanbul, Turkey. 2015: 12-17.
[39] Turney P D. Coherent Keyphrase Extraction via Web Mining[C]// Proceedings of the 18th International Joint Conference on Artificial Intelligence, Acapulco, Mexico. Morgan Kaufmann, 2003: 434-442.
[40] Nguyen T D, Kan M Y. Keyphrase Extraction in Scientific Publications[C]// Proceedings of the 10th International Conference on Asian Digital Libraries, Hanoi, Vietnam. Springer, 2007: 317-326.
[41] Medelyan O, Frank E, Witten I H. Human-competitive Tagging Using Automatic Keyphrase Extraction[C]// Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Suntec, Singapore. Association for Computational Linguistics, 2009: 1318-1327.
[42] Haddoud M, Abdeddaïm S. Accurate Keyphrase Extraction by Discriminating Overlapping Phrases[J]. Journal of Information Science, 2014,40(4):488-500.
[43] Caragea C, Bulgarov F A, Godea A, et al. Citation-Enhanced Keyphrase Extraction from Research Papers: A Supervised Approach[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar. Association for Computational Linguistics, 2014: 1435-1446.
[44] Zhang K, Xu H, Tang J, et al. Keyword Extraction Using Support Vector Machine[C]// Proceedings of the 7th International Conference of Web-Age Information Management, Hong Kong, China. Springer, 2006: 85-96.
[45] Zhang Chengzhi. Research on Automatic Indexing Method Based on Ensemble Learning[J]. Journal of the China Society for Scientific and Technical Information, 2010,29(1):3-8. (in Chinese)
[46] Hulth A. Improved Automatic Keyword Extraction Given More Linguistic Knowledge[C]// Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Sapporo, Japan. Association for Computational Linguistics, 2003: 216-223.
[47] Ercan G, Cicekli I. Using Lexical Chains for Keyword Extraction[J]. Information Processing & Management, 2007,43(6):1705-1714.
[48] Sterckx L, Caragea C, Demeester T, et al. Supervised Keyphrase Extraction as Positive Unlabeled Learning[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, USA. Association for Computational Linguistics, 2016: 1924-1929.
[49] Krapivin M, Autayeu A, Marchese M, et al. Keyphrases Extraction from Scientific Documents: Improving Machine Learning Approaches with Natural Language Processing[C]// Proceedings of the 12th International Conference on Asia-Pacific Digital Libraries. Springer, 2010: 102-111.
[50] Sarkar K, Nasipuri M, Ghose S. Machine Learning Based Keyphrase Extraction: Comparing Decision Trees, Naïve Bayes, and Artificial Neural Networks[J]. Journal of Information Processing Systems, 2012,8(4):693-712.
[51] Aquino G O, Lanzarini L C. Keyword Identification in Spanish Documents Using Neural Networks[J]. Journal of Computer Science and Technology, 2015,15(2):55-60.
[52] Zhang Q, Wang Y, Gong Y, et al. Keyphrase Extraction Using Deep Recurrent Neural Networks on Twitter[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, USA. Association for Computational Linguistics, 2016: 836-845.
[53] Basaldella M, Antolli E, Serra G, et al. Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction[C]// Proceedings of the 14th Italian Research Conference on Digital Libraries, Udine, Italy. Springer, 2018: 180-187.
[54] Alzaidy R, Caragea C, Giles C L. Bi-LSTM-CRF Sequence Labeling for Keyphrase Extraction from Scholarly Documents[C]// Proceedings of the 2019 World Wide Web Conference. ACM, 2019: 2551-2557.
[55] Bhaskar P, Nongmeikapam K, Bandyopadhyay S. Keyphrase Extraction in Scientific Articles: A Supervised Approach[C]// Proceedings of the 24th International Conference on Computational Linguistics, Mumbai, India. Indian Institute of Technology Bombay, 2012: 17-24.
[56] Gollapalli S D, Li X L, Yang P. Incorporating Expert Knowledge into Keyphrase Extraction[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA. AAAI Press, 2017: 3180-3187.
[57] Liu Z, Chen X, Zheng Y, et al. Automatic Keyphrase Extraction by Bridging Vocabulary Gap[C]// Proceedings of the 15th Conference on Computational Natural Language Learning, Portland, USA. ACL, 2011: 135-144.
[58] Koehn P. Statistical Machine Translation[M]. Cambridge, UK: Cambridge University Press, 2010.
[59] Brown P F, Pietra S D, Pietra V J D, et al. The Mathematics of Statistical Machine Translation: Parameter Estimation[J]. Computational Linguistics, 1993,19(2):263-311.
[60] Cho K, van Merrienboer B, Gülçehre Ç, et al. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar. ACL, 2014: 1724-1734.
[61] Chen J, Zhang X, Wu Y, et al. Keyphrase Generation with Correlation Constraints[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium. Association for Computational Linguistics, 2018: 4057-4066.
[62] Zhang Y, Xiao W. Keyphrase Generation Based on Deep Seq2Seq Model[J]. IEEE Access, 2018,6:46047-46057.
[63] Chen W, Gao Y, Zhang J, et al. Title-Guided Encoding for Keyphrase Generation[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, USA. AAAI Press, 2019: 6268-6275.
[64] Chen W, Chan H P, Li P, et al. Exclusive Hierarchical Decoding for Deep Keyphrase Generation[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2020: 1095-1105.
[65] Chen W, Chan H P, Li P, et al. An Integrated Approach for Keyphrase Generation via Exploring the Power of Retrieval and Extraction[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, USA. Association for Computational Linguistics, 2019: 2846-2856.
[66] Wang Y, Li J, Chan H P, et al. Topic-Aware Neural Keyphrase Generation for Social Media Language[C]// Proceedings of the 57th Conference of the Association for Computational Linguistics, Florence, Italy. Association for Computational Linguistics, 2019: 2516-2526.
[67] Chan H P, Chen W, Wang L, et al. Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards[C]// Proceedings of the 57th Conference of the Association for Computational Linguistics, Florence, Italy. Association for Computational Linguistics, 2019: 2163-2174.
[68] Ye H, Wang L. Semi-Supervised Learning for Neural Keyphrase Generation[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium. Association for Computational Linguistics, 2018: 4142-4153.
[69] Wang Y, Liu Q, Qin C, et al. Exploiting Topic-Based Adversarial Neural Network for Cross-Domain Keyphrase Extraction[C]// Proceedings of the 2018 IEEE International Conference on Data Mining, Sentosa, Singapore. IEEE Computer Society, 2018: 597-606.
[70] Jones K S. A Statistical Interpretation of Term Specificity and Its Application in Retrieval[J]. Journal of Documentation, 1972,28(1):11-21.
[71] Salton G, Buckley C. Term-Weighting Approaches in Automatic Text Retrieval[J]. Information Processing & Management, 1988,24(5):513-523.
[72] Zhang W, Feng W, Wang J. Integrating Semantic Relatedness and Words’ Intrinsic Features for Keyword Extraction[C]// Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China. IJCAI, 2013: 1115-2231.
[73] Nguyen T D, Luong M T. WINGNUS: Keyphrase Extraction Utilizing Document Logical Structure[C]// Proceedings of the 5th International Workshop on Semantic Evaluation, Uppsala, Sweden. Association for Computational Linguistics, 2010: 166-169.
[74] Marujo L, Gershman A, Carbonell J G, et al. Supervised Topical Key Phrase Extraction of News Stories Using Crowdsourcing, Light Filtering and Co-reference Normalization[C]// Proceedings of the 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey. European Language Resources Association, 2012: 399-403.
[75] Boudin F. A Comparison of Centrality Measures for Graph-Based Keyphrase Extraction[C]// Proceedings of the 6th International Joint Conference on Natural Language Processing, Nagoya, Japan. ACL, 2013: 834-838.
[76] Eichler K, Neumann G. DFKI KeyWE: Ranking Keyphrases Extracted from Scientific Articles[C]// Proceedings of the 5th International Workshop on Semantic Evaluation, Uppsala, Sweden. Association for Computational Linguistics, 2010: 150-153.
[77] Berend G. Exploiting Extra-textual and Linguistic Information in Keyphrase Extraction[J]. Natural Language Engineering, 2016,22(1):73-95.
[78] Zhang Y, Zhang C. Using Human Attention to Extract Keyphrase from Microblog Post[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy. Association for Computational Linguistics, 2019: 5867-5872.
[79] Zhang Y, Zhang C. Enhancing Keyphrase Extraction from Microblogs Using Human Reading Time[J]. Journal of the Association for Information Science and Technology, 2020.
[80] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[C]// Proceedings of the 1st International Conference on Learning Representations, Scottsdale, USA. Association for Computational Linguistics, 2013: 1-12.
[81] Pennington J, Socher R, Manning C D. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar. Association for Computational Linguistics, 2014: 1532-1543.
[82] Zhang Y, Zhang C, Li J. Joint Modeling of Characters, Words, and Conversation Contexts for Microblog Keyphrase Extraction[J]. Journal of the Association for Information Science and Technology, 2020,71(5):553-567.
[83] Manning C D, Raghavan P, Schütze H. Introduction to Information Retrieval[M]. Cambridge, UK: Cambridge University Press, 2008.
[84] Voorhees E M. The TREC-8 Question Answering Track Report[C]// Proceedings of the 8th Text Retrieval Conference, Gaithersburg, USA. National Institute of Standards and Technology (NIST), 1999: 246-500.
[85] Liu L, Özsu M T. Encyclopedia of Database Systems[M]. New York, USA: Springer US, 2009.
[86] Ristad E S, Yianilos P N. Learning String-edit Distance[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998,20(5):522-532.
[87] Dagan I, Pereira F C N, Lee L. Similarity-Based Estimation of Word Cooccurrence Probabilities[C]// Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, USA. ACL, 1994: 272-278.
[88] Zhang Chengzhi, Zhou Dongmin. General Evaluation Model for Automatic Indexing[J]. Journal of the China Society for Scientific and Technical Information, 2009,28(1):40-47. (in Chinese)
[89] Chen P I, Lin S J. Automatic Keyword Prediction Using Google Similarity Distance[J]. Expert Systems with Applications, 2010,37(3):1928-1938.
[1] Cheng Bin, Shi Shuicai, Du Yuncheng, Xiao Shibin. Keyword Extraction for Journals Based on Part-of-Speech and BiLSTM-CRF Combined Model[J]. Data Analysis and Knowledge Discovery, 2021, 5(3): 101-108.
[2] Dai Jianhua, Deng Yubin. Extracting Emotion-Cause Pairs Based on Emotional Dilation Gated CNN[J]. Data Analysis and Knowledge Discovery, 2020, 4(8): 98-106.
[3] Xia Tian. Extracting Key-phrases from Chinese Scholarly Papers[J]. Data Analysis and Knowledge Discovery, 2020, 4(7): 76-86.
[4] Li Chengliang, Zhao Zhongying, Li Chao, Qi Liang, Wen Yan. Extracting Product Properties with Dependency Relationship Embedding and Conditional Random Field[J]. Data Analysis and Knowledge Discovery, 2020, 4(5): 54-65.
[5] Liu Liu, Qin Tianyun, Wang Dongbo. Automatic Extraction of Traditional Music Terms of Intangible Cultural Heritage[J]. Data Analysis and Knowledge Discovery, 2020, 4(12): 68-75.
[6] Cai Jingxuan, Wu Jiang, Wang Chengkun. Predicting Usefulness of Crowd Testing Reports with Deep Learning[J]. Data Analysis and Knowledge Discovery, 2020, 4(11): 102-111.
[7] Tao Yue, Yu Li, Zhang Runjie. Active Learning Strategies for Extracting Phrase-Level Topics from Scientific Literature[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 134-143.
[8] Wang Yi, Shen Zhe, Yao Yifan, Cheng Ying. Domain-Specific Event Graph Construction Methods: A Review[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 1-13.
[9] Hui Nie, Huan He. Identifying Implicit Features with Word Embedding[J]. Data Analysis and Knowledge Discovery, 2020, 4(1): 99-110.
[10] Gang Li, Huayang Zhou, Jin Mao, Sijing Chen. Classifying Social Media Users with Machine Learning[J]. Data Analysis and Knowledge Discovery, 2019, 3(8): 1-9.
[11] Mingzhu Sun, Jing Ma, Lingfei Qian. Extracting Keywords Based on Topic Structure and Word Diagram Iteration[J]. Data Analysis and Knowledge Discovery, 2019, 3(8): 68-76.
[12] Xiaofeng Li, Jing Ma, Chi Li, Hengmin Zhu. Identifying Commodity Names Based on XGBoost Model[J]. Data Analysis and Knowledge Discovery, 2019, 3(7): 34-41.
[13] Xiuxian Wen, Jian Xu. Research on Product Characteristics Extraction and Hedonic Price Based on User Comments[J]. Data Analysis and Knowledge Discovery, 2019, 3(7): 42-51.
[14] Qingtian Zeng, Xiaohui Hu, Chao Li. Extracting Keywords with Topic Embedding and Network Structure Analysis[J]. Data Analysis and Knowledge Discovery, 2019, 3(7): 52-60.
[15] Ruihua Qi, Junyi Zhou, Xu Guo, Caihong Liu. Extracting Book Review Topics with Knowledge Base[J]. Data Analysis and Knowledge Discovery, 2019, 3(6): 83-91.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn