Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (6): 83-91    DOI: 10.11925/infotech.2096-3467.2018.0887
Extracting Book Review Topics with Knowledge Base
Ruihua Qi1,2, Junyi Zhou1,2, Xu Guo2, Caihong Liu2
1(Linguistics Research Center, Dalian University of Foreign Languages, Dalian 116044, China)
2(Research Center for Multilingual Big Data in Cyberspace, Dalian University of Foreign Languages, Dalian 116044, China)
Abstract  

[Objective] This paper extracts topics from book reviews with the help of natural language semantics. [Methods] We propose a method that retrieves explicit and implicit topic keywords using global semantic information from a common-sense knowledge base. [Results] Compared with the Double-Propagation algorithm, the proposed knowledge-base method achieved a sentence coverage rate 30.8% higher and a lexical diversity 0.36% higher. Based on the extracted topic words, we then built a cluster map and used node cluster centrality to identify the topic keywords. [Limitations] There is currently no domain knowledge base for book reviews. [Conclusions] The proposed knowledge-base method improves the sentence coverage and lexical diversity of topics extracted from book reviews.
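
The abstract reports sentence coverage and lexical diversity without defining them. The following is a minimal sketch under assumed definitions, not the authors' implementation: sentence coverage is taken as the share of review sentences containing at least one extracted topic word, and lexical diversity as a type-token ratio over the extracted topic-word occurrences. The function names and the toy data are hypothetical.

```python
from typing import List, Set


def sentence_coverage(sentences: List[List[str]], topic_words: Set[str]) -> float:
    """Assumed definition: fraction of sentences with at least one topic word."""
    if not sentences:
        return 0.0
    covered = sum(1 for tokens in sentences if topic_words.intersection(tokens))
    return covered / len(sentences)


def lexical_diversity(extracted: List[str]) -> float:
    """Assumed definition: distinct topic words / all topic-word occurrences."""
    if not extracted:
        return 0.0
    return len(set(extracted)) / len(extracted)


# Toy tokenized review sentences, invented for illustration only.
reviews = [
    ["the", "plot", "is", "gripping"],
    ["characters", "feel", "flat"],
    ["arrived", "late"],
    ["great", "plot", "and", "characters"],
]
topics = {"plot", "characters"}  # hypothetical extracted topic words

coverage = sentence_coverage(reviews, topics)
occurrences = [w for s in reviews for w in s if w in topics]
diversity = lexical_diversity(occurrences)
print(coverage, diversity)
```

Under these assumed definitions, a method that also recovers implicit topic words would cover more sentences, which is the direction of improvement the abstract reports.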

Key words: Knowledge Base; Book Review; Topic Extraction
Received: 10 August 2018      Published: 15 August 2019

Cite this article:

Ruihua Qi, Junyi Zhou, Xu Guo, Caihong Liu. Extracting Book Review Topics with Knowledge Base. Data Analysis and Knowledge Discovery, 2019, 3(6): 83-91.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0887     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I6/83
