Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (9): 74-82    DOI: 10.11925/infotech.2096-3467.2017.09.08
Original Article
Analyzing Online Reviews with Dynamic Sentiment Topic Model
Li Hui, Hu Yunfeng
School of Economics and Management, Xidian University, Xi’an 710071, China
Abstract  

[Objective] This paper analyzes online reviews to identify patterns in their topics and sentiments. [Methods] First, we obtained the sentiment of each review with the Short-text Sentiment-Topic Model (SSTM). Then, we proposed a Dynamic Sentiment Topic Model (DSTM) built on the documents, the document sentiment distributions, and the words. Finally, we estimated the sentiment-topic distributions and their keywords. [Results] We modeled the review datasets by time slice and identified how review contents and sentiments change over time. [Limitations] The proposed model does not capture the relationships among different subjects, which might introduce errors. [Conclusions] The DSTM model, which integrates external time features, can effectively analyze the evolution of online review topics.

Keywords: Short-text Sentiment-Topic Model; Dynamic Sentiment Topic Model; Parameter Estimation; Sentiment Online Reviews
Received: 07 April 2017      Published: 18 October 2017
ZTFLH:  G350  

Cite this article:

Li Hui,Hu Yunfeng. Analyzing Online Reviews with Dynamic Sentiment Topic Model. Data Analysis and Knowledge Discovery, 2017, 1(9): 74-82.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.09.08     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I9/74

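Parameter  Meaning
α  Dirichlet prior parameter of the topic distribution θ
θ  Topic distribution of sentiment s
z  Topic of a word in the document
π  Sentiment distribution of the document
s  A sentiment sampled for the document
φ  Word distribution of a topic
wi  The i-th word in the document
E  Number of sentiments in the document collection
D  Number of documents in the document subset
W  Number of words in the document
K  Number of topics in the document collection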
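The notation above suggests a generative reading in which a document first receives a sentiment and then draws each word's topic from that sentiment's topic distribution. The following is a minimal sketch of such a process under stated assumptions, not the authors' DSTM: the symmetric priors `gamma` (for π) and `beta` (for φ), the function names, and the toy vocabulary are hypothetical, and the time-slice dependency of the dynamic model is omitted.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def draw_model(E, K, V, alpha, beta, rng):
    """Draw theta (one topic distribution per sentiment) and phi (one word
    distribution per topic). beta is an assumed prior, not in the table."""
    theta = [sample_dirichlet([alpha] * K, rng) for _ in range(E)]
    phi = [sample_dirichlet([beta] * V, rng) for _ in range(K)]
    return theta, phi

def generate_document(theta, phi, vocab, W, gamma, rng):
    """Generate one document of W words; gamma is an assumed prior for pi."""
    pi = sample_dirichlet([gamma] * len(theta), rng)  # document sentiment distribution
    s = sample_categorical(pi, rng)                   # sentiment sampled for the document
    topics, words = [], []
    for _ in range(W):
        z = sample_categorical(theta[s], rng)         # topic of the i-th word
        w = sample_categorical(phi[z], rng)           # word drawn from topic z
        topics.append(z)
        words.append(vocab[w])
    return s, topics, words

# Toy usage with made-up sizes and vocabulary (illustrative only).
rng = random.Random(0)
vocab = ["screen", "battery", "system", "signal"]
theta, phi = draw_model(E=2, K=3, V=len(vocab), alpha=0.5, beta=0.5, rng=rng)
print(generate_document(theta, phi, vocab, W=6, gamma=1.0, rng=rng))
```

Parameter estimation would invert this process, inferring θ, π, and φ from the observed words; that step is not shown here.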
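Positive sentiment  Negative sentiment
Topic 1  Topic 2  Topic 3  Topic 5  Topic 6
appearance  system  functions  overheating  responsive
phone  response  software  malfunction  screen
operation  uninstall
back cover  quad-core  configuration  charging  touchscreen
handy  complete  poor  resolution
workmanship  memory  calls  battery
speed  cost-performance  font
beautiful  smooth  pixels  charger
configuration  running  signal  every day  malfunction
power button  performance  mAh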
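Keyword lists like the one above are typically read off an estimated topic-word distribution φ by taking the highest-probability words in each topic. The sketch below shows that step only; the function name `top_words`, the vocabulary, and the probabilities are made up for illustration and are not results from the paper.

```python
def top_words(phi, vocab, n):
    """Return the n highest-probability words for each topic in phi."""
    result = []
    for topic_dist in phi:
        # Rank vocabulary indices by their probability under this topic.
        ranked = sorted(range(len(vocab)), key=lambda i: topic_dist[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result

# Hypothetical two-topic phi over a four-word vocabulary.
vocab = ["screen", "battery", "charger", "system"]
phi = [
    [0.50, 0.10, 0.05, 0.35],
    [0.05, 0.55, 0.30, 0.10],
]
print(top_words(phi, vocab, 2))
# → [['screen', 'system'], ['battery', 'charger']]
```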