New Technology of Library and Information Service (现代图书情报技术), 2014, Vol. 30, Issue 12: 44-50     https://doi.org/10.11925/infotech.1003-3513.2014.12.06
Section: Knowledge Organization and Knowledge Management
Using Dependency Parsing Pattern to Extract Product Feature Tags
(Original Chinese title: 依存句法模板下的商品特征标签抽取研究)
Nie Hui (聂卉), Du Jiazhong (杜嘉忠)
School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
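Abstract (translated from the Chinese original)

[Objective] Working with online product reviews, this study explores methods for identifying "product feature-opinion" correspondences in order to extract product feature tags and distill the essence of reviews, helping users obtain valuable information efficiently in an environment where online information is of mixed quality. [Methods] Dependency grammar relations are introduced to automatically classify, filter, and generalize review templates and to build a template library. Feature tags are extracted using the template library and external dictionaries, and a screening and filtering mechanism is established for candidate tags. [Results] On a real-world set of online reviews, the proposed method outperforms extraction methods that rely on filtering and generalization alone. The best F-measure is 56.5%, and after parameter adjustment the accuracy reaches 65%. [Limitations] Reviews need to be pre-filtered by sentence quality before feature extraction, automatic acquisition of the feature lexicon should be considered, and more syntactic relations should be added during template formation to further improve the accuracy of feature tag extraction. [Conclusions] Filtering templates purely by syntactic template frequency leaves room for improvement. Better extraction results are obtained when the extraction process takes template length into account, sets an extraction window, and screens feature tags and merges features.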

Keywords (original Chinese): 评论挖掘 (review mining); 标签抽取 (tag extraction); 依存句法分析 (dependency parsing analysis)
Abstract

[Objective] This study investigates methods for recognizing associations between product features and the opinions expressed about them, in order to extract feature tags and summarize user-generated online reviews. This helps Web users find useful information efficiently, given that online information varies greatly in quality. [Methods] Dependency parsing is used to obtain extraction templates, and a template library is built by classifying, filtering, and generalizing them. Feature tags are extracted using the templates and the corresponding external lexicons, and are then screened by filtering rules. [Results] The experimental results indicate that the method outperforms comparable methods based only on template filtering or generalization. The F-measure reaches 56.5%, and the accuracy can reach 65% after adjusting the corresponding parameters. [Limitations] No filtering strategy for improving the quality of the review data was applied in this research. Automatic construction of the feature lexicon and the addition of more syntactic relations still need to be considered in order to extend the template library and further improve extraction accuracy. [Conclusions] Better performance can be achieved by finding the most appropriate values for template-specific parameters, such as template length, or by adopting an effective filtering-window strategy to detect noisy templates.

Keywords: Review mining; Tag extraction; Dependency parsing analysis
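
The abstract does not spell out the template format, so the following is only a minimal sketch of how dependency-relation templates might be matched against a parsed review sentence to produce "feature-opinion" tags. The `Token` structure, the `(dependent POS, relation, head POS)` template shape, the two example templates, the `extract_tags` function, and the `window` parameter are all assumptions made for illustration. The relation labels (`SBV`, `ATT`, `ADV`, `HED`) follow LTP-style naming, and the parse is written by hand rather than produced by a parser.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Token:
    idx: int    # 1-based position in the sentence
    word: str
    pos: str    # coarse part-of-speech tag, e.g. "n", "a", "d"
    head: int   # position of the governing token (0 means root)
    rel: str    # dependency relation to the head, e.g. "SBV"


# Hypothetical template shape: (dependent POS, relation, head POS).
Template = Tuple[str, str, str]

TEMPLATES: Set[Template] = {
    ("n", "SBV", "a"),  # noun is the subject of an adjective
    ("a", "ATT", "n"),  # adjective modifies a noun
}


def extract_tags(
    tokens: List[Token],
    templates: Set[Template],
    feature_lexicon: Set[str],
    window: int = 3,
) -> List[Tuple[str, str]]:
    """Return (feature, opinion) pairs whose dependency arc matches a template."""
    by_idx: Dict[int, Token] = {t.idx: t for t in tokens}
    tags: List[Tuple[str, str]] = []
    for dep in tokens:
        head = by_idx.get(dep.head)
        if head is None:  # the root token has no governing token
            continue
        if (dep.pos, dep.rel, head.pos) not in templates:
            continue
        # Assumed window rule: drop pairs that are too far apart.
        if abs(dep.idx - head.idx) > window:
            continue
        # In both example templates the noun is the feature.
        feature, opinion = (dep, head) if dep.pos == "n" else (head, dep)
        # Keep only features listed in the external lexicon.
        if feature.word in feature_lexicon:
            tags.append((feature.word, opinion.word))
    return tags


# Hand-written parse of "屏幕 很 清晰" ("the screen is very clear").
sentence = [
    Token(1, "屏幕", "n", 3, "SBV"),
    Token(2, "很", "d", 3, "ADV"),
    Token(3, "清晰", "a", 0, "HED"),
]
print(extract_tags(sentence, TEMPLATES, {"屏幕"}))  # [('屏幕', '清晰')]
```

In this sketch the lexicon check and the window check stand in for the candidate-tag screening that the abstract mentions. The paper's actual filtering rules, template generalization, and parameter values are not given on this page and are not reproduced here.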
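Received: 2014-06-23      Published: 2015-01-20
CLC number: TP391
Funding:

This work is one of the research outcomes of the 2013 project "Research on Knowledge Recommendation Mechanisms Based on Context and User Perception" (Project No. CD13CTS01) under the Guangdong Province Philosophy and Social Sciences "Twelfth Five-Year" Plan.

Corresponding author: Nie Hui (聂卉)     E-mail: issnh@mail.sysu.edu.cn
Author contributions: Nie Hui: proposed the topic, revised and refined the paper, and prepared the final version; Du Jiazhong: proposed the research idea, designed the research plan, collected the data and ran the experiments, and wrote the first draft.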
Cite this article:
聂卉, 杜嘉忠. 依存句法模板下的商品特征标签抽取研究[J]. 现代图书情报技术, 2014, 30(12): 44-50. (Nie Hui, Du Jiazhong. Using Dependency Parsing Pattern to Extract Product Feature Tags [J]. New Technology of Library and Information Service, 2014, 30(12): 44-50.)
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2014.12.06      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2014/V30/I12/44


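Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing     Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn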