Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (6): 12-20     https://doi.org/10.11925/infotech.2096-3467.2018.0696
Research Paper
Ranking Answer Quality of Popular Q&A Community*
Ming Yi, Tingting Zhang
School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract

[Objective] Answer quality varies widely in general-interest Q&A communities. This paper proposes a method for ranking answers by quality in such a community. [Methods] Based on the information acceptance model, we build an initial system of answer quality indicators from the perspective of perceived value. We discretize the initial indicators with the K-Medoids clustering algorithm, then use rough set theory to reduce the indicators and assign them weights, which yields a revised indicator system. Finally, we compute the weighted grey correlation degree of each answer with weighted grey correlation analysis and rank the answers by it. [Results] We tested the method on 2,297 records covering six questions under six topic categories on Zhihu. Top-ranked answers typically combine text with images, carry more information, and come from answerers who participate more actively in the community, so their quality is higher. [Limitations] The dataset should be enlarged, and the evaluation of the ranking method could be improved. [Conclusions] In a satisfaction survey, 73 Zhihu users compared the original ranking with the ranking produced by this study, and the results indicate that the proposed method performs better.

Key words: Common Q&A Community; Answer Quality Ranking; Perceived Value; Rough Set Theory; Weighted Grey Correlation Analysis
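The final ranking step described in the abstract can be sketched as follows. This is a minimal illustration of weighted grey correlation (relational) analysis in its standard textbook form, not the authors' implementation. The three indicator columns, the weights, the min-max normalisation, and the distinguishing coefficient `rho = 0.5` are all assumptions made for the example. In the paper the indicators come from the revised indicator system and the weights from rough set theory, neither of which is given in the abstract.

```python
from typing import List, Sequence


def weighted_grey_relational_degrees(
    rows: Sequence[Sequence[float]],
    weights: Sequence[float],
    rho: float = 0.5,
) -> List[float]:
    """Return one weighted grey relational degree per row (one row per answer).

    rows    -- indicator values; larger is assumed to be better for every column
    weights -- indicator weights summing to 1 (given here, not derived)
    rho     -- distinguishing coefficient, conventionally 0.5
    """
    # Min-max normalise every indicator column to [0, 1].
    normalised_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        normalised_cols.append([(v - lo) / span if span else 1.0 for v in col])
    norm_rows = list(zip(*normalised_cols))

    # The reference sequence is the best value of each indicator,
    # which equals 1.0 for every column after normalisation.
    deltas = [[abs(1.0 - v) for v in row] for row in norm_rows]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        return [1.0] * len(rows)  # every answer matches the reference exactly

    degrees = []
    for row in deltas:
        # Grey relational coefficient for each indicator of this answer.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # Weighted sum of the coefficients gives the answer's degree.
        degrees.append(sum(w * c for w, c in zip(weights, coeffs)))
    return degrees


def rank_answers(
    rows: Sequence[Sequence[float]],
    weights: Sequence[float],
    rho: float = 0.5,
) -> List[int]:
    """Return answer indices ordered from highest to lowest degree."""
    degrees = weighted_grey_relational_degrees(rows, weights, rho)
    return sorted(range(len(rows)), key=lambda i: degrees[i], reverse=True)


# Hypothetical data: columns could stand for image count, text length,
# and answerer activity; the values and weights are made up.
example_rows = [[3, 120, 5], [0, 40, 1], [1, 80, 9]]
example_weights = [0.5, 0.3, 0.2]

if __name__ == "__main__":
    print(weighted_grey_relational_degrees(example_rows, example_weights))
    print(rank_answers(example_rows, example_weights))
```

An answer whose indicators sit close to the per-indicator best values gets a degree near 1, and heavier weights make the corresponding indicators count more in the ordering.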
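Received: 2018-07-02      Published: 2019-08-15
Funding: *This work is one of the research outputs of the National Social Science Fund of China project "Research on Information Exchange Behavior in Social Networks Based on Human Dynamics" (Grant No. 16BTQ076).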
Cite this article:
Ming Yi, Tingting Zhang. Ranking Answer Quality of Popular Q&A Community. Data Analysis and Knowledge Discovery, 2019, 3(6): 12-20.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.0696      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2019/V3/I6/12