Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (2): 83-93    DOI: 10.11925/infotech.2096-3467.2020.0626
Optimizing Quality Evaluation for Answers of Q&A Community
Shen Wang, Li Shiyu, Liu Jiayu, Li He
School of Management, Jilin University, Jilin 130022, China
Abstract  

[Objective] This paper constructs a new quality evaluation system for answers in a Q&A community (Zhihu, China). [Methods] First, we established quality criteria based on user evaluation and data characteristics. Second, we created vectors for the answers. Third, we used an SVM model to learn the label representation of the texts and measured its text classification accuracy. [Results] The proposed system yielded a classification accuracy of 85.32%, higher than a system using only user evaluation criteria (61.44%) and one using only data characteristics (79.10%). [Limitations] Our evaluation method might be biased because the annotations are subjective. [Conclusions] The proposed method is an effective way to evaluate the quality of answers in the Q&A community.

Key words: Q&A Community; Zhihu; User Standard; Data Characteristics; Response Quality Evaluation; SVM Model
Received: 01 July 2020      Published: 11 March 2021
ZTFLH:  G206  
Fund: National Natural Science Foundation of China (71974075)
Corresponding Authors: Li He ORCID:0000-0001-8847-3619     E-mail: lihe200303@163.com

Cite this article:

Shen Wang, Li Shiyu, Liu Jiayu, Li He. Optimizing Quality Evaluation for Answers of Q&A Community. Data Analysis and Knowledge Discovery, 2021, 5(2): 83-93.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0626     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I2/83

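First-level indicator | Second-level indicator | Explanation | Example comment
Content | Accuracy | Degree to which the answer is correct and error-free | "This answer is completely correct."
Content | Relevance | Degree to which the answer applies to the question | "+1, a very in-depth answer. It helped me a lot in setting up the system to save battery power."
Content | Clarity | Whether the language of the answer is expressed clearly | "Incredible answer! It is very clear, and one of the best answers I have seen on this site."
Content | Rationality | Whether the answer is logically sound | "I am no math expert at all (quite the opposite), but this seems completely logical to me. Why would this be great?"
Content | Richness | Breadth of the answer's coverage of the question | "Very detailed, covers almost everything! +1"
Content | Scholarliness | Depth of the answer's coverage of the question | "+1, a very in-depth answer. An answer from such a professional perspective deepened my understanding of it."
Content | Objectivity | Degree to which the answer is truthful and impartial | "+1 Thank you for giving me the first honest and objective opinion on this question! I agree with everything you said!"
Content | Writing style | Distinctive characteristics of word choice and sentence fluency in the answer | "This is probably the best plain-English, elementary-school-level explanation of all kinds of numbers I have ever heard."
Cognition | Originality | Degree of innovation in the answer | "If a competitor came up with an idea like this to get the right answer, I would consider it the most original and creative."
Cognition | Readability | Degree to which the answer is easy to understand | "This is the approach I find easiest to understand..."
Utility | Usefulness | Degree to which the answer quickly satisfies the information need of the question | "This has been useful to me on many systems."
Utility | Solution feasibility | Degree to which the answer is beneficial and benefit is gained from using it | "This is a good workaround that can be implemented in software. However, I have never found a tutorial as good as this one; it just takes self-study."
Information source | Reference to external sources | Degree to which the answer includes references from outside the Q&A site | "Thanks for the link. It was very helpful to me."
Information source | Answerer's expertise | Answerer's level of expertise in the domain of the question | "I love that we have real professors here."
External factors | External verification | Degree to which the correctness of the answer can be verified or proven by a third party | "This is what my professor told me."
External factors | Number of available alternatives | Number of other solutions that could replace the one in the answer | "Compared with the view raised above, I think this question can also be considered from this angle."
External factors | Response efficiency | Time the asker had to wait before the answer appeared | "Your answer is very broad and very quick. I accept your answer."
Socio-emotional | Answerer's attitude | Feeling or emotion expressed in the answer | "'I will put it in the article' is the worst attitude you can have in photography."
Socio-emotional | Answerer's effort | Amount of work or care put into answering | "Thank you for taking the time to break down the problem and provide many solutions and explanations for each one."
Socio-emotional | Answerer's experience | Degree of the answerer's actual contact with the task | "Glad to hear you ran into the same problem."
Socio-emotional | Degree of consensus | Extent to which asker and answerer share the same opinion or feeling | "Even though we still have no definite answer, at least we now have some consensus."
Socio-emotional | User feeling | Answerer's tendency to like or dislike certain things | "I like it. Completely different from all the other responses, which is cool."
Socio-emotional | Humor | Degree to which others find the answer humorous or pleasant | "Upvoted because this is funny."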
User Evaluation Criteria Index System
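Data characteristic | First-level indicator | Second-level indicator
Text features | Linguistic features | Number of proper nouns
Text features | Linguistic features | Question-to-answer length ratio
Text features | Text style | Figurative
Text features | Text style | Direct
Non-text features | User information | Number of answers
Non-text features | User information | Number of questions
Non-text features | User information | Number of articles
Non-text features | User information | Number of inclusions
Non-text features | Review information | Number of upvotes
Non-text features | Review information | Number of comments
Non-text features | Review information | Number of favorites
Non-text features | Source credibility | Link features
Non-text features | Source credibility | User tags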
Data Characteristic Index System
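The abstract states that vectors were created for the answers, but this page does not say how. As a minimal sketch only, the numeric data characteristic indicators listed above could be flattened into a vector as follows. All dictionary keys are hypothetical names chosen for illustration, the length ratio is assumed to be question length divided by answer length in characters, and the text style and user tag indicators are left out because they would need annotation rules that the page does not describe.

```python
# Illustrative sketch: turn one answer record into a numeric feature vector.
# Every key name below is a hypothetical label, not a field from the paper.

NUMERIC_KEYS = [
    "user_answers",      # number of answers posted by the answerer
    "user_questions",    # number of questions asked by the answerer
    "user_articles",     # number of articles written by the answerer
    "user_inclusions",   # number of times the answerer's content was included
    "upvotes",           # upvotes on the answer
    "comments",          # comments on the answer
    "favorites",         # times the answer was favorited
]


def length_ratio(question_text, answer_text):
    """Question length / answer length in characters (assumed direction)."""
    if not answer_text:
        return 0.0
    return len(question_text) / len(answer_text)


def to_feature_vector(record):
    """Return a flat list of floats for one answer record (a dict)."""
    vector = [
        float(record["proper_noun_count"]),
        length_ratio(record["question_text"], record["answer_text"]),
    ]
    vector.extend(float(record[key]) for key in NUMERIC_KEYS)
    # Link feature encoded as 1.0 if the answer contains a link, else 0.0.
    vector.append(1.0 if record["has_link"] else 0.0)
    return vector


if __name__ == "__main__":
    example = {
        "question_text": "abcd",
        "answer_text": "abcdefgh",
        "proper_noun_count": 2,
        "user_answers": 10,
        "user_questions": 3,
        "user_articles": 1,
        "user_inclusions": 0,
        "upvotes": 50,
        "comments": 5,
        "favorites": 7,
        "has_link": True,
    }
    print(to_feature_vector(example))
```

Because the count features differ widely in scale, some form of normalization would usually be applied before training an SVM; that step is not shown here.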
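Evaluation system | Accuracy | C, g
Response quality evaluation system integrating user evaluation criteria and data features | 85.32% | 16, 0.25
User evaluation criteria system | 61.44% | 4, 4
Data feature evaluation system | 79.10% | 16, 5.66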
Experimental Results
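The C, g column presumably gives the SVM penalty parameter C and the kernel parameter g (gamma), following common LIBSVM naming. The page does not state how the search grid was laid out, but the reported values are consistent with a grid over powers of two (16 = 2^4, 0.25 = 2^-2, 5.66 ≈ 2^2.5). The sketch below assumes exponents from -8 to 8 in steps of 0.5, which is an assumption and not taken from the paper, builds such a grid, and maps each reported value to its nearest grid point.

```python
import math

# Assumed search range (hypothetical, not stated on the page):
# exponents -8 .. 8 in steps of 0.5, so each value is 2 ** exponent.
EXP_MIN, EXP_MAX, EXP_STEP = -8.0, 8.0, 0.5


def power_of_two_grid(exp_min=EXP_MIN, exp_max=EXP_MAX, step=EXP_STEP):
    """Return a list of (exponent, value) pairs with value = 2 ** exponent."""
    count = int(round((exp_max - exp_min) / step)) + 1
    return [(exp_min + i * step, 2.0 ** (exp_min + i * step)) for i in range(count)]


def nearest_grid_point(value, grid):
    """Return the (exponent, value) pair closest to `value` on a log2 scale."""
    target = math.log2(value)
    return min(grid, key=lambda point: abs(point[0] - target))


# (C, g) pairs as reported in the Experimental Results table.
REPORTED = {
    "fused": (16, 0.25),
    "user_criteria_only": (4, 4),
    "data_features_only": (16, 5.66),
}

if __name__ == "__main__":
    grid = power_of_two_grid()
    for name, (c, g) in REPORTED.items():
        c_exp, _ = nearest_grid_point(c, grid)
        g_exp, g_val = nearest_grid_point(g, grid)
        print(f"{name}: C = 2^{c_exp:g}, g ~ 2^{g_exp:g} = {g_val:.2f}")
```

In a full grid search, each (C, g) pair on such a grid would be scored by cross-validated accuracy and the best pair kept; the contour maps listed below appear to visualize that accuracy surface.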
Contour Map of Parameter Optimization of Response Quality Evaluation System Integrating User Evaluation Criteria and Data Features
Result of Response Quality Evaluation System Integrating User Evaluation Criteria and Data Features
Contour Map of Parameter Optimization of User Evaluation Standard Index System
Result of User Evaluation Standard Index System
Contour Map of Parameter Optimization of Data Feature Evaluation Index System
Result of Data Feature Evaluation Index System