Analyzing Knowledge Demand and Supply of Community Question Answering with TF-PIDF
Li Ming1, Li Ying1, Zhou Qing1, Wang Jun2
1School of Economics and Management, China University of Petroleum-Beijing, Beijing 102249, China
2School of Economics and Management, Beihang University, Beijing 100191, China
[Objective] This paper proposes a new method to study the knowledge demand and supply of community question answering, aiming to support effective targeted interventions. [Methods] First, we constructed novel word weight calculation models (TF-PIDF) for questions and answers. Second, we obtained the main categories of demanded and supplied knowledge, as well as the popularity of topics, by clustering the questions and answers. Third, we paired the categories of knowledge demand with their supply counterparts. Fourth, we proposed an algorithm to calculate the popularity of knowledge demands. [Results] The proposed model was examined with topics on influenza from the Zhihu community. We found six categories of topics for knowledge demand and supply. The trending one was “epidemic”, which represented the most popular real-time need. [Limitations] The identified topics depend on how the meaning of each feature-word cluster is interpreted. [Conclusions] The proposed method could effectively manage the knowledge demand and supply of community question answering.
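As a rough illustration of the weighting and pairing steps summarized above, the following sketch may help. It is not the paper's method: the TF-PIDF formula is not given in this abstract, so standard TF-IDF stands in for it, and pairing categories by cosine similarity is an assumption. The function names `tf_idf`, `cosine`, and `pair_categories` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Standard TF-IDF weights for tokenized documents.

    Assumption: this is plain TF-IDF, used only as a stand-in for the
    paper's TF-PIDF, whose definition is not available here.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_categories(demand, supply):
    """Pair each demand category with its most similar supply category.

    Assumption: each category is represented by a term-weight dictionary
    (for example, a cluster centroid), and similarity is cosine.
    """
    return {
        d_name: max(supply, key=lambda s_name: cosine(d_vec, supply[s_name]))
        for d_name, d_vec in demand.items()
    }
```

In this sketch, a demand category such as “epidemic” would be matched to whichever supply category shares the most heavily weighted feature words with it; the actual pairing rule in the paper may differ.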
Li Ming, Li Ying, Zhou Qing, Wang Jun. Analyzing Knowledge Demand and Supply of Community Question Answering with TF-PIDF[J]. Data Analysis and Knowledge Discovery, 2021, 5(2): 106-115.