Data Analysis and Knowledge Discovery
Expert Recommendation in Community Question Answering based on Topic Interest and Domain Authority
Li Mingzhu;Mi Chuanmin;Gou Xiaoyi;Xiao Lin
(College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
Abstract  

[Objective] This study identifies contextual topics in experts' historical Q&A texts to improve the accuracy of expert recommendation in community question answering (CQA). [Methods] We combined the Labeled-LDA model with the BERT model to vectorize experts' historical Q&A texts, making full use of tag information. Through dimension reduction and topic clustering, we identified contextual topics and obtained the probability distribution of each expert's topic interests. Based on the topic interest mining results, we constructed a topic-sensitive PageRank (TSPR) algorithm and added a user quality weight to calculate domain authority. On this basis, we proposed the TIDARank algorithm for expert recommendation in CQA. [Results] On the Stack Exchange public dataset, the BERT-LLDA model outperformed the TF-IDF, BERT, and BERT-LDA models on the silhouette coefficient (0.5756) and topic coherence (0.4766). The ACC@20 and MRR@20 of TIDARank reached 0.5807 and 0.2430, respectively, improvements of 14.53% and 8.14% over the best-performing baseline, Bi-LSTM+TSPR. [Limitations] We did not consider user activity in link analysis. [Conclusions] The BERT-LLDA model achieved better topic clustering results for question-answering texts and improved the performance of expert recommendation in CQA.
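
The topic-sensitive PageRank step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' TIDARank implementation: the function name, the asker-to-answerer edge direction, the way the quality weight enters the teleport vector, and the damping value are all assumptions made for illustration. The idea shown is only that the random-jump distribution is biased toward users with high interest in one topic, scaled by a per-user quality weight.

```python
from typing import Dict, List, Tuple


def topic_sensitive_pagerank(
    edges: List[Tuple[str, str]],
    topic_interest: Dict[str, float],
    quality: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Score users for ONE topic (illustrative sketch, hypothetical API).

    edges: (asker, answerer) pairs; authority flows from asker to answerer
        (an assumed convention, not taken from the paper).
    topic_interest: P(topic | user), e.g. produced by a topic model.
    quality: assumed non-negative per-user quality weight (default 1.0).
    """
    users = sorted({u for edge in edges for u in edge} | set(topic_interest))

    # Teleport vector: biased toward interested, high-quality users.
    raw = {u: topic_interest.get(u, 0.0) * quality.get(u, 1.0) for u in users}
    total = sum(raw.values())
    if total > 0:
        teleport = {u: raw[u] / total for u in users}
    else:
        # Fall back to a uniform jump if no user has any interest mass.
        teleport = {u: 1.0 / len(users) for u in users}

    out_links: Dict[str, List[str]] = {u: [] for u in users}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by users without outgoing links is redistributed
        # through the teleport vector so the scores keep summing to 1.
        dangling = sum(rank[u] for u in users if not out_links[u])
        new_rank = {
            u: (1.0 - damping + damping * dangling) * teleport[u] for u in users
        }
        for u in users:
            if out_links[u]:
                share = damping * rank[u] / len(out_links[u])
                for v in out_links[u]:
                    new_rank[v] += share
        rank = new_rank
    return rank
```

Running the function once per topic and combining the per-topic scores with the topic distribution of a new question would give a ranked expert list; how the paper performs that combination is not stated in the abstract.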

Key words: Community question answering; Expert recommendation; BERT; Labeled-LDA; PageRank
Published: 15 March 2024
ZTFLH:  G203,TP181  

Cite this article:

Li Mingzhu, Mi Chuanmin, Gou Xiaoyi, Xiao Lin. Expert Recommendation in Community Question Answering based on Topic Interest and Domain Authority. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2023.0433     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1
