%A Feng Xiaodong, Hui Kangxin
%T Topic Clustering for Social Media Texts with Heterogeneous Graph Neural Networks
%0 Journal Article
%D 2022
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2022.0038
%P 9-19
%V 6
%N 10
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_5530.shtml}
%8 2022-10-25
%X

[Objective] This paper develops an effective topic clustering method that addresses the semantic sparsity and multiple interaction relationships of social media texts. [Methods] We modeled the multiple interaction relationships between social media users and online content as a heterogeneous information network. First, we used a word embedding method to obtain text representations as the initial input features. Then, we propagated and aggregated node representations with a heterogeneous graph neural network. Finally, we trained the model and performed unsupervised topic clustering on the resulting text node representations. [Results] On an English benchmark data set, the model reached NMI values of 0.8372 for original posts and 0.8689 for comments, higher than those of traditional LDA and of direct clustering on word or text embedding vectors from Word2Vec, Doc2Vec, or GloVe. [Limitations] Due to data limitations, we did not examine social relationships among users or multimedia content. [Conclusions] The proposed model effectively improves topic clustering for social media texts.
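
The results above are reported as NMI (normalized mutual information), which scores how well predicted clusters agree with reference topic labels on a scale from 0 to 1. The following is a minimal sketch of that metric, not the authors' evaluation code. The abstract does not say which normalization was used, so the arithmetic mean of the two entropies is an assumption here, and the function names `entropy` and `nmi` are illustrative.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Normalized mutual information between two labelings.

    Assumption: normalize by the arithmetic mean of the two entropies.
    The abstract does not specify the normalization.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    joint = Counter(zip(true_labels, pred_labels))  # co-occurrence counts
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    # Mutual information: sum over non-empty cells of the contingency table.
    mi = sum(
        (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint.items()
    )

    denom = (entropy(true_labels) + entropy(pred_labels)) / 2
    if denom == 0:
        # Both labelings put everything in one cluster: treat as identical.
        return 1.0
    return mi / denom
```

NMI ignores how clusters are named: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` scores a perfect match because the grouping is the same, while a clustering that is statistically independent of the reference labels scores 0.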