%A Zhang Tao,Ma Haiqun %T Clustering Policy Texts Based on LDA Topic Model %0 Journal Article %D 2018 %J Data Analysis and Knowledge Discovery %R 10.11925/infotech.2096-3467.2018.0273 %P 59-65 %V 2 %N 9 %U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4552.shtml} %8 2018-09-25 %X

[Objective] This research aims to improve the effectiveness of clustering policy texts with the help of LDA topic model. [Methods] First, we pre-processed the policy texts with the LDA model to generate the training data set. Then, we used the weighted algorithm to determine the optimal number of topics and then clustered the policy texts. [Results] We found that the G value of the weighted clustering results reached peak while the k value was 4. Our results, which were consistent with those of the manual classification, also obtained higher purity and F values. Therefore, the proposed method is effective. [Limitations] Results of each operation in our study will influence the accuracy of the final policy text clustering. [Conclusions] The proposed method could provide directions for the making of new policies, the evaluation of current policies, and the mechanism of two-way interactions.