[Objective] This study aims to accurately identify latent knowledge associations in textual information and thereby enrich the methodology of knowledge mining. [Methods] We combined a topic model with association rule mining. First, we used the LDA model to extract a set of topics from the texts, which both reduced the dimensionality of the text representation and expressed the texts in a semantic space. Then, we analyzed the semantic associations among the topics using association rules. [Results] The method identified latent knowledge associations in the document texts at reasonable levels of support and confidence, improving the model's "understanding" of the textual content. [Limitations] The user-defined dictionary used during data preprocessing had some negative effects on the results. [Conclusions] The proposed method can extract latent semantic associations from unstructured textual information and could thereby improve the performance of knowledge discovery systems.
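The association step described in [Methods] can be illustrated with a minimal sketch. It assumes that LDA has already been run and that each document has been reduced to a set of dominant topic labels; the labels (`T1`, `T2`, `T3`), the function name `topic_pair_rules`, the toy data, and the thresholds are all hypothetical and are not taken from the study. Only rules between pairs of topics are considered, which is a simplification of general association rule mining.

```python
from collections import Counter
from itertools import combinations


def topic_pair_rules(doc_topics, min_support=0.3, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    doc_topics: iterable of topic-label sets, one per document
    (assumed to come from thresholding an LDA document-topic matrix).
    """
    doc_topics = [set(topics) for topics in doc_topics]
    n_docs = len(doc_topics)

    single_counts = Counter()  # number of documents containing each topic
    pair_counts = Counter()    # number of documents containing each topic pair
    for topics in doc_topics:
        single_counts.update(topics)
        pair_counts.update(combinations(sorted(topics), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n_docs  # share of documents containing both topics
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / single_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical dominant topics for four documents.
doc_topics = [
    {"T1", "T2"},
    {"T1", "T2", "T3"},
    {"T1", "T3"},
    {"T2"},
]

rules = topic_pair_rules(doc_topics, min_support=0.5, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy example, support is the share of documents in which both topics appear, and confidence is that co-occurrence count divided by the number of documents containing the antecedent topic. A full implementation would also mine rules over larger topic sets, for instance with the Apriori algorithm.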
Guangce Ruan, Lei Xia. Mining Document Topics Based on Association Rules[J]. Data Analysis and Knowledge Discovery, 2016, 32(12): 50-56. DOI: 10.11925/infotech.1003-3513.2016.12.07.