[Objective] This study aims to accurately identify latent knowledge associations in textual information and thereby enrich the methodology of knowledge mining. [Methods] We combined a topic model with association rules. First, we used the LDA model to extract a topic set from the texts, which both reduced the dimensionality of the text and expressed it in a semantic space. Then, we analyzed the semantic associations among the topics with association rules. [Results] The method found latent knowledge associations in the document texts at reasonable levels of support and confidence, which improved the model’s “understanding” of the textual content. [Limitations] During data preprocessing, the user-defined dictionary had some negative effects on the results. [Conclusions] The proposed method can extract latent semantic associations from unstructured textual information and thus improve the performance of knowledge discovery systems.
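The second stage of the method, association rules over topics, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the LDA step has already produced a document-topic weight matrix, and stands in for it with a small hand-written one. The function names, the 0.2 weight threshold, and the support and confidence cut-offs are hypothetical choices made for illustration only.

```python
from itertools import permutations


def topic_transactions(doc_topic_weights, threshold=0.2):
    """Turn each document's topic weights into a set of dominant topics.

    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    return [
        {topic for topic, weight in enumerate(weights) if weight >= threshold}
        for weights in doc_topic_weights
    ]


def pairwise_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for topic pairs."""
    n = len(transactions)
    topics = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(topics, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        if count_a == 0:
            continue
        support = count_ab / n            # share of documents containing both topics
        confidence = count_ab / count_a   # share of documents with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical document-topic weights (rows: documents, columns: topics 0..2).
doc_topic_weights = [
    [0.60, 0.30, 0.10],
    [0.50, 0.40, 0.10],
    [0.70, 0.05, 0.25],
    [0.10, 0.10, 0.80],
]

transactions = topic_transactions(doc_topic_weights)
rules = pairwise_rules(transactions)

for antecedent, consequent, support, confidence in rules:
    print(f"topic {antecedent} -> topic {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Treating each document as a "transaction" of its dominant topics is one way to connect the two stages. A real pipeline would more likely use a frequent-itemset algorithm such as Apriori than this exhaustive pair scan.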
Guangce Ruan, Lei Xia. Mining Document Topics Based on Association Rules[J]. Data Analysis and Knowledge Discovery, 2016, 32(12): 50-56.