[Objective] This study addresses the high dimensionality and sparsity that limit traditional methods for analyzing large-scale corpora. [Methods] First, we used co-occurrence probability to measure the mutual information between words and extracted the word combinations whose values exceeded a threshold. Then, we constructed the initial network from third-level entries based on syntactic structure. Finally, we applied a correction algorithm to build the text complex network that expresses topic semantics. [Results] We retrieved 6,936 micro-blog posts from the trending topic “global outbreak of network ransomware” as the experimental corpus and built a network model with 217 nodes and 2,019 edges. We also explored micro-blog topics with the new model. [Limitations] More research is needed on assigning node weights in text complex networks. [Conclusions] The proposed model effectively reduces the redundancy of network nodes and improves the semantic expression of the topic complex network.
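The first step of the method (scoring word pairs by mutual information and keeping those above a threshold) can be illustrated with a minimal sketch. The abstract does not give the exact formula, tokenization, or threshold, so everything here is an assumption: the sketch uses pointwise mutual information over document-level co-occurrence, and the function name `pmi_pairs`, the toy posts, and the threshold value are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_pairs(docs, threshold):
    """Return word pairs whose document-level PMI exceeds `threshold`.

    Assumption: two words co-occur when they appear in the same post.
    PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with probabilities
    estimated as document frequencies divided by the number of posts.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of posts containing each word
    pair_df = Counter()  # number of posts containing each word pair
    for tokens in docs:
        vocab = sorted(set(tokens))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    kept = {}
    for (a, b), n_ab in pair_df.items():
        pmi = math.log2(n_ab * n_docs / (word_df[a] * word_df[b]))
        if pmi > threshold:
            kept[(a, b)] = pmi
    return kept


# Toy, already-tokenized posts (hypothetical data, not the paper's corpus).
posts = [
    ["ransomware", "outbreak", "global"],
    ["ransomware", "outbreak", "patch"],
    ["weather", "global"],
    ["weather", "sunny"],
]
pairs = pmi_pairs(posts, threshold=0.5)
for pair, score in sorted(pairs.items()):
    print(pair, round(score, 3))
```

The retained pairs would then serve as candidate nodes or edges for the network. Note that plain PMI tends to favor rare words, so a real pipeline would likely add a minimum-frequency filter; whether the paper does so is not stated in the abstract.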
Liu Bingyao, Ma Jing, Li Xiaofeng. Topic Representation Model Based on “Feature Dimensionality Reduction”[J]. Data Analysis and Knowledge Discovery, 2017, 1(11): 53-61.