Research on Content Characteristics About Complex Network of Text
Liu Honghong1,2, An Haizhong1,2, Gao Xiangyun1,2
1. Lab of Resources and Environmental Management, China University of Geosciences, Beijing 100083, China;
2. School of Humanities and Economic Management, China University of Geosciences, Beijing 100083, China
To address the irregular structure of some texts, this paper presents a method based on complex network theory for evaluating text structure. Each sentence is represented as a node, and an edge joins two nodes when the corresponding sentences share a common word; together these form the complex network of the text. The authors then analyze the structural characteristics of the text through the topological characteristics of this network. A text complex network is built from a selected article, and its degree, node strength, shortest paths and weighted clustering coefficients are calculated. The results show that the proposed method can effectively evaluate the structure of text content. The results also provide a useful reference for understanding the main ideas of a text, generating summaries and filtering text retrieval results.
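The construction described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation, and several details are assumptions made only for illustration: sentences are tokenized by lowercasing and splitting on whitespace, the weight of an edge is taken to be the number of distinct words the two sentences share, strength is the sum of a node's incident edge weights, shortest paths are counted in hops, and the weighted clustering coefficient follows the definition of Barrat et al., which the abstract does not name. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_sentence_network(sentences):
    """Build a weighted adjacency map: node = sentence index.

    Assumption: edge weight = number of distinct words two sentences share,
    with naive lowercase/whitespace tokenization.
    """
    word_sets = [set(s.lower().split()) for s in sentences]
    adj = {i: {} for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        shared = len(word_sets[i] & word_sets[j])
        if shared:
            adj[i][j] = shared
            adj[j][i] = shared
    return adj


def degree(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def strength(adj):
    """Sum of incident edge weights of each node."""
    return {v: sum(nbrs.values()) for v, nbrs in adj.items()}


def shortest_path_lengths(adj, source):
    """Hop-count distances from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def weighted_clustering(adj, v):
    """Weighted clustering coefficient of node v (Barrat et al. definition,
    chosen here as an assumption)."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    s = sum(nbrs.values())
    total = 0.0
    for a, b in combinations(nbrs, 2):
        if b in adj[a]:  # the two neighbours are themselves connected
            total += (nbrs[a] + nbrs[b]) / 2
    # Each unordered neighbour pair appears twice in the ordered-pair sum.
    return 2 * total / (s * (k - 1))


# Toy input, invented for this sketch.
sentences = [
    "complex network theory describes text",
    "text structure forms a network",
    "sentences share words in a text",
    "unrelated remark",
]
adj = build_sentence_network(sentences)
print(degree(adj))
print(strength(adj))
print(shortest_path_lengths(adj, 0))
print({v: weighted_clustering(adj, v) for v in adj})
```

In this toy input the last sentence shares no word with the others, so it becomes an isolated node and is absent from the distances returned for node 0. Sentences with high degree and strength overlap with many others, which is the kind of signal the abstract suggests for locating main ideas.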
Liu Honghong, An Haizhong, Gao Xiangyun. Research on Content Characteristics About Complex Network of Text [J]. New Technology of Library and Information Service (现代图书情报技术), 2011, 27(1): 69-73.