Web text carries little structural information and a large amount of noise. To address this, sentences are treated as nodes and the similarities between them as edges, so that a sentence relationship map describes how the sentences of a text relate to one another. The topic sentences of a text are then obtained by searching for the nodes that have the most edges. Because short texts yield low word-frequency similarity, sentence similarity is instead defined as semantic similarity computed with a semantic dictionary. A public internet campus was chosen as the test case, and an acceptability of 80.6% was achieved.
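The graph-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `jaccard` word-overlap function stands in for the dictionary-based semantic similarity, and the names `topic_sentences`, `threshold`, and `top_k`, as well as the threshold value 0.2 and the sample sentences, are assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity (assumption): word overlap between two token lists.

    The paper uses a semantic dictionary instead; any function returning a
    score in [0, 1] can be passed in its place.
    """
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def topic_sentences(sentences, similarity=jaccard, threshold=0.2, top_k=1):
    """Rank sentences by how many edges they have in the relationship map.

    An edge links two sentences whose similarity reaches `threshold`
    (a hypothetical cut-off). Returns the indices of the `top_k` sentences
    with the most edges, plus the edge count of every sentence.
    """
    degree = [0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        if similarity(sentences[i], sentences[j]) >= threshold:
            degree[i] += 1
            degree[j] += 1
    # Highest edge count first; ties are broken by original sentence order.
    ranked = sorted(range(len(sentences)), key=lambda i: (-degree[i], i))
    return ranked[:top_k], degree


# Made-up example sentences, already tokenized.
sentences = [
    "the campus forum discusses exam schedules".split(),
    "exam schedules are posted on the campus forum".split(),
    "students ask about exam schedules".split(),
    "the cafeteria menu changed today".split(),
]
best, degree = topic_sentences(sentences)
print(best, degree)
```

Passing the similarity function as a parameter keeps the graph step independent of how similarity is measured, so a dictionary-based semantic measure could replace the word-overlap stand-in without changing the ranking logic.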
He Wei (何维), Wang Yu (王宇). Extracting Topic Sentences from Web Text Based on Sentence Relationship Map [J]. New Technology of Library and Information Service (现代图书情报技术), 2009, 3(3): 57-61.