New Technology of Library and Information Service  2007, Vol. 2 Issue (6): 38-41    DOI: 10.11925/infotech.1003-3513.2007.06.09
Information Extraction Based on Calculation of Sentence Similarity
Lian Zhanjun, Lv Xueqiang, Zhang Yujie 2, Shi Shuicai 1
1 (Chinese Information Processing Research Center, Beijing Information Science and Technology University, Beijing 100101, China)
2 (College of Information Science and Engineering, Dalian Polytechnic University, Dalian 116011, China)
This paper proposes a new method of information extraction based on sentence similarity calculation. The topic of each sentence in the test text is labeled by computing its similarity to other sentences. Accuracy is improved by taking into account the probability distribution of sentences within the documents. Statistical results are reported on personal information resources collected from the Internet.

Key words: Information extraction; Probability distribution; Topic; Sentence similarity calculation
Received: 10 May 2007      Published: 25 June 2007
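
The approach summarized in the abstract, labeling a sentence's topic by similarity and then weighting by where such sentences tend to occur in a document, might be sketched as follows. The abstract does not say which similarity measure is used or how the distribution is applied, so this is only an illustration under assumptions: cosine similarity over token counts stands in for the similarity measure, and a per-topic prior over sentence positions stands in for the distribution. The names `cosine_similarity`, `label_sentence`, and `position_prior` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two token lists, using raw term counts.

    Assumption: the paper's actual similarity measure is not given.
    """
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_sentence(tokens, position, labeled, position_prior, floor=1e-6):
    """Pick the topic with the highest similarity-times-prior score.

    labeled:        list of (tokens, topic) pairs with known topics.
    position_prior: topic -> {position: probability}; a hypothetical stand-in
                    for the distribution of sentences within documents.
    floor:          small probability used for unseen (topic, position) pairs.
    """
    best_sim = {}
    for known_tokens, topic in labeled:
        sim = cosine_similarity(tokens, known_tokens)
        best_sim[topic] = max(best_sim.get(topic, 0.0), sim)
    scores = {
        topic: sim * position_prior.get(topic, {}).get(position, floor)
        for topic, sim in best_sim.items()
    }
    return max(scores, key=scores.get)
```

With two labeled sentences, one per topic, a test sentence that is equally similar to both is assigned whichever topic is more probable at its position, which is the sense in which the positional distribution can raise accuracy in this sketch.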



