Information Extraction Based on Calculation of Sentence Similarity
Lian Zhanjun¹, Lv Xueqiang¹, Zhang Yujie², Shi Shuicai¹
¹ Chinese Information Processing Research Center, Beijing Information Science and Technology University, Beijing 100101, China
² College of Information Science and Engineering, Dalian Polytechnic University, Dalian 116011, China

Abstract: This paper presents a new information extraction method based on sentence similarity calculation. Sentence similarity is used to label the topics of the sentences in the test text. Accuracy is improved by taking into account the probability distribution of sentences within documents. Statistical results are reported for experiments on personal information resources collected from the Internet.
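
The abstract does not spell out the similarity measure or how the sentence distribution enters the decision, so the following is only a minimal sketch of the general idea under stated assumptions: cosine similarity over token counts stands in for the paper's similarity measure, a hypothetical per-topic position prior (`position_prior`) stands in for the probability distribution of sentences in documents, and the two are blended linearly with a hypothetical `weight`. None of these names or choices come from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_sentence(tokens, position, labeled, position_prior, weight=0.5):
    """Assign a topic to a tokenized sentence.

    labeled        -- list of (topic, tokens) pairs with known topics (assumed input)
    position_prior -- {topic: {position: probability}}, a hypothetical stand-in
                      for how sentences of each topic are distributed in documents
    weight         -- hypothetical blend factor between similarity and prior
    """
    # Best similarity to any labeled example, per topic.
    best_sim = {}
    for topic, example in labeled:
        sim = cosine_similarity(tokens, example)
        best_sim[topic] = max(best_sim.get(topic, 0.0), sim)

    def score(topic):
        prior = position_prior.get(topic, {}).get(position, 0.0)
        return (1 - weight) * best_sim[topic] + weight * prior

    return max(best_sim, key=score)


# Illustrative data only; not taken from the paper.
labeled = [
    ("education", ["graduated", "from", "university"]),
    ("contact", ["email", "phone", "address"]),
]
position_prior = {
    "education": {0: 0.7, 1: 0.3},
    "contact": {0: 0.2, 1: 0.8},
}

print(label_sentence(["he", "graduated", "from", "peking", "university"],
                     0, labeled, position_prior))
```

In this sketch the prior matters most when similarity alone is ambiguous: a sentence equally similar to two topics is assigned the topic that is more probable at its position in the document.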

Received: 10 May 2007
Published: 25 June 2007

Corresponding author: Lian Zhanjun
E-mail: dikk12345678@gmail.com

About the authors: Lian Zhanjun, Lv Xueqiang, Zhang Yujie, Shi Shuicai