New Technology of Library and Information Service, 2007, Vol. 2, Issue 6: 38-41. DOI: 10.11925/infotech.1003-3513.2007.06.09
Information Extraction Based on Calculation of Sentence Similarity
Lian Zhanjun, Lv Xueqiang, Zhang Yujie2, Shi Shuicai1
1 (Chinese Information Processing Research Center, Beijing Information Science and Technology University, Beijing 100101, China)
2 (College of Information Science and Engineering, Dalian Polytechnic University, Dalian 116011, China)
Abstract  

This paper presents a new method of information extraction based on sentence similarity calculation. The topic of each sentence in the test text is labeled by calculating sentence similarity. Accuracy is improved by taking into account the probability distribution of sentences within the documents. Experimental statistics are reported on personal information resources collected from the Internet.
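The abstract does not say which similarity measure is used or how the sentence distribution enters the decision, so the following is only a minimal sketch of the general idea under stated assumptions. It uses Jaccard token overlap as a stand-in similarity measure and a smoothed estimate of P(topic | relative position in the document) as a stand-in for the probability distribution of sentences. All function names, the bucket count, and the scoring rule (similarity multiplied by the positional prior) are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def jaccard(a, b):
    """Token-set overlap between two sentences (assumed stand-in similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def position_bucket(index, total, buckets=3):
    """Map a sentence index to a coarse relative-position bucket."""
    return min(index * buckets // total, buckets - 1)


def train_position_prior(labeled_docs, topics, buckets=3):
    """Estimate P(topic | bucket) from labeled documents, add-one smoothed."""
    counts = defaultdict(Counter)
    for doc in labeled_docs:
        for i, (_, topic) in enumerate(doc):
            counts[position_bucket(i, len(doc), buckets)][topic] += 1
    prior = {}
    for b in range(buckets):
        total = sum(counts[b].values()) + len(topics)
        prior[b] = {t: (counts[b][t] + 1) / total for t in topics}
    return prior


def label_document(sentences, labeled_docs, topics, buckets=3):
    """Label each sentence with the topic maximizing similarity * prior."""
    prior = train_position_prior(labeled_docs, topics, buckets)
    examples = [pair for doc in labeled_docs for pair in doc]
    labels = []
    for i, sent in enumerate(sentences):
        b = position_bucket(i, len(sentences), buckets)
        # Best similarity to any labeled example of each topic.
        best_sim = {t: 0.0 for t in topics}
        for text, topic in examples:
            best_sim[topic] = max(best_sim[topic], jaccard(sent, text))
        # The small constant lets the prior decide when nothing overlaps.
        labels.append(
            max(topics, key=lambda t: (best_sim[t] + 1e-6) * prior[b][t])
        )
    return labels
```

Here `labeled_docs` is a list of documents, each a list of `(sentence, topic)` pairs, and `sentences` is the unlabeled document to process. For Chinese text, the whitespace tokenization in `jaccard` would need to be replaced by a word segmenter.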

Key words: Information extraction; Probability distribution; Topic; Sentence similarity calculation
Received: 10 May 2007      Published: 25 June 2007
ZTFLH: TP391
Corresponding Authors: Lian Zhanjun     E-mail: dikk12345678@gmail.com
About author: Lian Zhanjun, Lv Xueqiang, Zhang Yujie, Shi Shuicai

Cite this article:

Lian Zhanjun, Lv Xueqiang, Zhang Yujie, Shi Shuicai. Information Extraction Based on Calculation of Sentence Similarity. New Technology of Library and Information Service, 2007, 2(6): 38-41.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.06.09     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I6/38

