New Technology of Library and Information Service  2011, Vol. Issue (11): 48-53    DOI: 10.11925/infotech.1003-3513.2011.11.08
Research of Title Party News Identification Technology Based on Topic Sentence Similarity
Wang Zhichao¹, Weng Nan², Wang Yu³
1. Institute of Information Science & Technology, Shanghai Jiaotong University, Shanghai 200240, China;
2. School of Management & Engineering, Nanjing University, Nanjing 210093, China;
3. School of Management, Dalian University of Technology, Dalian 116024, China
Abstract  To address the growing volume of "title party" news on the Web (clickbait whose headline does not match the article content), this paper presents a new algorithm for identifying such news. First, it analyzes the composition of news pages and proposes an approach to news title extraction and information extraction based on the features of news pages. Second, to extract coherent topic sentences from news pages, it proposes a topic sentence extraction algorithm that starts from the relationship matrix of sentences. Third, it computes the similarity between the extracted news title and the candidate set of topic sentences; this similarity value is the main basis for judging whether an item is title party news. Finally, experimental results show that the method is effective and feasible.
Key words: Title party news; News title extraction; News information extraction; Sentence similarity computing
Received: 16 September 2011      Published: 06 January 2012
CLC number: TP391
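
The pipeline outlined in the abstract (build a sentence relationship matrix, pick topic sentences, compare them with the title) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not specify how the matrix is built, how sentences are ranked, or what threshold is used. Here the matrix holds bag-of-words cosine similarities, sentences are ranked by their row sums, and the function names, `k`, and `threshold` are all hypothetical. Whitespace tokenization is also an assumption; Chinese text would need a word segmenter.

```python
import math
from collections import Counter


def bag_of_words(text):
    # Assumption: lowercase whitespace tokens; Chinese text needs a segmenter.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relationship_matrix(sentences):
    # Assumption: entry (i, j) is the cosine similarity of sentences i and j,
    # with the diagonal set to zero so a sentence does not vote for itself.
    vectors = [bag_of_words(s) for s in sentences]
    n = len(vectors)
    return [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def topic_sentences(sentences, k=2):
    # Assumption: a sentence's importance is the sum of its matrix row.
    # The k highest-scoring sentences are returned in document order.
    scores = [sum(row) for row in relationship_matrix(sentences)]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def title_similarity(title, sentences, k=2):
    # Best cosine match between the title and any candidate topic sentence.
    title_vec = bag_of_words(title)
    candidates = topic_sentences(sentences, k)
    return max((cosine(title_vec, bag_of_words(s)) for s in candidates), default=0.0)


def is_title_party(title, sentences, k=2, threshold=0.2):
    # Hypothetical decision rule: flag the item when the title is too
    # dissimilar from every extracted topic sentence.
    return title_similarity(title, sentences, k) < threshold
```

In this sketch a headline that shares no vocabulary with the highest-ranked body sentences scores 0 and is flagged, while a headline that restates them scores well above the cutoff. The threshold would need tuning on labeled data.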

Cite this article:

Wang Zhichao, Weng Nan, Wang Yu. Research of Title Party News Identification Technology Based on Topic Sentence Similarity. New Technology of Library and Information Service, 2011, (11): 48-53.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2011.11.08     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2011/V/I11/48
