New Technology of Library and Information Service (现代图书情报技术)  2011, Vol. 27 Issue (4): 52-57     https://doi.org/10.11925/infotech.1003-3513.2011.04.09
Intelligence Analysis and Research
Research on the Detection of Sudden Events in News Stories of Online Information
Yao Zhanlei (姚占雷), Xu Xin (许鑫)
Department of Informatics, East China Normal University, Shanghai 200241, China
Abstract: To capture sudden events promptly and accurately, this paper introduces the idea of the Distance between two Segmental Words (DSW) and builds a model for detecting sudden events in Internet news reports. The model has two main parts: discovering Hot Elements of Terms (HET) and detecting new words. Specifically, an improved TF-PDF algorithm captures the terms currently receiving attention to form the hot elements of terms. DSW is then used to find how these hot terms are actually distributed relative to one another, and events are detected from the relatively stable combinations among the hot terms. Experiments show that the model is highly sensitive in time when detecting sudden events.
Key words: Event detection    Hot element of terms    Distance between two segmental words
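The abstract relies on TF-PDF weighting to find hot terms. As a rough illustration only, the sketch below implements the baseline TF-PDF weight as it is commonly described (Bun and Ishizuka), not the improved variant used in this paper, whose details are not given on this page. In the baseline form, a term's weight is the sum over news channels of its normalized frequency in the channel multiplied by the exponential of the share of that channel's documents containing it. The function name `tf_pdf`, the channel names, and the toy tokens are assumptions made for the example.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Baseline TF-PDF weights (illustrative sketch, not the paper's improved version).

    channels: dict mapping a channel name to a list of documents,
              where each document is a list of already-segmented tokens.
    Returns a dict mapping each term to its weight.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue
        term_freq = Counter()  # occurrences of each term in this channel
        doc_freq = Counter()   # number of documents in this channel containing the term
        for doc in docs:
            term_freq.update(doc)
            doc_freq.update(set(doc))
        # Normalize frequencies by the Euclidean norm of the channel's frequency vector.
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue
        for term, freq in term_freq.items():
            share = doc_freq[term] / len(docs)
            weights[term] += (freq / norm) * math.exp(share)
    return dict(weights)


# Hypothetical toy input: two channels, two short documents each.
channels = {
    "site_a": [["earthquake", "rescue"], ["earthquake", "market"]],
    "site_b": [["earthquake", "weather"], ["sports", "weather"]],
}
weights = tf_pdf(channels)
for term, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term}\t{w:.3f}")
```

A term reported by many documents across several channels accumulates weight from each channel, which is why "earthquake" ranks above "weather" in this toy input even though both occur three times overall.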
Received: 2011-03-21      Published: 2011-06-11
CLC number: G354
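Funding: This work is one of the research outcomes of the Shanghai Social Science Planning Project "Research on Online Public Opinion Feedback on Open Government Information" (No. 2009ETQ001) and the Ministry of Education Humanities and Social Sciences Research Project "Research on the Evolution Mechanism of Online Public Opinion in Public Crisis Communication" (No. 09YJC860011).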

Cite this article:
Yao Zhanlei, Xu Xin. Research on the Detection of Sudden Events in News Stories of Online Information[J]. New Technology of Library and Information Service, 2011, 27(4): 52-57.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2011.04.09      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2011/V27/I4/52