New Technology of Library and Information Service (现代图书情报技术)  2012, Issue (10): 21-27     https://doi.org/10.11925/infotech.1003-3513.2012.10.04
Knowledge Organization and Knowledge Management
A Survey of Burst Topic Detection Towards Social Text Stream Data
Le Xiaoqiu (乐小虬)1, Hong Na (洪娜)2
1. National Science Library, Chinese Academy of Sciences, Beijing 100190, China;
2. Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract: Social text stream data are rich in contextual information, use informal language, and involve very large numbers of participating users. Detecting burst topics in such data calls for new approaches. This paper systematically reviews the concepts and characteristics of social text stream data and the forms in which burst topics are expressed. It outlines the main research ideas and the basic workflow of burst topic detection along three dimensions (textual content, time, and social), and analyzes the main techniques that exploit social features, such as user participation, social context, and community structure, for burst topic detection.
Key words: Social text stream; Burst topic detection; Social network
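To make the three dimensions named in the abstract concrete, the following is a minimal illustrative sketch, not a method taken from the surveyed paper. It assumes posts arrive as `(window_index, user_id, text)` tuples, splits text on whitespace (content), scores each term by a z-score of its current post count against earlier windows (time), and weights that score by the share of mentioning posts that come from distinct users (social). The function name `burst_scores`, the input format, and the scoring formula are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def burst_scores(posts, current_window, min_history=2):
    """Score terms for burstiness in `current_window`.

    posts: iterable of (window_index, user_id, text) tuples.
    This input format is a hypothetical one chosen for the sketch.
    """
    # counts[term][window] = number of posts mentioning the term
    counts = defaultdict(lambda: defaultdict(int))
    # users[term][window] = set of distinct users mentioning the term
    users = defaultdict(lambda: defaultdict(set))

    for window, user, text in posts:
        # Content dimension: naive whitespace tokenization, one count per post
        for term in set(text.lower().split()):
            counts[term][window] += 1
            users[term][window].add(user)

    scores = {}
    for term, per_window in counts.items():
        # Temporal dimension: compare the current window with all earlier ones
        history = [per_window.get(w, 0) for w in range(current_window)]
        if len(history) < min_history or current_window not in per_window:
            continue
        mu = mean(history)
        sigma = pstdev(history) or 1.0  # flat history: fall back to 1.0
        z = (per_window[current_window] - mu) / sigma

        # Social dimension: fraction of mentioning posts from distinct users
        breadth = len(users[term][current_window]) / per_window[current_window]
        scores[term] = z * breadth
    return scores
```

A term repeated many times by a single user receives a lower score than one mentioned once each by many users, which is one simple way to let user participation influence the ranking. Real systems would replace the whitespace tokenizer and the z-score with the more robust models discussed in the literature.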
Received: 2012-09-25      Published: 2013-01-24
CLC number: TP393
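Funding: This paper is one of the research outcomes of the National Social Science Foundation of China project "Research on Methods for Monitoring and Analyzing Burst Topics in Online Scientific and Technical Information" (Project No. 09BTQ035).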

Corresponding author: Le Xiaoqiu     E-mail: lexq@mail.las.ac.cn
Cite this article:
Le Xiaoqiu, Hong Na. A Survey of Burst Topic Detection Towards Social Text Stream Data[J]. New Technology of Library and Information Service, 2012, (10): 21-27.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2012.10.04      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2012/V/I10/21