Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (4): 13-24    DOI: 10.11925/infotech.2096-3467.2020.1164
Research Trends of Information Retrieval: Case Study of SIGIR Conference Papers
Li Yueyan, Wang Hao, Deng Sanhong, Wang Wei
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering & Knowledge Service, Nanjing 210023, China
Abstract  

[Objective] This paper summarizes research trends in information retrieval, aiming to promote interdisciplinary studies and the application of related technologies. [Methods] First, we used an LDA model to identify the topics of papers accepted by the SIGIR Annual Conference from 2008 to 2019. Second, we removed irrelevant papers based on the similarity between documents and topics, and assigned papers to multiple categories by calculating topic discrimination. Third, we constructed the evolution path of domain topics as a time series, which revealed rising, declining, and stable patterns. Finally, we created a fine-grained evolution path for a single topic through modularity-based community detection, which showed how the knowledge units within the topic evolve over time. [Results] The proposed method keeps irrelevant documents from interfering with the identification of topics and evolution paths. The multi-topic classification of documents helps reveal the cross-fusion among topics. Current research trends in information retrieval include user-centric research, continuously optimized models, filtering and recommendation, Semantic Web technology, deep learning methods, and medical and health information retrieval. [Limitations] Removing irrelevant documents and assigning documents to multiple topics may involve subjective judgment. [Conclusions] Intelligent information services are becoming the new norm, and users' needs for information retrieval are becoming more prominent.
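
The filtering, multi-topic assignment, and trend steps described in [Methods] can be illustrated with a minimal sketch. The abstract does not say how document-topic similarity or topic discrimination are computed, so everything below is an assumption made for illustration: the fixed `min_weight` threshold, the least-squares slope rule, the tolerance `tol`, and all function names are hypothetical, and the toy distributions stand in for the output of a fitted LDA model.

```python
from collections import defaultdict

def assign_topics(dist, min_weight):
    """Indices of topics whose weight reaches min_weight (assumed rule)."""
    return [k for k, w in enumerate(dist) if w >= min_weight]

def yearly_topic_share(docs, min_weight=0.4):
    """docs: iterable of (year, topic_distribution) pairs.

    Documents with no topic above the threshold are treated as irrelevant
    and dropped; the rest may count toward several topics at once.
    Returns ({topic: {year: share of kept papers}}, number_dropped).
    """
    kept_per_year = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    dropped = 0
    for year, dist in docs:
        topics = assign_topics(dist, min_weight)
        if not topics:
            dropped += 1
            continue
        kept_per_year[year] += 1
        for k in topics:
            counts[k][year] += 1
    shares = {
        k: {y: counts[k].get(y, 0) / n for y, n in sorted(kept_per_year.items())}
        for k in counts
    }
    return shares, dropped

def trend(series, tol=0.01):
    """Label a {year: share} series by its least-squares slope (assumed rule)."""
    years = sorted(series)
    mean_x = sum(years) / len(years)
    mean_y = sum(series[y] for y in years) / len(years)
    den = sum((y - mean_x) ** 2 for y in years)
    num = sum((y - mean_x) * (series[y] - mean_y) for y in years)
    slope = num / den if den else 0.0
    if slope > tol:
        return "rising"
    if slope < -tol:
        return "falling"
    return "stable"

# Toy document-topic distributions over three hypothetical topics.
docs = [
    (2008, [0.70, 0.20, 0.10]),
    (2008, [0.60, 0.30, 0.10]),
    (2009, [0.50, 0.40, 0.10]),  # counted under topics 0 and 1
    (2009, [0.10, 0.80, 0.10]),
    (2010, [0.10, 0.85, 0.05]),
    (2010, [0.05, 0.90, 0.05]),
    (2010, [0.34, 0.33, 0.33]),  # no dominant topic, dropped
]
shares, dropped = yearly_topic_share(docs, min_weight=0.4)
labels = {k: trend(series) for k, series in shares.items()}
```

A single threshold is used here for both steps only to keep the sketch short; filtering and multi-topic assignment could equally use separate criteria.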

Keywords: Information Retrieval; LDA; Social Network Analysis; Topic Evolution
Received: 25 November 2020      Published: 15 December 2020
CLC Number (ZTFLH): G250
Fund: National Natural Science Foundation of China (72074108); Fundamental Research Funds for the Central Universities (010814370113)
Corresponding Author: Wang Hao     E-mail: ywhaowang@nju.edu.cn

Cite this article:

Li Yueyan, Wang Hao, Deng Sanhong, Wang Wei. Research Trends of Information Retrieval: Case Study of SIGIR Conference Papers. Data Analysis and Knowledge Discovery, 2021, 5(4): 13-24.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1164     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I4/13

Approximate Perplexity
Intertopic Distance Map
No. | Topic ID | Topic Label | Terms
1 | topic1 | Mining and modeling search activity | search user web engine behaviour session
2 | topic2 | Learning to rank and ranking models | rank learning model feature train algorithm
3 | topic3 | Term representation | term model retrieval document query weight
4 | topic4 | Filtering and recommendation | recommendation user item model collaborative system
5 | topic5 | Interactive information retrieval | system tutorial user application interface interactive
6 | topic6 | Cross-language information retrieval | language cross natural translation processing wikipedia
7 | topic7 | Retrieval evaluation | collection test system evaluation performance effectiveness
8 | topic8 | Deep learning | network model neural learn representation embed
9 | topic9 | Web search | click model advertisement privacy rate online
10 | topic10 | Image search | image tag visual annotation video content
11 | topic11 | Social search | social user news medium twitter content
12 | topic12 | Question answering systems | question answer expert community expertise collaborative
13 | topic13 | Queries and query analysis | query suggestion completion context auto log
14 | topic14 | Classification | model text classification semantic representation document
15 | topic15 | Search diversification | document diversity search aspect diversification rank
16 | topic16 | * | topic hash random similarity code walk
17 | topic17 | Retrieval efficiency and scalability | search time engine efficiency algorithm index
18 | topic18 | Clustering | time cluster feedback temporal tweet pseudo
19 | topic19 | Semantic Web information retrieval | entity knowledge link graph base recognition
20 | topic20 | Evaluation metrics | metric measure evaluation gain framework discount
21 | topic21 | Document summarization | document sentence summarization summary multi level
22 | topic22 | Sentiment analysis | web sentiment location opinion review mining
23 | topic23 | Music retrieval | music passage sequence detection local similarity
24 | topic24 | Relevance assessment | judgment assessment crowdsourcing assessor label distribution
25 | topic25 | Medical information search | medical match content video keyword domain
Topic-Terms Distribution
Topics on the Rise
Topics of a Downward Trend
Topics of a Stable Trend
Heat of Topic
The Dynamic Evolution of Knowledge Structure Units in the “Filtering and Recommendation” Topic Community