Research Trends of Information Retrieval: Case Study of SIGIR Conference Papers
Li Yueyan, Wang Hao, Deng Sanhong, Wang Wei
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering & Knowledge Service, Nanjing 210023, China
[Objective] This paper summarizes research trends in information retrieval, aiming to promote interdisciplinary studies and the application of related technologies. [Methods] First, we used the LDA model to identify the topics of papers accepted by the SIGIR Annual Conference from 2008 to 2019. Second, we removed irrelevant papers based on the similarity between documents and topics, and assigned each remaining paper to one or more topics by calculating topic discrimination. Third, we constructed the evolution paths of the domain topics as time series, which showed increasing, decreasing, and stable patterns. Finally, we built a fine-grained evolution path for a single topic through modularity-based communities, which revealed the dynamic evolution of knowledge units within the topic. [Results] The proposed method prevents irrelevant documents from interfering with the identification of topics and evolution paths. Multi-topic classification of documents helps reveal the cross-fusion among topics. Current trends in information retrieval research include user-centered research, continuous optimization of retrieval models, filtering and recommendation, semantic web technology, deep learning methods, and medical and health information retrieval. [Limitations] Removing irrelevant documents and assigning documents to multiple topics may be subjective. [Conclusions] Intelligent information services are becoming the new norm, and users' needs for information retrieval are becoming more prominent.
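The filtering, multi-topic assignment, and trend steps described in the abstract can be illustrated with a minimal Python sketch. It assumes that a document-topic distribution has already been produced by an LDA model. The function names, the two thresholds (`relevance_min`, `label_min`), the slope tolerance `tol`, and the toy data are all illustrative assumptions rather than values or procedures taken from the paper, and the least-squares slope is only one plausible way to separate increasing, decreasing, and stable topics.

```python
from collections import defaultdict


def filter_and_label(doc_topics, relevance_min=0.3, label_min=0.2):
    """Drop documents with no dominant topic; label the rest with one or more topics.

    doc_topics maps doc_id -> [p(topic_0 | doc), p(topic_1 | doc), ...].
    Both thresholds are hypothetical, chosen only for illustration.
    """
    labels = {}
    for doc_id, dist in doc_topics.items():
        if max(dist) < relevance_min:
            continue  # no topic describes this document well: treat as irrelevant
        labels[doc_id] = [k for k, p in enumerate(dist) if p >= label_min]
    return labels


def yearly_topic_share(labels, doc_year, n_topics):
    """For each year, the share of retained papers that carry each topic label."""
    counts = defaultdict(lambda: [0] * n_topics)
    totals = defaultdict(int)
    for doc_id, topics in labels.items():
        year = doc_year[doc_id]
        totals[year] += 1
        for k in topics:
            counts[year][k] += 1
    return {y: [c / totals[y] for c in counts[y]] for y in sorted(totals)}


def trend(values, tol=0.01):
    """Classify a yearly series by the sign of its least-squares slope."""
    n = len(values)
    if n < 2:
        return "stable"  # a single point carries no trend information
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    slope = num / den
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "stable"


# Toy data: three papers, three topics (purely illustrative).
doc_topics = {
    "p1": [0.80, 0.15, 0.05],
    "p2": [0.34, 0.33, 0.33],  # flat distribution: filtered out below
    "p3": [0.45, 0.45, 0.10],  # two strong topics: multi-topic paper
}
doc_year = {"p1": 2008, "p2": 2008, "p3": 2009}

labels = filter_and_label(doc_topics, relevance_min=0.4, label_min=0.2)
shares = yearly_topic_share(labels, doc_year, n_topics=3)
topic1_series = [shares[y][1] for y in sorted(shares)]
print(labels, trend(topic1_series))
```

Counting a multi-topic paper once under each of its labels lets the yearly shares of different topics overlap, which is one simple way to make cross-topic fusion visible in the time series.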
Li Yueyan, Wang Hao, Deng Sanhong, Wang Wei. Research Trends of Information Retrieval: Case Study of SIGIR Conference Papers[J]. Data Analysis and Knowledge Discovery, 2021, 5(4): 13-24. (Published in Chinese in 数据分析与知识发现 under the title "Research Hotspots and Evolution Trends of Information Retrieval in the Past Decade: An Analysis of SIGIR Conference Papers".)
No. | Topic | Label | Top terms
 |  |  | document sentence summarization summary multi level
22 | topic22 | Sentiment analysis | web sentiment location opinion review mining
23 | topic23 | Music retrieval | music passage sequence detection local similarity
24 | topic24 | Relevance assessment | judgment assessment crowdsourcing assessor label distribution
25 | topic25 | Medical information search | medical match content video keyword domain
Table 1 Topic-term distribution
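Fig. 3 Topics with an increasing trend
Fig. 4 Topics with a decreasing trend
Fig. 5 Topics with a stable trend
Fig. 6 Topic popularity spectrum
Fig. 7 Dynamic evolution of knowledge-structure units within the "Filtering and Recommendation" topic community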
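As an illustration of the community structure behind Fig. 7, the Python sketch below scores a partition of a small term co-occurrence network with Newman's modularity. It assumes that the communities in the paper come from modularity-based community detection. The network, the terms, and the partition are hypothetical and are not data from the study. The formula used is Q = sum over communities c of [ w_in(c) / m - (deg(c) / 2m)^2 ], where m is the total edge weight, w_in(c) is the weight of edges inside c, and deg(c) is the summed weighted degree of the nodes in c.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of an undirected weighted graph.

    edges: list of (u, v, weight) tuples, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = sum(w for _, _, w in edges)   # total edge weight
    intra = defaultdict(float)        # edge weight inside each community
    degree = defaultdict(float)       # summed weighted degree per community
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical co-occurrence network: two term triangles joined by one edge.
edges = [
    ("recommendation", "rating", 1), ("rating", "collaborative", 1),
    ("recommendation", "collaborative", 1),
    ("filtering", "spam", 1), ("spam", "classifier", 1),
    ("filtering", "classifier", 1),
    ("collaborative", "filtering", 1),  # bridge between the two groups
]
two_groups = {
    "recommendation": "A", "rating": "A", "collaborative": "A",
    "filtering": "B", "spam": "B", "classifier": "B",
}
one_group = {node: "A" for node in two_groups}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
print(q_split, q_merged)
```

A community detection algorithm searches for the partition with the highest score. Here the two-triangle split scores higher than the single merged group, which matches the visible structure of the toy network.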