New Technology of Library and Information Service  2016, Vol. 32 Issue (5): 1-8     https://doi.org/10.11925/infotech.1003-3513.2016.05.01
Review
Review of Citation-based Automatic Summarization Studies
Liu Tianyi, Bu Yi, Zhao Danqun, Huang Wenbin
Department of Information Management, Peking University, Beijing 100871, China
Abstract

[Objective] This paper reviews the mainstream methods and procedures used in international research on citation-based summarization (CBS). [Coverage] We selected major CBS studies published since 2007, together with earlier advances in automatic summarization and citation analysis. [Methods] Based on a literature survey, we introduce the basic concepts of the field and the application of natural language processing methods to CBS. [Results] Citation sentences (citances) play important summarizing, indicative, and linking roles in summarization practice, and offer certain advantages over sentences selected at random from scientific papers. [Limitations] We did not compare the current achievements of CBS with the results that might be attained under ideal circumstances. [Conclusions] CBS extends the research scope of automatic summarization and traditional informetrics, and calls for improvements to the existing evaluation schemes for automatic summarization. It also raises new questions, such as how to expand citation windows and how to construct corpora. We discuss these issues and outline prospects for future CBS research.

Key words: Automatic summarization; Citation-based summarization; Citance; Natural language processing
Received: 2015-10-21      Published: 2016-06-24
Cite this article:
Liu Tianyi, Bu Yi, Zhao Danqun, Huang Wenbin. Review of Citation-based Automatic Summarization Studies[J]. New Technology of Library and Information Service, 2016, 32(5): 1-8.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.05.01      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2016/V32/I5/1