Summarizing Figures of Chinese Scholarly Articles of Library and Information Science
Bao Chuhan1, Jia Danping1, He Lin1,2, Ma Xiaowen1, Ai Yuxi1
1College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China
2Research Center for Correlation of Domain Knowledge, Nanjing Agricultural University, Nanjing 210095, China
[Objective] This paper studies the figures in Chinese scholarly articles in the field of library and information science (LIS), aiming to establish new principles for summarizing them. [Methods] We proposed a framework and rules for figure summarization based on manual indexing and the features of LIS papers. We then evaluated the performance of the new system using SPSS. [Results] Compared with the existing figure-text model, our method could process information from the figures more effectively. [Limitations] More information needs to be extracted from the figures, the influence of different chart types needs to be analyzed, and automatic indexing functions need to be added to the new system. [Conclusions] The proposed method could effectively summarize figures in scholarly articles.
Bao Chuhan, Jia Danping, He Lin, Ma Xiaowen, Ai Yuxi. Summarizing Figures of Chinese Scholarly Articles of Library and Information Science[J]. Data Analysis and Knowledge Discovery, 2017, 1(10): 21-31.