Identifying and Extracting Figures and Tables from Academic Literature Based on YOLOv5-ECA-BiFPN
Li Yingqun1,2, Li Yafei1,2, Pei Lei1,2, Hu Zhiwei1,2, Song Ningyuan1,2
1 School of Information Management, Nanjing University, Nanjing 210023, China
2 Laboratory of Data Intelligence and Cross Innovation, Nanjing University, Nanjing 210023, China

Abstract [Objective] This paper aims to accurately identify and extract figures and tables from academic literature, in order to promote the dissemination of academic achievements. [Methods] First, we introduced the ECA channel attention module into the YOLOv5 algorithm and replaced its PAN module with BiFPN. Then, we randomly selected 1,300 scholarly articles from thirteen subjects as experimental data and converted them to high-quality images using poppler-0.68.0. Finally, we evaluated the performance of the new algorithm on this dataset. [Results] Compared with the second-best algorithm, the F1 value of the new model improved by 1.99%, reaching 99.88% on the dataset. [Limitations] The scope and quantity of the annotated data need to be expanded to cover more scenarios. [Conclusions] YOLOv5-ECA-BiFPN can effectively improve the recognition of figures and tables in academic journals.
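The two architectural changes named in the Methods can be illustrated with a minimal, dependency-free sketch. It follows the published descriptions of ECA (global average pooling, a 1D convolution across channels, a sigmoid gate) and of BiFPN's fast normalized fusion, not the authors' code. The function names, the default values `gamma=2`, `b=1`, and `eps=1e-4`, and the use of hand-supplied kernel and fusion weights (learned during training in a real network) are assumptions made for illustration; the abstract does not say where in YOLOv5 the ECA module is inserted.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size as described for ECA: odd value near (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_attention(feature_map, kernel_weights):
    """Apply ECA-style channel attention.

    feature_map: list of channels, each an H x W list of lists.
    kernel_weights: odd-length list standing in for the learned 1D kernel (assumption).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Local cross-channel interaction: zero-padded 1D convolution over the descriptors.
    k = len(kernel_weights)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = []
    for c in range(len(pooled)):
        z = sum(kernel_weights[j] * padded[c + j] for j in range(k))
        gates.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid gate in (0, 1)
    # Excite: rescale every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]


def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fusion: ReLU-clipped weights normalized by their sum plus eps.

    features: equal-length flat lists, assumed already resized to one resolution.
    """
    w = [max(0.0, x) for x in weights]
    total = eps + sum(w)
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / total
        for i in range(len(features[0]))
    ]
```

For example, `eca_kernel_size(256)` selects a kernel spanning 5 neighbouring channels, and `fast_normalized_fusion` with equal weights reduces to an approximate element-wise average of its inputs.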
Received: 08 September 2022
Published: 22 March 2023
Fund: Research Project of Laboratory of Data Intelligence and Cross Innovation of Nanjing University
Corresponding Author: Li Yafei, ORCID: 0000-0003-1754-2300, E-mail: dg20140013@smail.nju.edu.cn