Identifying Academic Creative Concept Topics Based on Future Work of Scientific Papers
Song Ruoxuan, Qian Li, Du Yu
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] This paper analyzes future-work sentences in scientific papers, aiming to automatically generate academic innovation ideas. [Methods] First, we combined rule matching with BERT to extract future-work sentences from papers. Then, we performed an expansion calculation over papers in related fields to identify keywords and papers related to the future directions. Finally, these raw materials for innovation were fed into a UniLM-based model to generate innovation concept topics. [Results] The generated topics received an average innovation score of 6.04 and an average interest score of 6.01. [Limitations] The topic generation model does not incorporate prior semantic knowledge and was not tested on large-scale data, and the quality of the generated topics needs to be improved. [Conclusions] The proposed method offers a new approach to expanding technological innovation.
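The rule-matching step named in the Methods can be illustrated with a minimal sketch. It is not the authors' rule set: the cue phrases, the naive sentence splitter, and the function names below are all assumptions made for illustration. The abstract also pairs rule matching with BERT; that classification step is not shown here.

```python
import re

# Hypothetical cue phrases; the paper's actual rules are not given in the abstract.
FUTURE_WORK_CUES = [
    r"\bfuture work\b",
    r"\bfuture research\b",
    r"\bin the future\b",
    r"\bwe plan to\b",
    r"\bremains? to be (?:explored|investigated|studied)\b",
]
CUE_PATTERN = re.compile("|".join(FUTURE_WORK_CUES), re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_future_work_sentences(text):
    """Return the sentences that contain at least one cue phrase."""
    return [s for s in split_sentences(text) if CUE_PATTERN.search(s)]


if __name__ == "__main__":
    sample = (
        "We propose a topic generation model. "
        "In future work, we plan to add prior semantic knowledge. "
        "The results are encouraging."
    )
    for sentence in candidate_future_work_sentences(sample):
        print(sentence)
```

A filter like this favors recall over precision, so in a pipeline of the kind the abstract describes its output would be treated as candidates for a learned classifier rather than as final results.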
Song Ruoxuan, Qian Li, Du Yu. Identifying Academic Creative Concept Topics Based on Future Work of Scientific Papers[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 10-20.