1. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2. Department of Library, Information and Archives Management, University of Chinese Academy of Sciences, Beijing 100190, China
3. School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
[Objective] This paper aims to identify innovative topics from massive volumes of text. [Methods] First, we extracted the knowledge points with higher weights from a scholarly knowledge graph. Second, we labeled these knowledge points as innovative seeds according to their “popularity”, “novelty” and “authority”. Third, we computed the knowledge correlation among the innovative seeds. Finally, we fed the results into a deep learning model, a sequence-to-sequence model with Bi-LSTM trained on a large number of sci-tech papers, to generate innovative topics. [Results] We used Chinese research papers on artificial intelligence as the experimental data. The resulting topics received an average innovation score of 6.52 in a manual evaluation by experts. [Limitations] The contents of the knowledge graph and the training datasets still need to be improved. [Conclusions] The proposed model can identify innovative topics from scholarly papers and could be further optimized in the future.
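The seed-labeling step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how popularity, novelty and authority are measured or combined, so the `KnowledgePoint` fields, the equal default weights, and the `seed_score` and `select_seeds` helpers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class KnowledgePoint:
    """A candidate term from the knowledge graph (hypothetical structure)."""
    term: str
    popularity: float  # assumed to be normalized to [0, 1]
    novelty: float     # assumed to be normalized to [0, 1]
    authority: float   # assumed to be normalized to [0, 1]


# Assumption: the three perspectives are combined by a weighted sum.
EQUAL_WEIGHTS: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)


def seed_score(kp: KnowledgePoint,
               weights: Tuple[float, float, float] = EQUAL_WEIGHTS) -> float:
    """Combine the three perspectives into a single score."""
    w_pop, w_nov, w_auth = weights
    return w_pop * kp.popularity + w_nov * kp.novelty + w_auth * kp.authority


def select_seeds(points: Sequence[KnowledgePoint], k: int,
                 weights: Tuple[float, float, float] = EQUAL_WEIGHTS
                 ) -> List[KnowledgePoint]:
    """Return the k highest-scoring knowledge points as innovative seeds."""
    ranked = sorted(points, key=lambda kp: seed_score(kp, weights), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    candidates = [
        KnowledgePoint("term_a", popularity=0.9, novelty=0.2, authority=0.4),
        KnowledgePoint("term_b", popularity=0.5, novelty=0.9, authority=0.7),
        KnowledgePoint("term_c", popularity=0.1, novelty=0.1, authority=0.1),
    ]
    for seed in select_seeds(candidates, k=2):
        print(seed.term, round(seed_score(seed), 3))
```

A weighted sum is only one plausible way to combine the three perspectives. Passing a different `weights` tuple changes their relative importance without touching the rest of the sketch.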
Changlei Fu, Li Qian, Huaping Zhang, Huaming Zhao, Jing Xie. Mining Innovative Topics Based on Deep Learning[J]. Data Analysis and Knowledge Discovery, 2019, 3(1): 46-54. DOI: 10.11925/infotech.2096-3467.2018.1365.