Topic Mining and Evolution Analysis of Medical Sci-Tech Reports with TWE Model
Shen Si¹, Li Qinyu¹, Ye Yuan¹, Sun Hao¹, Ye Wenhao²
¹School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094, China
²School of Information Management, Nanjing University, Nanjing 210023, China
Abstract [Objective] This paper uses word embedding representations to better uncover implicit associations among topics in medical science and technology reports, with the aim of improving methods for analyzing medical topic evolution. [Methods] We adopted the TWE (Topical Word Embeddings) model to analyze the latent semantic associations among topics in oncology studies, as well as their evolution. [Results] We identified splitting relationships among topics in 2006 and 2007 and merging relationships among topics in 2011 and 2012. However, these TWE-derived relationships were not fully reflected in the topic evolution generated by the traditional LDA method. For 2009 and 2010, the results yielded by traditional LDA and by word embeddings were completely different. [Limitations] Our sample size is limited because we collected only Chinese reports. More research is needed to test the proposed method on other medical research topics. [Conclusions] Topic mining and evolution analysis based on the word embedding representation model can highlight the impact of deep learning on topic association, and it provides better results for topic evolution analysis of medical Sci-Tech reports.
Received: 11 September 2019
Published: 12 April 2021
Fund: Natural Science Foundation of Jiangsu Province (BK20190450); National Natural Science Foundation of China (71974094); National Social Science Fund of China (19FTQB015)
Corresponding Author: Sun Hao, E-mail: 117107010889@njust.edu.cn