[Objective] This paper uses text mining to discover potential side effects of drugs, aiming to enrich existing databases and support early prediction of drug side effects. [Methods] A total of 100,873 articles published over about five years (2011-2016) were retrieved from the PubMed database. Using Perl-based word segmentation, dictionary-based named entity recognition, and the R language, we generated a drug-side effect co-occurrence matrix and conducted bi-clustering analysis with gCLUTO. [Results] For one category of results, the precision of the proposed method reached 75.65%, and 13.91% of the results were identified as potential side effects. [Limitations] The study used only dictionary-based named entity recognition and did not consider grammatical or lexical factors, which yielded a high false positive rate. [Conclusions] This paper proposes a new approach to detect unreported side effects of drugs automatically and effectively.
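The matching and counting steps summarized in the Methods can be illustrated with a minimal Python sketch. It is not the authors' Perl, R, and gCLUTO pipeline: the term dictionaries, the sample sentences, the reference set `known`, and the helper names `find_terms`, `cooccurrence_counts`, and `precision` are all hypothetical and chosen only for illustration. The sketch matches dictionary terms in each abstract, counts the abstracts in which a drug and a side effect co-occur, and scores the resulting pairs against a set of already recorded pairs.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy dictionaries; the paper's actual term lists are not given here.
DRUGS = {"cefazolin", "morphine", "ramosetron"}
SIDE_EFFECTS = {"abdominal pain", "constipation", "nausea"}


def find_terms(text, dictionary):
    """Return the dictionary terms that occur in text as whole words or phrases."""
    lowered = text.lower()
    return {
        term for term in dictionary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def cooccurrence_counts(abstracts):
    """Count, per (drug, side effect) pair, the abstracts that mention both."""
    counts = Counter()
    for text in abstracts:
        drugs = find_terms(text, DRUGS)
        effects = find_terms(text, SIDE_EFFECTS)
        counts.update(product(drugs, effects))
    return counts


def precision(predicted_pairs, known_pairs):
    """Share of predicted pairs that already appear in a reference set."""
    predicted = set(predicted_pairs)
    if not predicted:
        return 0.0
    return len(predicted & set(known_pairs)) / len(predicted)


# Made-up sentences standing in for PubMed abstracts.
abstracts = [
    "Cefazolin was associated with abdominal pain in one patient.",
    "Morphine frequently caused constipation and nausea.",
    "Ramosetron reduced nausea after thyroid surgery.",
]
counts = cooccurrence_counts(abstracts)

# Hypothetical reference set playing the role of an existing side effect database.
known = {("morphine", "constipation"), ("morphine", "nausea")}
print(counts)
print(precision(counts.keys(), known))
```

In this toy input, the third sentence produces a (ramosetron, nausea) pair even though the sentence describes a reduction in nausea. Pure dictionary matching cannot tell the two cases apart, which is the kind of false positive the Limitations statement attributes to ignoring grammatical and lexical factors.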
Fan Xinyue, Cui Lei. Using Text Mining to Discover Drug Side Effects: Case Study of PubMed[J]. Data Analysis and Knowledge Discovery, 2018, 2(3): 79-86.