Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (3): 79-86    DOI: 10.11925/infotech.2096-3467.2017.1047
Using Text Mining to Discover Drug Side Effects: Case Study of PubMed
Fan Xinyue, Cui Lei
School of Medical Informatics, China Medical University, Shenyang 110122, China
Abstract  

[Objective] This paper uses text mining to find potential side effects of drugs, aiming to enrich existing databases and support early prediction of drug side effects. [Methods] A total of 100,873 articles covering about five years (2011-2016) were retrieved from the PubMed database. Using Perl-based text segmentation, a dictionary-based named entity recognition method, and the R language, we generated a drug-side effect co-occurrence matrix and then conducted bi-clustering analysis with gCLUTO. [Results] For one category of the results, the precision of the proposed method reached 75.65%, and 13.91% of the results were identified as potential side effects. [Limitations] The study used only dictionary-based named entity recognition and did not consider grammatical or lexical factors, which led to a high false positive rate. [Conclusions] This paper proposes a new approach to detect undocumented drug side effects automatically and effectively.

Key words: Drug Side Effects; Text Mining; Named Entity Recognition; Cluster Analysis
Received: 20 October 2017      Published: 03 April 2018
ZTFLH:  TP391 G353  

Cite this article:

Fan Xinyue, Cui Lei. Using Text Mining to Discover Drug Side Effects: Case Study of PubMed. Data Analysis and Knowledge Discovery, 2018, 2(3): 79-86.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1047     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I3/79

MeSH ID | Drug MeSH term | Drug entry terms
D000935 | Antifungal Agents | Agents, Antifungal; Therapeutic Fungicides; Fungicides, Therapeutic; Antibiotics, Antifungal; Antifungal Antibiotics
D001569 | Benzodiazepines | Benzodiazepine Compounds; Benzodiazepine
D006493 | Heparin | Unfractionated Heparin; Heparin, Unfractionated; Heparinic Acid; Liquaemin; Sodium Heparin; Heparin, Sodium; Heparin Sodium; alpha-Heparin; alpha Heparin
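
The table above maps each MeSH descriptor to its entry terms, which is the kind of dictionary a dictionary-based named entity recognition step relies on. The following is a minimal sketch of such a lookup, not the authors' implementation (the abstract reports Perl and R; Python is used here only for illustration). The names `MESH_DRUG_TERMS`, `build_matcher`, and `find_drugs` are hypothetical, and case-insensitive, longest-match-first matching is an assumption.

```python
import re

# Entry terms copied from the MeSH table; the dictionary layout is an assumption.
MESH_DRUG_TERMS = {
    "D000935": ["Antifungal Agents", "Agents, Antifungal", "Therapeutic Fungicides",
                "Fungicides, Therapeutic", "Antibiotics, Antifungal",
                "Antifungal Antibiotics"],
    "D001569": ["Benzodiazepines", "Benzodiazepine Compounds", "Benzodiazepine"],
    "D006493": ["Heparin", "Unfractionated Heparin", "Heparin, Unfractionated",
                "Heparinic Acid", "Liquaemin", "Sodium Heparin", "Heparin, Sodium",
                "Heparin Sodium", "alpha-Heparin", "alpha Heparin"],
}


def build_matcher(terms_by_id):
    """Return (compiled pattern, lowercase term -> MeSH ID lookup)."""
    lookup = {}
    for mesh_id, terms in terms_by_id.items():
        for term in terms:
            lookup[term.lower()] = mesh_id
    # Longest terms first so "Unfractionated Heparin" is preferred over "Heparin".
    alternatives = sorted(lookup, key=len, reverse=True)
    body = "|".join(re.escape(term) for term in alternatives)
    # The lookarounds keep a term from matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + body + r")(?!\w)", re.IGNORECASE)
    return pattern, lookup


def find_drugs(sentence, pattern, lookup):
    """Return the sorted MeSH IDs of all dictionary terms found in the sentence."""
    return sorted({lookup[m.group(0).lower()] for m in pattern.finditer(sentence)})


PATTERN, LOOKUP = build_matcher(MESH_DRUG_TERMS)

if __name__ == "__main__":
    text = "Patients received unfractionated heparin and a benzodiazepine."
    print(find_drugs(text, PATTERN, LOOKUP))
```

Pure string matching of this kind ignores grammar and context, which is consistent with the false-positive limitation noted in the abstract.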
PubMed MEDLINE | SIDER
PMID: 24739449 |
TI |
AB 1 | Depression
AB 2 |
AB 3 |
AB 4 |
AB 5 | Epilepsy
AB 6 |
AB 7 |
AB 8 |
AB 9 |
AB 10 |
AB 11 |
AB 12 |
PubMed MEDLINE | Drug
PMID: 24739449 |
TI | tianeptine
AB 1 |
AB 2 |
AB 3 | tianeptine
AB 4 | tianeptine
AB 5 | tianeptine
AB 6 | tianeptine
AB 7 |
AB 8 |
AB 9 | tianeptine
AB 10 | tianeptine
AB 11 | tianeptine
AB 12 | tianeptine
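
The two tagged views of PMID 24739449 above list, per title (TI) and abstract sentence (AB n), the side-effect terms and the drug names that were recognized. A minimal sketch of how drug-side effect pairs could be derived from such tags follows. Whether the study counted co-occurrence within a sentence or across the whole record is not stated in this excerpt, so both variants are shown; all function and variable names are hypothetical.

```python
from itertools import product

# Tags transcribed from the two tables for PMID 24739449; empty fields are omitted.
SIDE_EFFECT_TAGS = {"AB 1": ["Depression"], "AB 5": ["Epilepsy"]}
DRUG_TAGS = {
    field: ["tianeptine"]
    for field in ("TI", "AB 3", "AB 4", "AB 5", "AB 6",
                  "AB 9", "AB 10", "AB 11", "AB 12")
}


def sentence_pairs(drug_tags, effect_tags):
    """Pairs of (drug, side effect) tagged in the same field (sentence-level assumption)."""
    pairs = set()
    for field in drug_tags.keys() & effect_tags.keys():
        pairs.update(product(drug_tags[field], effect_tags[field]))
    return pairs


def record_pairs(drug_tags, effect_tags):
    """Pairs of (drug, side effect) tagged anywhere in the record (record-level assumption)."""
    drugs = {name for names in drug_tags.values() for name in names}
    effects = {name for names in effect_tags.values() for name in names}
    return set(product(drugs, effects))


if __name__ == "__main__":
    print(sorted(sentence_pairs(DRUG_TAGS, SIDE_EFFECT_TAGS)))
    print(sorted(record_pairs(DRUG_TAGS, SIDE_EFFECT_TAGS)))
```

Under the sentence-level assumption only AB 5 carries both a drug and a side-effect tag, so the choice of window directly affects how many candidate pairs enter the co-occurrence matrix.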
Drug | Lyme disease | Polyps | General Surgery | Hypothermia
clarithromycin | 1 | 0 | 0 | 0
ceftriaxone | 1 | 0 | 0 | 0
doxycycline | 1 | 0 | 0 | 0
erlotinib | 0 | 1 | 0 | 0
thyroid | 0 | 0 | 1 | 0
tyrosine | 0 | 0 | 1 | 0
cabozantinib | 0 | 0 | 1 | 0
morphine | 0 | 0 | 1 | 1
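
A minimal sketch of how a drug-by-side-effect co-occurrence matrix like the excerpt above could be assembled from extracted pairs. The pair list is read off the excerpt; `cooccurrence_matrix` is a hypothetical helper, and the abstract reports Perl and R for this work, so Python is used here only for illustration.

```python
from collections import Counter

# (drug, side effect) pairs read off the non-zero cells of the matrix excerpt.
PAIRS = [
    ("clarithromycin", "Lyme disease"),
    ("ceftriaxone", "Lyme disease"),
    ("doxycycline", "Lyme disease"),
    ("erlotinib", "Polyps"),
    ("thyroid", "General Surgery"),
    ("tyrosine", "General Surgery"),
    ("cabozantinib", "General Surgery"),
    ("morphine", "General Surgery"),
    ("morphine", "Hypothermia"),
]


def cooccurrence_matrix(pairs):
    """Return (row labels, column labels, matrix of co-occurrence counts)."""
    counts = Counter(pairs)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    drugs = list(dict.fromkeys(drug for drug, _ in pairs))
    effects = list(dict.fromkeys(effect for _, effect in pairs))
    matrix = [[counts[(drug, effect)] for effect in effects] for drug in drugs]
    return drugs, effects, matrix


if __name__ == "__main__":
    drugs, effects, matrix = cooccurrence_matrix(PAIRS)
    print("Drug | " + " | ".join(effects))
    for drug, row in zip(drugs, matrix):
        print(drug + " | " + " | ".join(str(value) for value in row))
```

Because the cells are counts, a pair seen in several articles would produce a value above 1; the excerpt shows only 0 and 1.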
Cluster 0 | Pain | Postoperative pain | Rheumatism disease | Headache
ceftriaxone | (+) | (-) | (-) | (+)
hyaluronic acid | | | |
naloxone | (-) | (-) | (-) | (-)
fluconazole | (+) | (-) | (-) | (+)
ciclosporin | (+) | (-) | (+) | (+)
palonosetron | (+) | (-) | (-) | (+)
dinoprostone | (+) | (-) | (-) | (-)