Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (4): 22-32    DOI: 10.11925/infotech.2096-3467.2018.1153
Research on User Information Requirement in Chinese Network Health Community: Taking Tumor-forum Data of Qiuyi as an Example
Quan Lu1,2, Anqi Zhu1, Jiyue Zhang1, Jing Chen3
1School of Information Management, Wuhan University, Wuhan 430072, China
2Big Data Research Institute, Wuhan University, Wuhan 430072, China
3School of Information Management, Central China Normal University, Wuhan 430079, China

[Objective] This paper constructs a framework for mining the information needs of users of Chinese online health communities in a big data environment, and analyzes user information needs using tumor forum data as an example. [Methods] The framework uses the Latent Semantic Indexing (LSI) model and MapReduce-based distributed text clustering to mine user information needs. All Q&A records (24,305 in total) from the tumor forum of the Chinese online health community Qiuyi served as the experimental data source. [Results] The framework identified five information needs of tumor forum users and their proportions: treatment (43.3%), pathology and etiology (34.5%), examination (12.1%), postoperative (7.0%), and prevention (3.1%), along with the top 20 keywords for each need. The analysis shows how each need has grown over time and reveals a significant difference between domestic and foreign users. Gender differences are also significant: male users most need treatment information, while female users most need pathological and etiological information. Age differences are large as well, with young people accounting for the largest share of information needs (83.79%). [Limitations] A better threshold selection may exist, and the medical thesaurus is not perfect. The analysis of information needs is not multidimensional. [Conclusions] The proposed framework is feasible. The distribution of information needs changes from year to year and varies with users' age and gender.
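
The abstract names MapReduce-based distributed text clustering but does not give the algorithm. As a rough illustration only, the sketch below assumes a k-means-style clustering step and runs one map phase (assign each document vector to its nearest centroid) and one reduce phase (average the vectors per cluster) in a single process. The function names, the toy two-dimensional vectors standing in for LSI-reduced documents, and the choice of k-means are all assumptions, not the authors' implementation.

```python
def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centroids[i])),
    )


def map_phase(vectors, centroids):
    """Map: emit (cluster_id, (vector, 1)) for every document vector."""
    for vec in vectors:
        yield nearest(vec, centroids), (vec, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum vectors and counts per cluster, then average them."""
    sums = {}
    for cid, (vec, n) in pairs:
        if cid in sums:
            total, count = sums[cid]
            sums[cid] = ([a + b for a, b in zip(total, vec)], count + n)
        else:
            sums[cid] = (list(vec), n)
    # A cluster that received no documents keeps its previous centroid.
    return [
        [x / sums[i][1] for x in sums[i][0]] if i in sums else list(c)
        for i, c in enumerate(centroids)
    ]


def kmeans(vectors, centroids, iterations=10):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(iterations):
        updated = reduce_phase(map_phase(vectors, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical toy data: four "documents" already reduced to two dimensions.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centroids = kmeans(docs, [[1.0, 0.0], [0.0, 1.0]])
labels = [nearest(d, centroids) for d in docs]
```

In a real MapReduce deployment the map calls would run in parallel over partitions of the corpus and the reducers would receive the pairs grouped by cluster id; the single-process version only shows the data flow.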

Key words: Online Health Community; Information Needs; Big Data Mining; Distributed Text Clustering; Tumor
Received: 18 October 2018      Published: 29 May 2019

Cite this article:

Quan Lu, Anqi Zhu, Jiyue Zhang, Jing Chen. Research on User Information Requirement in Chinese Network Health Community: Taking Tumor-forum Data of Qiuyi as an Example. Data Analysis and Knowledge Discovery, 2019, 3(4): 22-32.

