Data Analysis and Knowledge Discovery, 2019, Vol. 3, Issue 4: 22-32. https://doi.org/10.11925/infotech.2096-3467.2018.1153
Research on User Information Requirement in Chinese Network Health Community: Taking Tumor-forum Data of Qiuyi as an Example
Quan Lu1,2, Anqi Zhu1, Jiyue Zhang1, Jing Chen3
1School of Information Management, Wuhan University, Wuhan 430072, China
2Big Data Research Institute, Wuhan University, Wuhan 430072, China
3School of Information Management, Central China Normal University, Wuhan 430079, China

Abstract

[Objective] This paper constructs a framework for mining the information needs of users of Chinese online health communities in a big data environment, and analyzes user information needs using oncology as an example. [Methods] The framework uses the Latent Semantic Indexing (LSI) model and MapReduce-based distributed text clustering to mine user information needs. The data source is the complete set of question data (24,305 records in total) from the tumor forum of the Chinese online health community Qiuyi (qiuyi.cn). [Results] The framework identifies five categories of user information needs and their proportions: treatment (43.3%), pathology and etiology (34.5%), examination (12.1%), postoperative (7.0%), and prevention (3.1%), together with the top 20 keywords in each category. The proportions of the need categories differ greatly between China and other countries, and the need for prevention information is expected to keep rising. Needs differ significantly by gender: men are most concerned with treatment information, while women are most concerned with pathology and etiology information. Needs also differ considerably by age, with young users accounting for a very large share (83.79%). [Limitations] A better threshold selection may exist, as may a more complete medical thesaurus, and a multidimensional analysis of information needs has not yet been carried out. [Conclusions] The proposed framework can mine user information needs in a big data environment and analyze how those needs change over time and differ by age and gender.
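The clustering step named in the Methods can be illustrated with a minimal sketch. It shows one common way to express k-means as repeated map and reduce passes, run here in a single process rather than on a cluster. It is not the authors' implementation: the function names are hypothetical, the LSI projection is omitted, and the two-dimensional vectors are made-up stand-ins for LSI-reduced question vectors.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(vectors, centroids):
    """Map: emit (index of nearest centroid, (vector, 1)) for each document."""
    for vec in vectors:
        nearest = min(
            range(len(centroids)),
            key=lambda i: squared_distance(vec, centroids[i]),
        )
        yield nearest, (vec, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum vectors and counts per cluster key, then average."""
    sums = {}
    counts = defaultdict(int)
    for key, (vec, n) in pairs:
        if key in sums:
            sums[key] = [s + v for s, v in zip(sums[key], vec)]
        else:
            sums[key] = list(vec)
        counts[key] += n
    # A centroid that received no documents keeps its previous position.
    return [
        [s / counts[i] for s in sums[i]] if i in sums else list(c)
        for i, c in enumerate(centroids)
    ]


def kmeans_mapreduce(vectors, centroids, max_iter=20):
    """Repeat map and reduce passes until the centroids stop changing."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(vectors, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    labels = [key for key, _ in map_phase(vectors, centroids)]
    return centroids, labels


# Made-up 2-D vectors standing in for LSI-reduced question vectors.
docs = [(0.9, 0.1), (1.0, 0.0), (0.8, 0.2), (0.1, 0.9), (0.0, 1.0)]
centroids, labels = kmeans_mapreduce(docs, [[1.0, 0.0], [0.0, 1.0]])

# Share of documents per cluster, analogous to the category proportions.
shares = {k: labels.count(k) / len(labels) for k in sorted(set(labels))}
print(labels, shares)
```

In a real MapReduce deployment the map calls would run in parallel over partitions of the corpus and the reduce calls would run once per cluster key. The per-cluster shares computed at the end correspond to the kind of category proportions reported in the Results.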

Keywords: Online Health Community; Information Needs; Big Data Mining; Distributed Text Clustering; Tumor
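Received: 2018-10-18      Published: 2019-05-29
Funding: This work is one of the research outcomes of the Major Project of the Key Research Institutes of Humanities and Social Sciences, Ministry of Education, "Research on the Mining and Service of Big Data Resources: Oriented to the Medical and Health Field" (Grant No. 17JJD870002), and of the Ministry of Education Planning Fund Project "Research on Modeling and Prediction of the Information Search Process Based on Prospect Theory" (Grant No. 18YJA870002).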
Cite this article:
Quan Lu, Anqi Zhu, Jiyue Zhang, Jing Chen. Research on User Information Requirement in Chinese Network Health Community: Taking Tumor-forum Data of Qiuyi as an Example[J]. Data Analysis and Knowledge Discovery, 2019, 3(4): 22-32.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2018.1153      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2019/V3/I4/22