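[Objective] This review addresses the measurement of uncertainty in medical knowledge through the text-linguistic features of knowledge claims in the scientific literature, and sets out its theoretical foundations, research progress, and expected application scenarios. [Coverage] Publications were required to contain keywords from all three aspects, "uncertain", "knowledge/knowledge unit", and "medical", and citation tracking was set up with Representing Scientific Knowledge: The Role of Uncertainty as the source work. Combining keyword and citation searches, we searched Chinese- and English-language databases and screened the results, yielding 51 publications. [Methods] The literature was categorized and reviewed, and the research methods, data sources, and core viewpoints involved were summarized. [Results] The theoretical foundations mainly comprise paradigm-shift theory at the macro level and statistical theory at the micro level, such as Bayesian causal networks. Research progress concentrates on three areas: (1) identifying cue words and sentences that express uncertainty in the medical literature; (2) representing medical knowledge objects in a fine-grained, structured form; and (3) measuring, for structured medical knowledge, the degree of uncertainty in the statements of its source text. [Limitations] The discussion of knowledge units is confined to information science, knowledge engineering, and artificial intelligence, fields that take the Data-Information-Knowledge-Wisdom (DIKW) model as their basic paradigm. [Conclusions] Measuring the uncertainty of medical knowledge is a new research direction at the intersection of informetrics and medical informatics. Uncertainty and its evolution over time indirectly reflect the intensity of competition among knowledge claims, the degree to which knowledge gaps have been resolved, and the probability that knowledge is certain. This line of work is expected to deepen informetrics into knowmetrics and to open potential new applications of informetrics in knowledge discovery, science and technology evaluation, and artificial intelligence.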
[Objective] This article reviews the theoretical foundations, research progress, and potential applications of measuring the uncertainty of medical knowledge from scientific publications. [Coverage] We searched PubMed, Web of Science, Microsoft Academic, CNKI, and Wanfang Data for English and Chinese publications using 1) the title keywords “uncertain* AND knowledge AND *medical”, and 2) citation tracking of the reference “Representing Scientific Knowledge: The Role of Uncertainty”. [Methods] First, we categorized the literature into computational linguistics and informetrics studies. Then, we summarized their research designs, data analytics, and conclusions. [Results] The idea of paradigm shift and Bayesian causal networks provide the theoretical foundation for measuring the uncertainty of medical knowledge. Recent developments include: identifying uncertainty cues in biomedical literature; extracting structured knowledge from unstructured biomedical texts; and measuring the uncertainty level of the scientific text from which Subject-Predicate-Object (SPO) triples are extracted. [Limitations] Our discussion focuses on research driven by the Data-Information-Knowledge-Wisdom (DIKW) model, in fields such as information science, knowledge engineering, and artificial intelligence. [Conclusions] The uncertainty of scientific knowledge and its evolution over time indirectly reflect the strength of competing knowledge claims, the extent to which knowledge gaps have been filled, and the probability that a given knowledge claim is certain. Measuring this uncertainty is expected to promote the development of informetrics and knowmetrics, as well as their applications in emerging areas such as detecting research fronts, evaluating academic contributions, and improving the efficacy of decision support driven by computable knowledge.
Du Jian. Progress and Prospects in Measuring the Uncertainty of Medical Knowledge[J]. Data Analysis and Knowledge Discovery, 2020, 4(10): 14-27.
Du Jian. Measuring Uncertainty of Medical Knowledge: A Literature Review. Data Analysis and Knowledge Discovery, 2020, 4(10): 14-27.
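Construct pairs of opposing semantic relations for clinical research on diseases: (1) four pairs of opposing relations with causal meaning: TREATS versus CAUSES, PREVENTS versus CAUSES, TREATS versus PREDISPOSES, and PREVENTS versus PREDISPOSES; (2) four pairs of opposing relations without causal meaning, formed by TREATS, PREVENTS, CAUSES, and PREDISPOSES and their negated forms.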
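The pairing scheme above can be sketched in code. This is only an illustrative sketch, not the author's implementation: the `NEG_` prefix for negated predicates, the `Triple` type, the `find_contradictions` helper, and the sample triples are all assumptions introduced for the example.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple

PREDICATES = ("TREATS", "PREVENTS", "CAUSES", "PREDISPOSES")

# (1) The four opposing pairs with causal meaning, as listed in the text.
CAUSAL_PAIRS = {
    frozenset(("TREATS", "CAUSES")),
    frozenset(("PREVENTS", "CAUSES")),
    frozenset(("TREATS", "PREDISPOSES")),
    frozenset(("PREVENTS", "PREDISPOSES")),
}

# (2) The four pairs formed by each predicate and its negated form.
# Assumption: a negated predicate is written with a "NEG_" prefix.
NEGATION_PAIRS = {frozenset((p, f"NEG_{p}")) for p in PREDICATES}

OPPOSING_PAIRS = CAUSAL_PAIRS | NEGATION_PAIRS


class Triple(NamedTuple):
    """A Subject-Predicate-Object triple (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def find_contradictions(triples):
    """Return pairs of triples that share subject and object
    but carry predicates from one of the eight opposing pairs."""
    by_entities = defaultdict(list)
    for t in triples:
        by_entities[(t.subject, t.obj)].append(t)

    found = []
    for group in by_entities.values():
        for a, b in combinations(group, 2):
            if frozenset((a.predicate, b.predicate)) in OPPOSING_PAIRS:
                found.append((a, b))
    return found


if __name__ == "__main__":
    # Made-up triples, for illustration only.
    sample = [
        Triple("drug_A", "TREATS", "disease_X"),
        Triple("drug_A", "CAUSES", "disease_X"),
        Triple("drug_A", "NEG_TREATS", "disease_X"),
        Triple("drug_B", "PREVENTS", "disease_Y"),
    ]
    for a, b in find_contradictions(sample):
        print(f"{a.subject}: {a.predicate} vs {b.predicate} ({a.obj})")
```

Storing each pair as a `frozenset` makes the lookup order-independent, so TREATS versus CAUSES and CAUSES versus TREATS match the same entry.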
Chen C M, Song M. Representing Scientific Knowledge: The Role of Uncertainty[M]. Springer International Publishing, 2017.
[2]
Small H. Past as Prologue: Approaches to the Study of Confirmation in Science[J]. Quantitative Science Studies, 2020,1(3):1025-1040.
doi: 10.1162/qss_a_00063
Chen C, Song M, Heo G E. A Scalable and Adaptive Method for Finding Semantically Equivalent Cue Words of Uncertainty[J]. Journal of Informetrics, 2018,12(1):158-180.
doi: 10.1016/j.joi.2017.12.004
[5]
Murray D, Lamers W, Boyack K, et al. Measuring Disagreement in Science[C]//Proceedings of the 17th International Conference of the International Society for Scientometrics and Informetrics. 2019: 2370-2375.
[6]
Herrera-Perez D, Haslam A, Crain T, et al. A Comprehensive Review of Randomized Clinical Trials in Three Medical Journals Reveals 396 Medical Reversals[J]. eLife Sciences, 2019,8:e45183.
[7]
Tatsioni A, Bonitsis N G, Ioannidis J P A. Persistence of Contradicted Claims in the Literature[J]. JAMA, 2007,298(21):2517-2526.
doi: 10.1001/jama.298.21.2517
pmid: 18056905
[8]
Simpkin A L, Schwartzstein R M. Tolerating Uncertainty—The Next Medical Revolution?[J]. New England Journal of Medicine, 2016,375(18):1713-1715.
doi: 10.1056/NEJMp1606402
pmid: 27806221
[9]
Kuhn T S, Hacking I. The Structure of Scientific Revolutions: 50th Anniversary Edition[M]. University of Chicago Press, 2012.
[10]
Kilicoglu H. Biomedical Text Mining for Research Rigor and Integrity: Tasks, Challenges, Directions[J]. Briefings in Bioinformatics, 2018,19(6):1400-1414.
doi: 10.1093/bib/bbx057
pmid: 28633401
[11]
Small H. Some Questions for Information Science Arising from the History and Philosophy of Science?[C]//Proceedings of the BIR 2020 Workshop on Bibliometric-enhanced Information Retrieval. 2020: 118-120.
[12]
Hyland K. Talking to the Academy: Forms of Hedging in Science Research Articles[J]. Written Communication, 1996,13(2):251-281.
doi: 10.1177/0741088396013002004
[13]
Light M, Qiu X Y, Srinivasan P. The Language of Bioscience: Facts, Speculations, and Statements in Between[C]//Proceedings of the Workshop on Linking Biological Literature, Ontologies and Databases, Boston, USA. 2004: 17-24.
[14]
Zerva C. Automatic Identification of Textual Uncertainty[D]. Manchester: University of Manchester, 2019.
[15]
Vincze V, Szarvas G, Farkas R, et al. The BioScope Corpus: Biomedical Texts Annotated for Uncertainty, Negation and Their Scopes[J]. BMC Bioinformatics, 2008, 9(11): Article No. S9.
[16]
Farkas R, Vincze V, Móra G, et al. The CoNLL-2010 Shared Task: Learning to Detect Hedges and Their Scope in Natural Language Text[C]//Proceedings of the 14th Conference on Computational Natural Language Learning. 2010: 1-12.
[17]
Thompson P, Nawaz R, McNaught J, et al. Enriching a Biomedical Event Corpus with Meta-Knowledge Annotation[J]. BMC Bioinformatics, 2011, 12(1): Article No.393.
[18]
Tawfik N S, Spruit M R. Automated Contradiction Detection in Biomedical Literature[C]//Proceedings of the 14th International Conference on Machine Learning and Data Mining in Pattern Recognition. 2018: 138-148.
[19]
Szarvas G, Vincze V, Farkas R, et al. Cross-Genre and Cross-Domain Detection of Semantic Uncertainty[J]. Computational Linguistics, 2012,38(2):335-367.
doi: 10.1162/COLI_a_00098
Zou Bowei, Qian Zhong, Chen Zhancheng, et al. Negation and Uncertainty Information Extraction Oriented to Natural Language Text[J]. Journal of Software, 2016,27(2):309-328.
[21]
Mercer R E, Di Marco C, Kroon F W. The Frequency of Hedging Cues in Citation Contexts in Scientific Writing[C]//Proceedings of the 17th Conference of the Canadian Society for Computational Studies of Intelligence. 2004: 75-88.
[22]
Small H. Characterizing Highly Cited Method and Non-Method Papers Using Citation Contexts: The Role of Uncertainty[J]. Journal of Informetrics, 2018,12(2):461-480.
doi: 10.1016/j.joi.2018.03.007
[23]
Small H, Boyack K W, Klavans R. Citations and Certainty: A New Interpretation of Citation Counts[J]. Scientometrics, 2019,118(3):1079-1092.
doi: 10.1007/s11192-019-03016-z
[24]
Small H. What Makes Some Scientific Findings More Certain Than Others? A Study of Citing Sentences for Low-Hedged Papers[C]//Proceedings of the 17th International Conference of the International Society for Scientometrics and Informetrics, Rome, Italy. 2019: 554-560.
[25]
Kilicoglu H, Peng Z, Tafreshi S, et al. Confirm or Refute?: A Comparative Study on Citation Sentiment Classification in Clinical Research Publications[J]. Journal of Biomedical Informatics, 2019,91:103123.
doi: 10.1016/j.jbi.2019.103123
pmid: 30753947
[26]
Xu J, Zhang Y, Wu Y, et al. Citation Sentiment Analysis in Clinical Trial Papers[J]. American Medical Informatics Association Annual Symposium, 2015: 1334-1341.
[27]
Atanassova I, Rey F C, Bertin M. Studying Uncertainty in Science: A Distributional Analysis Through the IMRaD Structure[C]//Proceedings of the 7th International Workshop on Mining Scientific Publications at 11th Edition of the Language Resources and Evaluation Conference, Miyazaki, Japan. 2018: 01940294.
[28]
Malhotra A, Younesi E, Gurulingappa H, et al. ‘HypothesisFinder:’ A Strategy for the Detection of Speculative Statements in Scientific Text[J]. PLoS Computational Biology, 2013,9(7):e1003117.
doi: 10.1371/journal.pcbi.1003117
pmid: 23935466
Zhao Hongzhou, Jiang Guohua. On the Element of Knowledge and Exponential Growth Rate[J]. Science of Science and Management of S. & T., 1984(9):41-43.
Suo Chuanjun, Gai Shuangshuang. The Connotation, Structure and Description Model of Knowledge Unit[J]. Journal of Library Science in China, 2018,44(4):54-72.
Niu Lihui, Ou Shiyan. Design and Application of a Semantic Annotation Framework for Scientific Articles[J]. Information Studies: Theory & Application, 2020,43(3):124-130.
[33]
Kilicoglu H, Shin D, Fiszman M, et al. SemMedDB: A Pubmed-Scale Repository of Biomedical Semantic Predications[J]. Bioinformatics, 2012,28(23):3158-3160.
doi: 10.1093/bioinformatics/bts591
pmid: 23044550
[34]
Kilicoglu H, Rosemblat G, Fiszman M, et al. Broad-Coverage Biomedical Relation Extraction with SemRep[J]. BMC Bioinformatics, 2020, 21(1): Article No.188.
doi: 10.1186/s12859-020-03775-0
pmid: 33092523
[35]
Groth P, Gibson A, Velterop J. The Anatomy of a Nanopublication[J]. Information Services & Use, 2010,30(1):51-56.
[36]
Clark T, Ciccarese P N, Goble C A. Micropublications: A Semantic Model for Claims, Evidence, Arguments and Annotations in Biomedical Communications[J]. Journal of Biomedical Semantics, 2014, 5: Article No. 28.
doi: 10.1186/2041-1480-5-28
pmid: 25093068
[37]
Friedman C P, Flynn A J. Computable Knowledge: An Imperative for Learning Health Systems[J]. Learning Health Systems, 2019,3:e10203.
doi: 10.1002/lrh2.10203
pmid: 31641690
[38]
Flynn A J, Friedman C P, Boisvert P, et al. The Knowledge Object Reference Ontology (KORO): A Formalism to Support Management and Sharing of Computable Biomedical Knowledge for Learning Health Systems[J]. Learning Health Systems, 2018,2:e10054.
doi: 10.1002/lrh2.10054
pmid: 31245583
[39]
Mons B. FAIR Science for Social Machines: Let’s Share Metadata Knowlets in the Internet of FAIR Data and Services[J]. Data Intelligence, 2019,1(1):22-42.
doi: 10.1162/dint_a_00002
[40]
Kilicoglu H, Rosemblat G, Rindflesch T C. Assigning Factuality Values to Semantic Relations Extracted from Biomedical Research Literature[J]. PLoS ONE, 2017,12(7):e0179926.
doi: 10.1371/journal.pone.0179926
pmid: 28678823
[41]
Jia S, Xiang Y, Chen X, et al. Triple Trustworthiness Measurement for Knowledge Graph[C]// Proceedings of the 2019 World Wide Web Conference. 2019.
[42]
Alamri A. The Detection of Contradictory Claims in Biomedical Abstracts[D]. Sheffield: University of Sheffield, 2016.
[43]
Rosemblat G, Fiszman M, Shin D, et al. Towards a Characterization of Apparent Contradictions in the Biomedical Literature Using Context Analysis[J]. Journal of Biomedical Informatics, 2019,98:103275.
doi: 10.1016/j.jbi.2019.103275
pmid: 31473364
[44]
Pinto J M G, Wawrzinek J, Balke W. What Drives Research Efforts? Find Scientific Claims That Count![C]// Proceedings of the 2019 ACM/IEEE Joint Conference on Digital Libraries. 2019: 217-226.
Du Jian. An Automated Approach for Extracting Uncertain Clinical Knowledge from Published Medical Documents[C]// Proceedings of the 2019 Tianfu International Forum on Scientometrics and Research Evaluation, Chengdu, China. 2019.
[46]
Debons A. The Measurement of Knowledge[C]// Proceedings of the 55th Annual Meeting on Celebrating Change: Information Management on the Move, Pittsburgh, Pennsylvania, USA. American Society for Information Science, 1992: 212-215.
[47]
Ding Y, Song M, Han J, et al. Entitymetrics: Measuring the Impact of Entities[J]. PLoS ONE, 2013,8(8):e71416.
doi: 10.1371/journal.pone.0071416
pmid: 24009660
Li Xiaoying, Li Junlian, Li Danya. Research on the Unified Medical Language System and Its Application to Knowledge Discovery[J]. Digital Library Forum, 2019(9):24-29.
[49]
Keselman A, Rosemblat G, Kilicoglu H, et al. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management[J]. Journal of the American Society for Information Science and Technology, 2010,61(12):2531-2543.
doi: 10.1002/asi.v61.12
[50]
Bakal G, Talari P, Kakani E V, et al. Exploiting Semantic Patterns over Biomedical Knowledge Graphs for Predicting Treatment and Causative Relations[J]. Journal of Biomedical Informatics, 2018,82:189-199.
doi: 10.1016/j.jbi.2018.05.003
pmid: 29763706