Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (4): 44-55    DOI: 10.11925/infotech.2096-3467.2019.0530
Computer-Assisted ICD-11 Coding Method Based on Chinese Semantic Analysis
Zhang Runtong1, Chen Donghua1, Zhao Hongmei2, Zhu Xiaomin3
1 School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China
2 Peking University People’s Hospital, Beijing 100044, China
3 School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
[Objective] This study proposes a computer-assisted coding method based on the 11th Revision of the International Classification of Diseases (ICD-11) and Chinese semantic analysis, aiming to improve the efficiency of medical coding. [Methods] First, we constructed a new model of the entities and relations in ICD-11 based on traditional graph models. Then, we used an improved semantic similarity measure to estimate the confidence of candidate ICD-11 codes. Finally, the proposed model generated candidate ICD codes. [Results] We evaluated our model on a coded hospital dataset and found that the proposed method outperformed existing ones. Our method achieved a success rate of 42% in assisted mode and 73% in precise mode. [Limitations] The Chinese version of ICD-11 limits how much Chinese semantic information can be leveraged to improve coding precision. [Conclusions] The proposed method improves the efficiency of coders and the quality of medical records. It also promotes the development of Chinese medical informatics.

Key words: ICD-11; International Classification of Diseases; Chinese Semantics; Medical Coding System; Computer-Assisted Coding
Received: 20 May 2019      Published: 01 June 2020
ZTFLH:  TP319  
Corresponding author: Zhu Xiaomin

Zhang Runtong, Chen Donghua, Zhao Hongmei, Zhu Xiaomin. Computer-Assisted ICD-11 Coding Method Based on Chinese Semantic Analysis. Data Analysis and Knowledge Discovery, 2020, 4(4): 44-55.

Flowchart of the Computer-Assisted ICD-11 Coding Method in Chinese Semantics Context
Semantic element   Model label      Model attribute   Attribute description
Word               Lemma            lexicalEntryId
Sense              Sense            senseId
Sense relation     SenseRelation    sourceSenseId
Synset             Synset           synsetId          Synset ID
Synset relation    SysnetRelation   sourceSynsetId
Modeling of Semantic Elements Based on Chinese Open WordNet
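The semantic elements listed above can be pictured as linked records. The following is a minimal sketch, not the authors' implementation: only `lexicalEntryId`, `senseId`, and `synsetId` come from the table, while `writtenForm`, the link fields on `Sense`, the helper `synsets_of`, and all sample IDs are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Lemma:
    lexicalEntryId: str   # attribute named in the table
    writtenForm: str      # assumed field: the word as written


@dataclass(frozen=True)
class Sense:
    senseId: str          # attribute named in the table
    lexicalEntryId: str   # assumed link from a sense to its word
    synsetId: str         # assumed link from a sense to its synset


def synsets_of(written_form: str, lemmas: List[Lemma], senses: List[Sense]) -> List[str]:
    """Return the sorted synset IDs reachable from a written word form."""
    entry_ids = {l.lexicalEntryId for l in lemmas if l.writtenForm == written_form}
    return sorted({s.synsetId for s in senses if s.lexicalEntryId in entry_ids})


# Hypothetical records; the IDs and groupings are made up for illustration.
lemmas = [Lemma("w1", "感冒"), Lemma("w2", "伤风")]
senses = [Sense("s1", "w1", "syn-cold"), Sense("s2", "w2", "syn-cold")]

# In this toy data, two words that share a synset are treated as synonyms.
shared = synsets_of("感冒", lemmas, senses) == synsets_of("伤风", lemmas, senses)
```

Keeping senses as a separate record between words and synsets lets one word carry several meanings, which is what makes synonym lookup through a shared synset possible.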
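ICD content category             Graph model label          Definition or description in ICD-11
Root node                        ICD11MMS                   Parent node of all chapter nodes
ICD-11 entity                    Entity                     Chapters, blocks, and valid ICD coding entities
Chapter node                     Chapter_{chapter number}   Parent node of all information under each chapter
Extension code entity            Chapter_X                  Extension code entity
Valid code entity                Detail                     Valid code entity
Index node                       IndexTerm                  Index text item
Chapter relationship             r(0)=IS_A                  Relationship between parent and child nodes in the chapter hierarchy
Coding guidance relationship     r(1)                       Coding constraint relationship between an entity and other entities
Postcoordination relationship    r(2)                       Marks postcoordination relationships between entities, such as the "Associated with" relationship
Index relationship               HAS_INDEX                  Links an entity to its corresponding index text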
Entities and Their Relationships in the ICD-11 Graph Model
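To make the labels and relationships in the table concrete, here is a small sketch of a labeled graph in plain Python. It is an illustration under stated assumptions, not the authors' graph store: the `Graph` class, the `ancestors` helper, the node IDs, and the child-to-parent direction of `IS_A` are all hypothetical; only the labels `ICD11MMS`, `Chapter_{chapter number}`, `Entity`, `Detail`, `IndexTerm` and the relationship names `IS_A` and `HAS_INDEX` come from the table.

```python
class Graph:
    """A tiny labeled graph: nodes carry one label, edges carry a type."""

    def __init__(self):
        self.labels = {}   # node id -> label
        self.edges = {}    # source id -> list of (relationship type, target id)

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, rel_type, target):
        self.edges.setdefault(source, []).append((rel_type, target))

    def neighbors(self, source, rel_type):
        return [t for r, t in self.edges.get(source, []) if r == rel_type]


def ancestors(graph, node_id):
    """Follow IS_A edges upward (assumed child -> parent) to the root."""
    path = []
    current = node_id
    while True:
        parents = graph.neighbors(current, "IS_A")
        if not parents:
            return path
        current = parents[0]
        path.append(current)


# Hypothetical node IDs; real ICD-11 codes are deliberately not used here.
g = Graph()
g.add_node("root", "ICD11MMS")
g.add_node("chapter_01", "Chapter_01")
g.add_node("block_A", "Entity")
g.add_node("code_A", "Detail")
g.add_node("index_1", "IndexTerm")
g.add_edge("chapter_01", "IS_A", "root")
g.add_edge("block_A", "IS_A", "chapter_01")
g.add_edge("code_A", "IS_A", "block_A")
g.add_edge("code_A", "HAS_INDEX", "index_1")

hierarchy = ancestors(g, "code_A")
index_terms = g.neighbors("code_A", "HAS_INDEX")
```

Typed edges keep the hierarchy (`IS_A`) separate from index lookups (`HAS_INDEX`), so further relationship types such as r(1) and r(2) could be added without changing the traversal code.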
IS_A and HAS_INDEX Relationships in Graph Model G
Three Relationship Types in Graph Model G
Mapping Between Information from the WHO’s Post-Coordination System and Entity Relations in Graph Model
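Graph model parameter          ICD-11 graph model   Existing structure of the simplified Chinese version of ICD-11
Number of nodes                134,399              32,675
Number of properties           635,922              231,744
Number of relationships        270,210              32,324
Number of relationship types   38                   1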
Comparison Between the ICD-11 Graph Model and Simplified Chinese Version of ICD-11
Closeness Centrality of Chapter 28 Between the ICD-11 Graph Model and Existing Structures
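For readers unfamiliar with the measure, closeness centrality of a node is commonly defined as the number of other reachable nodes divided by the sum of shortest-path distances to them. The sketch below computes it with breadth-first search on an unweighted, undirected adjacency dictionary. The exact variant and normalization used in the paper are not stated here, so this is an assumption, and the function name `closeness` and the three-node example graph are hypothetical.

```python
from collections import deque


def closeness(adj, start):
    """Closeness centrality of `start` in an unweighted graph.

    adj maps each node to an iterable of its neighbors.
    Returns 0.0 when no other node is reachable.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    reachable = len(dist) - 1
    return reachable / total if total > 0 else 0.0


# Hypothetical path graph a - b - c.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
center = closeness(adj, "b")   # middle node: 2 / (1 + 1)
end = closeness(adj, "a")      # end node:    2 / (1 + 2)
```

A node with higher closeness sits fewer hops away from the rest of the graph, which is why the measure is a natural way to compare how tightly two structures connect the same chapter.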
Variable   N       N*      Mean   Std. dev.   Min    Q1     Median   Q3     Max
P          3,019   483     0.11   0.27        0.00   0.00   0.00     0.02   1.00
R          3,019   483     0.56   0.48        0.00   0.00   1.00     1.00   1.00
F1         1,827   1,675   0.21   0.33        0.00   0.01   0.02     0.25   1.00
Performance of Our Method in Precise Mode
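Assuming P, R, and F1 in the table denote per-record precision, recall, and F1 score (the page does not define them), the sketch below shows how such values could be computed for one record by comparing a set of candidate codes with the codes assigned by a human coder. The function name `score` and the code labels are hypothetical, and returning `None` for an undefined value is a design choice of this sketch rather than something taken from the paper.

```python
from typing import Optional, Set, Tuple


def score(candidates: Set[str], gold: Set[str]) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (precision, recall, F1) for one record; None where undefined."""
    hits = len(candidates & gold)
    p = hits / len(candidates) if candidates else None
    r = hits / len(gold) if gold else None
    if p is None or r is None or p + r == 0:
        f1 = None   # F1 is undefined without both P and R, or when both are 0
    else:
        f1 = 2 * p * r / (p + r)
    return p, r, f1


# Hypothetical record: four suggested codes, two gold codes, one match.
p, r, f1 = score({"A", "B", "C", "D"}, {"A", "E"})
# p is 1/4, r is 1/2, f1 is 2 * 0.25 * 0.5 / 0.75
```

Treating undefined values as missing instead of zero keeps them out of the mean, so a record with no overlap at all does not silently pull the average F1 down.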
Variable   N       N*      Mean   Std. dev.   Min    Q1     Median   Q3     Max
P          3,019   483     0.21   0.36        0.00   0.00   0.01     0.18   1.00
R          2,978   524     0.41   0.38        0.00   0.00   0.29     0.78   1.00
F1         2,229   1,273   0.18   0.25        0.00   0.02   0.07     0.22   1.00
D          2,229   1,273   0.38   0.27        0.00   0.14   0.38     0.60   0.98
Performance of Our Method in Assisted Mode
Success Rates Between Different Methods
