Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (7): 99-106    DOI: 10.11925/infotech.2096-3467.2022.0040
Matching Similar Cases with Legal Knowledge Fusion
Zheng Jie1, Huang Hui2, Qin Yongbin2
1Department of Information Science, Guiyang Vocational and Technical College, Guiyang 550081, China
2College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
[Objective] This paper constructs a similar case matching model that integrates legal knowledge, aiming to improve the accuracy of case matching. [Methods] First, we concatenated legal knowledge with the case texts so that the model could learn features of both simultaneously. Then, we used an LSTM network to model the text in segments, which increased the text length the model can accommodate. Finally, we jointly trained the model with triplet loss and an adversarial-based contrastive loss to enhance its robustness. [Results] The proposed model significantly improved the accuracy of similar case matching, reaching an accuracy 7.07 percentage points higher than the baseline BERT model. [Limitations] The model matches longer text sequences, which makes it more time-consuming than other models. [Conclusions] The proposed model has stronger matching and generalization ability, which benefits legal case retrieval.
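The abstract names triplet loss as one of the training objectives but does not give its exact form. The sketch below shows the standard margin-based formulation on toy vectors; the Euclidean distance, the margin of 1.0, and the example embeddings are all assumptions made for illustration and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    margin=1.0 is an assumed value; the paper's setting is not given here.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D "case embeddings" (hypothetical values).
anchor = [0.0, 0.0]
similar_case = [3.0, 4.0]      # distance 5 from the anchor
dissimilar_case = [6.0, 8.0]   # distance 10 from the anchor

# The similar case is already closer by more than the margin: no penalty.
loss_satisfied = triplet_loss(anchor, similar_case, dissimilar_case)
# Swapping the roles violates the ordering and is penalized.
loss_violated = triplet_loss(anchor, dissimilar_case, similar_case)
```

The loss is zero only when the anchor is closer to the similar case than to the dissimilar one by at least the margin, which is the ordering a case-matching model is trained to produce.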

Key words: Case Matching; BERT; Legal Knowledge; Segmented Modelling; Triplet Loss
Received: 13 January 2022      Published: 01 March 2022
Fund: National Natural Science Foundation of China (62066008)
Corresponding Author: Qin Yongbin, ORCID: 0000-0002-1960-8628

Zheng Jie, Huang Hui, Qin Yongbin. Matching Similar Cases with Legal Knowledge Fusion. Data Analysis and Knowledge Discovery, 2022, 6(7): 99-106.

Architecture of the Similar Case Matching Model with Fused Legal Knowledge
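Attribute | Values
Lender type | legal person; natural person; other organization
Borrower type | legal person; natural person; other organization
Loan purpose | personal living expenses; business production and operation; marital life; illegal or criminal activity; other
Evidence of loan agreement | chat records (WeChat, SMS, phone, etc.); receipt; repayment commitment; loan contract or loan note; guarantee; IOU; unknown or unclear; other
Lending intent | ordinary lending; re-lending for profit; other
Form of loan delivery | bank transfer; not lent; bill; authorization to control a specific fund account; cash; online electronic remittance; online lending platform; unknown or unclear; other
Guarantee type | suretyship; no guarantee; mortgage; pledge
Agreed interest rate within the term (annualized) | up to and including 24%; above 24% and up to 36% (inclusive); above 36%; other
Agreed interest calculation method | no interest; simple interest; compound interest; not clearly agreed; other
Form of repayment delivery | bank transfer; bill; cash; partial repayment; online electronic remittance; no repayment; unknown or unclear; other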
Attributes of Private Loan Elements
An Example of the Input Construction
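The input-construction figure itself is not reproduced on this page. As a rough illustration of what concatenating legal knowledge with a case text could look like, the sketch below joins attribute-value pairs and the case text into one string. The function name `build_input`, the `name: value` layout, the `[SEP]` separator, and the sample case text are hypothetical and may differ from the authors' actual format.

```python
def build_input(attributes, case_text, sep="[SEP]"):
    """Join attribute descriptions and the case text into a single string.

    `attributes` maps an element name to the value extracted for this case.
    The layout is an assumption for illustration only.
    """
    knowledge = "; ".join(f"{name}: {value}" for name, value in attributes.items())
    return f"{knowledge} {sep} {case_text}"


# Hypothetical attribute values for one private-lending case.
attrs = {
    "Lender type": "natural person",
    "Guarantee type": "no guarantee",
}
model_input = build_input(attrs, "The plaintiff lent the defendant 50,000 yuan.")
```

Placing the attribute values ahead of the case text lets a single encoder see both sources at once, which is the effect the abstract describes.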
Distribution of Text Length
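Parameter | Value
Maximum sequence length | 512
Number of segments | 2
Epochs | 4
Learning rate | 2e-5
Batch size per GPU | 4
Gradient accumulation steps | 4
Optimizer | AdamW
Weight decay | 0.01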
Parameter Setting
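The settings in the parameter table can be collected into a small configuration object, as in the sketch below. The class and field names are hypothetical. The segmentation helper assumes that the maximum length of 512 applies to each segment separately, which the table does not state, and the number of GPUs is not given, so only the per-GPU effective batch size is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the parameter table; field names are illustrative."""
    max_seq_length: int = 512
    num_segments: int = 2
    epochs: int = 4
    learning_rate: float = 2e-5
    batch_size_per_gpu: int = 4
    grad_accum_steps: int = 4
    optimizer: str = "AdamW"
    weight_decay: float = 0.01

    @property
    def effective_batch_per_gpu(self) -> int:
        # Samples seen per optimizer step on one GPU.
        return self.batch_size_per_gpu * self.grad_accum_steps


def split_into_segments(tokens, num_segments, max_len):
    """Cut a token list into at most `num_segments` chunks of `max_len` tokens.

    Tokens beyond num_segments * max_len are dropped (an assumed policy).
    """
    chunks = [tokens[i * max_len:(i + 1) * max_len] for i in range(num_segments)]
    return [c for c in chunks if c]  # drop empty chunks for short inputs


cfg = TrainConfig()
segments = split_into_segments(list(range(700)), cfg.num_segments, cfg.max_seq_length)
segment_lengths = [len(s) for s in segments]
```

With a batch size of 4 and 4 accumulation steps, each optimizer update on one GPU covers 16 samples. Under the per-segment assumption, a 700-token text is split into segments of 512 and 188 tokens.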
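Group | Model | Validation accuracy (%) | Test accuracy (%)
Baseline [1] | CNN | 62.27 | 69.53
Baseline [1] | LSTM | 62.00 | 68.00
Baseline [1] | BERT | 61.93 | 67.32
Team [1] | 11.2 yuan (ensemble) | 66.73 | 72.07
Team [1] | backward (ensemble) | 67.73 | 71.81
Team [1] | AlphaCourt (ensemble) | 70.07 | 72.66
Ours | BERT (single) | 68.73 | 72.72
Ours | MS-BERT (single) | 68.51 | 73.24
Ours | MS-BERT (ensemble) | 70.10 | 74.39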
Model Performance
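The 7.07 figure quoted in the abstract can be checked against the test-set column of the performance table. The short calculation below uses only numbers from that table; the dictionary keys are descriptive labels chosen here.

```python
# Test-set accuracies (%) copied from the performance table.
test_acc = {
    "BERT baseline": 67.32,
    "MS-BERT (single)": 73.24,
    "MS-BERT (ensemble)": 74.39,
}

baseline = test_acc["BERT baseline"]
# Absolute gain over the BERT baseline, in percentage points.
gains = {
    name: round(acc - baseline, 2)
    for name, acc in test_acc.items()
    if name != "BERT baseline"
}
```

The ensemble gain of 7.07 matches the abstract, which indicates that the reported improvement is an absolute difference in test accuracy between the MS-BERT ensemble and the baseline BERT model.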
Model | Validation accuracy (%) | Test accuracy (%)
BERT+Triplet | 63.93 | 68.50
BERT+Triplet+CL | 64.34 | 69.09
BERT+Triplet+Multi | 65.95 | 70.71
BERT+Triplet+Feature | 65.86 | 70.64
BERT+Triplet+Feature+Multi | 68.47 | 72.07
BERT+Triplet+Feature+Multi+CL | 68.73 | 72.72
MS-BERT+Triplet+Feature+Multi+CL | 68.51 | 73.24
Results of Ablation Experiments
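To read the ablation table as contributions of individual components, one option is to compare each variant's test accuracy with the BERT+Triplet row. The sketch below does this with the values from the table; the component labels are kept verbatim, and treating BERT+Triplet as the reference point is a choice made here rather than something the table prescribes.

```python
# (variant, test accuracy in %) copied from the ablation table.
ablation = [
    ("BERT+Triplet", 68.50),
    ("BERT+Triplet+CL", 69.09),
    ("BERT+Triplet+Multi", 70.71),
    ("BERT+Triplet+Feature", 70.64),
    ("BERT+Triplet+Feature+Multi", 72.07),
    ("BERT+Triplet+Feature+Multi+CL", 72.72),
    ("MS-BERT+Triplet+Feature+Multi+CL", 73.24),
]

reference = ablation[0][1]
# Gain of each variant over BERT+Triplet, in percentage points.
delta_vs_reference = {
    name: round(acc - reference, 2) for name, acc in ablation[1:]
}
```

On the test set, Multi and Feature each add roughly 2.1 to 2.2 points on their own and 3.57 points together, while CL adds about 0.6 points both on its own and on top of the combined variant.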
