Matching Similar Cases with Legal Knowledge Fusion
Zheng Jie1, Huang Hui2, Qin Yongbin2
1 Department of Information Science, Guiyang Vocational and Technical College, Guiyang 550081, China
2 College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
Abstract [Objective] This paper constructs a similar case matching model with integrated legal knowledge, aiming to improve the accuracy of case matching. [Methods] First, we concatenated legal knowledge with the case texts, which let the model learn from legal knowledge and textual information simultaneously. Then, we used an LSTM network to encode the texts segment by segment, increasing the maximum text length the model can accommodate. Finally, we jointly trained the model with a triplet loss and an adversarial contrastive loss to enhance its robustness. [Results] The proposed model significantly improved the accuracy of similar case matching, exceeding the baseline BERT model by 7.07%. [Limitations] The model matches longer text sequences, which makes it more time-consuming than other models. [Conclusions] The proposed model has stronger matching and generalization ability, which benefits legal case retrieval.
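The paper does not include code; the following is a hypothetical sketch of the triplet objective described in the abstract (introduced by FaceNet, reference [3]), using toy NumPy vectors in place of learned case embeddings. All names and values here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: pull the anchor embedding toward the similar (positive)
    case and push it away from the dissimilar (negative) case by a margin."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to similar case
    d_neg = np.linalg.norm(anchor - negative)  # distance to dissimilar case
    return max(d_pos - d_neg + margin, 0.0)

# Toy 3-d "case embeddings": the positive already sits much closer to the
# anchor than the negative, so the loss is clipped to zero.
a = np.array([1.0, 0.0, 0.0])
p = np.array([0.9, 0.1, 0.0])
n = np.array([0.0, 1.0, 0.0])
print(triplet_loss(a, p, n))  # → 0.0
```

In training, minimizing this loss over (anchor, positive, negative) case triples shapes the embedding space so that similar cases can be matched by distance.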
Received: 13 January 2022
Published: 01 March 2022
Fund: National Natural Science Foundation of China (62066008)
Corresponding Author:
Qin Yongbin, ORCID: 0000-0002-1960-8628
E-mail: ybqin@foxmail.com
[1] Xiao C J, Zhong H X, Guo Z P, et al. CAIL2019-SCM: A Dataset of Similar Case Matching in Legal Domain[OL]. arXiv Preprint, arXiv: 1911.08962.
[2] Devlin J, Chang M W, Lee K, et al. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding[OL]. arXiv Preprint, arXiv: 1810.04805.
[3] Schroff F, Kalenichenko D, Philbin J. FaceNet: A Unified Embedding for Face Recognition and Clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2015: 815-823.
[4] Robertson S, Zaragoza H. The Probabilistic Relevance Framework: BM25 and Beyond[J]. Foundations and Trends® in Information Retrieval, 2009, 3(4): 333-389.
[5] Huang Mingxuan, Lu Shoudong, Xu Hui. Cross-Language Information Retrieval Based on Weighted Association Patterns and Rule Consequent Expansion[J]. Data Analysis and Knowledge Discovery, 2019, 3(9): 77-87.
[6] Mikolov T, Yih W, Zweig G. Linguistic Regularities in Continuous Space Word Representations[C]// Proceedings of NAACL-HLT 2013. Association for Computational Linguistics, 2013: 746-751.
[7] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[OL]. arXiv Preprint, arXiv: 1301.3781.
[8] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2014: 1532-1543.
[9] Li B H, Zhou H, He J X, et al. On the Sentence Embeddings from Pre-Trained Language Models[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2020: 9119-9130.
[10] Su J L, Cao J R, Liu W J, et al. Whitening Sentence Representations for Better Semantics and Faster Retrieval[OL]. arXiv Preprint, arXiv: 2103.15316.
[11] Gao T Y, Yao X C, Chen D Q. SimCSE: Simple Contrastive Learning of Sentence Embeddings[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2021: 6894-6910.
[12] Shen Y L, He X D, Gao J F, et al. Learning Semantic Representations Using Convolutional Neural Networks for Web Search[C]// Proceedings of the 23rd International Conference on World Wide Web. 2014: 373-374.
[13] Shen Y L, He X D, Gao J F, et al. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval[C]// Proceedings of the 23rd ACM International Conference on Information and Knowledge Management. 2014: 101-110.
[14] Chen Q, Zhu X D, Ling Z H, et al. Enhanced LSTM for Natural Language Inference[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2017: 1657-1668.
[15] Wang Z G, Hamza W, Florian R. Bilateral Multi-Perspective Matching for Natural Language Sentences[C]// Proceedings of the 26th International Joint Conference on Artificial Intelligence. 2017: 4144-4150.
[16] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[OL]. arXiv Preprint, arXiv: 1706.03762.
[17] Chen H J, Cai D, Dai W, et al. Charge-Based Prison Term Prediction with Deep Gating Network[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Association for Computational Linguistics, 2019: 6362-6367.
[18] Tran V, Le Nguyen M, Satoh K. Building Legal Case Retrieval Systems with Lexical Matching and Summarization Using a Pre-Trained Phrase Scoring Model[C]// Proceedings of the 17th International Conference on Artificial Intelligence and Law. 2019: 275-282.
[19] Li Jiamin, Liu Xingbo, Nie Xiushan, et al. Triplet Deep Hashing Learning for Judicial Case Similarity Matching Method[J]. CAAI Transactions on Intelligent Systems, 2020, 15(6): 1147-1153.
[20] Jing L L, Tian Y L. Self-Supervised Visual Feature Learning with Deep Neural Networks: A Survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(11): 4037-4058. doi: 10.1109/TPAMI.2020.2992393.
[21] Miyato T, Dai A M, Goodfellow I. Adversarial Training Methods for Semi-Supervised Text Classification[OL]. arXiv Preprint, arXiv: 1605.07725.
[22] Zhong H, Zhang Z, Liu Z, et al. Open Chinese Language Pre-Trained Model Zoo[R]. 2019.