Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (12): 113-122    DOI: 10.11925/infotech.2096-3467.2022.0303
Calculating Case Similarity with Heterogeneous Property Graph
Cheng Ge1, Wang Shuo1, Liao Yongan2, Zhang Dongliang2
1School of Computer Science & School of Cyberspace Science, Xiangtan University, Xiangtan 411105, China
2Law School, Xiangtan University, Xiangtan 411105, China
Abstract  

[Objective] This paper proposes an algorithm that determines the similarity of judicial cases using heterogeneous property graphs, aiming to improve both the speed and the precision of case similarity comparison. [Methods] First, we constructed heterogeneous property graphs for legal cases based on their contents and other related information. Then, we recast text similarity as graph similarity and combined a graph attention network with a neighborhood node consensus matching method. Finally, the proposed model learned the local and global information of the heterogeneous property graphs and calculated the similarity of cases. [Results] We evaluated the model on the similar case matching dataset from CAIL 2019. It outperformed other popular methods while requiring only 1.02% of their FLOPs. [Limitations] The precision of our model is positively correlated with the complexity of the property graph. However, the graph is constructed offline, so it does not increase the algorithm’s complexity. [Conclusions] The proposed model can effectively improve the speed and precision of similarity comparison for legal cases.

Key words: Similarity of Judicial Cases; Heterogeneous Property Graph; Graph Similarity Algorithm; Graph Attention
Received: 06 April 2022      Published: 09 November 2022
CLC Number (ZTFLH): TP391
Fund: National Key R&D Program of China (2020YFC0832400); Hunan Provincial Natural Science Foundation (2022SK2108)
Corresponding Author: Cheng Ge, ORCID: 0000-0002-4342-8029, E-mail: chengge@xtu.edu.cn

Cite this article:

Cheng Ge, Wang Shuo, Liao Yongan, Zhang Dongliang. Calculating Case Similarity with Heterogeneous Property Graph. Data Analysis and Knowledge Discovery, 2022, 6(12): 113-122.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0303     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I12/113

Case Similarity Calculation Model Based on Heterogeneous Property Graph
Data Example from the Similar Case Matching Dataset
Example of a Heterogeneous Property Graph
Change in Model Node Matching Precision and Results with Training Epochs
Model or method              Precision/%  Recall/%  F1/%
TF-IDF                       54.4         54.2      54.3
GED                          57.9         66.0      61.7
BERT (unsupervised)          54.8         54.9      54.9
BERT                         65.3         66.9      66.1
BERT + feature extraction    70.2         70.1      70.1
This paper                   71.2         71.0      71.1
Comparison of Experimental Results
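The F1 column of the results table can be cross-checked against the precision and recall columns. The sketch below assumes F1 is the standard harmonic mean of precision and recall; the page itself does not state the formula, so this is an assumption. The row labels and the helper name `f1_score` are illustrative only.

```python
# (precision, recall, reported F1) in percent, copied from the results table.
rows = {
    "TF-IDF": (54.4, 54.2, 54.3),
    "GED": (57.9, 66.0, 61.7),
    "BERT (unsupervised)": (54.8, 54.9, 54.9),
    "BERT": (65.3, 66.9, 66.1),
    "BERT + feature extraction": (70.2, 70.1, 70.1),
    "This paper": (71.2, 71.0, 71.1),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    return 2.0 * precision * recall / (precision + recall)


for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Precision and recall are themselves rounded to one decimal,
    # so a small gap between computed and reported F1 is expected.
    print(f"{name:27s} computed F1 = {computed:.2f}  reported F1 = {reported:.1f}")
```

Under this assumption every reported F1 value lies within 0.1 percentage points of the recomputed one, which is consistent with rounding of the published precision and recall figures.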
Item      This paper (M)  BERT + feature extraction (M)  BERT (M)
Params    1.9545          103.9227                       0.1900
FLOPs     3471.7548       340168.6057                    6.4143
Comparison of Model Params and FLOPs
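The 1.02% figure quoted in the abstract can be reproduced from the Params/FLOPs table. The sketch below takes the table values at face value (unit "M" as given in the header) and compares the proposed model with the BERT + feature extraction column. That column is the one whose ratio matches the abstract; the page does not name the baseline explicitly, so the pairing is an assumption. The dictionary keys and the helper `percent_of` are illustrative names.

```python
# Values copied from the Params/FLOPs table (unit "M" as in the table header).
params = {"this_paper": 1.9545, "bert_feature_extraction": 103.9227}
flops = {"this_paper": 3471.7548, "bert_feature_extraction": 340168.6057}


def percent_of(ours: float, baseline: float) -> float:
    """Return `ours` expressed as a percentage of `baseline`."""
    return 100.0 * ours / baseline


flops_pct = percent_of(flops["this_paper"], flops["bert_feature_extraction"])
params_pct = percent_of(params["this_paper"], params["bert_feature_extraction"])

print(f"FLOPs:  {flops_pct:.2f}% of the baseline")   # prints 1.02%
print(f"Params: {params_pct:.2f}% of the baseline")  # prints 1.88%
```

Read this way, the proposed model uses roughly one hundredth of the compute and under one fiftieth of the parameters of the BERT + feature extraction baseline.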