Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 104-114    DOI: 10.11925/infotech.2096-3467.2020.1109
Automatic Abstracting Civil Judgment Documents with Two-Stage Procedure
Wang Yizhen, Ou Shiyan, Chen Jinju
School of Information Management, Nanjing University, Nanjing 210023, China
Abstract  

[Objective] This paper proposes a method for automatically summarizing first-instance civil judgment documents, aiming to give users concise, readable, coherent, and accurate summaries efficiently. [Methods] We propose a two-stage automatic summarization method for judgment documents, consisting of an extractive stage and an abstractive stage. In the first stage, we add dilated residual gated convolution to a pre-trained language model to extract key sentences from the judgment document. In the second stage, we feed the extractive summary into a sequence-to-sequence model to generate the final abstract. [Results] On the experimental dataset of judgment documents, the proposed model achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 50.31, 36.60, and 48.86, which are 25.00, 23.25, and 24.66 points higher than those of the baseline (LEAD-3). [Limitations] Because the extractive summary produced in the first stage is the input to the second-stage abstractive model, errors accumulate across the two stages, and the overall performance depends largely on the first-stage extractive model. [Conclusions] The proposed model can summarize judgment documents automatically, which helps alleviate information overload and lets users read judgment documents quickly.
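The two-stage procedure described in the abstract can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: `two_stage_summarize`, `keyword_scorer`, and `passthrough_rewriter` are hypothetical names, and the toy scorer and rewriter merely stand in for the trained extractive and sequence-to-sequence models.

```python
from typing import Callable, List, Tuple


def two_stage_summarize(
    sentences: List[str],
    score_sentence: Callable[[str], float],  # hypothetical stage-1 sentence scorer
    rewrite: Callable[[str], str],           # hypothetical stage-2 generator
    threshold: float = 0.5,
) -> Tuple[List[str], str]:
    """Return (extractive summary, final abstract)."""
    # Stage 1: keep key sentences, preserving document order.
    extract = [s for s in sentences if score_sentence(s) >= threshold]
    # Stage 2: the generator sees only the extract, never the full document.
    abstract = rewrite(" ".join(extract))
    return extract, abstract


# Toy stand-ins so the sketch runs; a real system would plug in trained models.
def keyword_scorer(sentence: str) -> float:
    cues = ("court holds", "judgment:")
    return 1.0 if any(cue in sentence.lower() for cue in cues) else 0.0


def passthrough_rewriter(text: str) -> str:
    return text.strip()


document = [
    "The plaintiff filed the case in March.",
    "The court holds that the loan contract is valid.",
    "Judgment: the defendant shall repay the loan.",
]
extract, abstract = two_stage_summarize(document, keyword_scorer, passthrough_rewriter)
print(extract)
print(abstract)
```

Because stage 2 only ever receives the stage-1 extract, any sentence dropped in stage 1 cannot be recovered later, which is the error accumulation noted under Limitations.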

Key words: Pre-trained Language Model; Automatic Summary; Judgment Documents; Abstractive Summarization; Extractive Summarization
Received: 11 November 2020      Published: 27 May 2021
Chinese Library Classification (ZTFLH): TP391
Fund: This work is supported by the National Social Science Foundation of China (17ATQ001)
Corresponding Authors: Ou Shiyan     E-mail: oushiyan@nju.edu.cn

Cite this article:

Wang Yizhen, Ou Shiyan, Chen Jinju. Automatic Abstracting Civil Judgment Documents with Two-Stage Procedure. Data Analysis and Knowledge Discovery, 2021, 5(5): 104-114.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1109     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I5/104

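Discourse structure components

| Component | Function description |
| --- | --- |
| Basic case information | Records basic information about the case, including the document title, case number, document type, judgment date, and trial procedure |
| Party information | Records information about the parties, including brief profiles of the plaintiff and defendant, the type and number of plaintiffs and defendants, and their representing law firms |
| Trial history | Records the course of the trial, including the cause of action, the names of the plaintiff and defendant, and the case filing date |
| Plaintiff's claims | Records the plaintiff's claims and the grounds for those claims |
| Defendant's defense | Records the defenses the defendant raises to rebut the plaintiff's claims and grounds |
| Facts found by the court | Records the results of the court's investigation of the facts and evidence in the case; the facts found include detailed evidence and the course of events |
| Court's opinion | Records the reasoning of the judgment, that is, the court's reasoned assessment of the case, including the points in dispute between the parties, the logic of the reasoning, and the laws and regulations cited |
| Judgment result | Records the detailed judgment, including information on the plaintiff's rights and obligations |
| Other information | Records the judges, the court clerk, the judgment date, and similar information |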
Discourse Structure and Functions of First-instance Civil Judgment Documents
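One way to carry the nine components from the table above through an annotation or preprocessing step is a small enumeration plus a labeled-sentence record. The sketch below is only illustrative: the names `Section`, `LabeledSentence`, and `group_by_section` are assumptions, not part of the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Section(Enum):
    """The nine discourse components of a first-instance civil judgment."""
    CASE_INFO = "basic case information"
    PARTIES = "party information"
    TRIAL_HISTORY = "trial history"
    PLAINTIFF_CLAIMS = "plaintiff's claims"
    DEFENDANT_DEFENSE = "defendant's defense"
    FACTS_FOUND = "facts found by the court"
    COURT_OPINION = "court's opinion"
    JUDGMENT_RESULT = "judgment result"
    OTHER = "other information"


@dataclass
class LabeledSentence:
    text: str
    section: Section


def group_by_section(sentences: Iterable[LabeledSentence]) -> Dict[Section, List[str]]:
    """Collect sentence texts under the component they were labeled with."""
    grouped: Dict[Section, List[str]] = {section: [] for section in Section}
    for sentence in sentences:
        grouped[sentence.section].append(sentence.text)
    return grouped


labeled = [
    LabeledSentence("The plaintiff requests repayment of the loan.", Section.PLAINTIFF_CLAIMS),
    LabeledSentence("The court holds that the contract is valid.", Section.COURT_OPINION),
]
grouped = group_by_section(labeled)
print(grouped[Section.COURT_OPINION])
```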
The Framework of Sentence Structure Functions in First-instance Civil Judgment Documents
Two-Stage Automatic Summarization Model for Judgment Documents
The Extractive Summarization Model for Judgment Documents
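The abstract states that the extractive stage adds dilated residual gated convolution to a pre-trained model. The sketch below shows one common formulation of such a block on a single-channel sequence, y = x * (1 - g) + h * g, where h is a dilated convolution of x and g is the sigmoid of a second dilated convolution. This exact gating form, the single channel, and the function names are assumptions made for illustration; the paper's layer may differ.

```python
import math
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Zero-padded ('same') 1D convolution whose taps are `dilation` steps apart.

    Assumes an odd kernel length so the window is centered on each position.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - half + j * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def residual_gated_block(
    x: Sequence[float],
    value_kernel: Sequence[float],
    gate_kernel: Sequence[float],
    dilation: int,
) -> List[float]:
    """Mix the input with its dilated convolution through a learned gate."""
    h = dilated_conv1d(x, value_kernel, dilation)
    g = [sigmoid(v) for v in dilated_conv1d(x, gate_kernel, dilation)]
    # The residual path keeps x where the gate is closed (g near 0).
    return [xi * (1.0 - gi) + hi * gi for xi, hi, gi in zip(x, h, g)]


# With dilation 2, position 2 sees positions 0 and 4 without a larger kernel.
y = residual_gated_block([1.0, 0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 0.0, 0.0], dilation=2)
print(y)
```

Stacking such blocks with increasing dilation widens the context each position sees while the residual path keeps the original representation available.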
The Abstractive Summarization Model for Judgment Documents
The Annotation Process for First-instance Civil Judgment Documents
| Operating system | GPU | Python | CUDA | TensorFlow-GPU | ROUGE | Keras |
| --- | --- | --- | --- | --- | --- | --- |
| Ubuntu 18.04 | TITAN RTX | 3.6.9 | 10.0 | 1.14.0 | 1.5.5 | 2.3.1 |

The Experimental Environment
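| Category | Method | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- | --- |
| Baseline | LEAD-3 | 25.31 | 13.35 | 24.20 |
| Extractive | NeuSum | 44.36 | 17.79 | 41.96 |
| Extractive | BERT+Classifier | 45.60 | 19.99 | 43.57 |
| Extractive | BERT+Transformer | 47.75 | 30.81 | 46.28 |
| Abstractive | Transformer-Abstractive | 44.15 | 28.62 | 43.14 |
| Abstractive | Pointer-Generator Networks | 47.78 | 32.82 | 47.13 |
| Abstractive | Bottom-Up Abstractive | 48.01 | 25.17 | 46.47 |
| Proposed model | TSSM-Extractive | 49.69 | 31.94 | 48.85 |
| Proposed model | TSSM | 50.31 | 36.60 | 48.86 |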
Results of the Comparison Experiments
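As a rough illustration of what the n-gram scores in the results table measure, the following simplified sketch computes an n-gram overlap F1 on pre-tokenized text and defines the LEAD-3 baseline as the first three sentences of a document. It is not the official ROUGE toolkit, and this page does not say whether the reported scores are recall or F1; `rouge_n_f1` and `lead_3` are illustrative names. The final lines recompute the gains over LEAD-3 from the table values.

```python
from collections import Counter
from typing import List, Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: Sequence[str], reference: Sequence[str], n: int = 1) -> float:
    """Simplified ROUGE-N: F1 over clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter & Counter takes the minimum counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lead_3(sentences: List[str]) -> List[str]:
    """LEAD-3 baseline: use the first three sentences as the summary."""
    return sentences[:3]


candidate = "the court rejected the claim".split()
reference = "the court dismissed the claim".split()
print(rouge_n_f1(candidate, reference, n=1))
print(rouge_n_f1(candidate, reference, n=2))

# Gains of TSSM over LEAD-3, recomputed from the table (ROUGE-1, ROUGE-2, ROUGE-L).
tssm = (50.31, 36.60, 48.86)
lead3_scores = (25.31, 13.35, 24.20)
gains = [round(a - b, 2) for a, b in zip(tssm, lead3_scores)]
print(gains)
```

The recomputed gains match the 25.00, 23.25, and 24.66 points reported in the abstract.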