Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 104-114    DOI: 10.11925/infotech.2096-3467.2020.1109
Automatic Abstracting Civil Judgment Documents with Two-Stage Procedure
Wang Yizhen, Ou Shiyan, Chen Jinju
School of Information Management, Nanjing University, Nanjing 210023, China
Abstract  

[Objective] This paper automatically summarizes first-instance civil judgment documents, aiming to provide concise, readable, coherent, accurate, and efficient knowledge services. [Methods] We propose a two-stage automatic summarization method for judgment documents, consisting of an extractive stage and an abstractive stage. In the first stage, we add dilated residual gated convolution layers to a pre-trained model to extract key sentences from the judgment document. In the second stage, we feed the extractive summary into a sequence-to-sequence model to generate the final abstract. [Results] On the experimental data set of judgment documents, the proposed model achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 50.31, 36.60, and 48.86, which are 25.00, 23.25, and 24.66 points higher than those of the baseline model (LEAD-3). [Limitations] Because the extractive summary from the first stage is the input to the second-stage abstractive model, errors accumulate across the two stages, and the overall performance depends on the first-stage extractive model. [Conclusions] The proposed model can summarize judgment texts automatically, which helps alleviate information overload and lets users read judgment documents quickly.
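The two-stage procedure described in the abstract can be sketched as a small pipeline. This is only an illustrative skeleton, not the authors' implementation: `toy_score` stands in for the pre-trained extractive model, `toy_generate` stands in for the sequence-to-sequence model, and the sample sentences, keyword list, and `top_k` parameter are all hypothetical.

```python
from typing import Callable, List


def extract_key_sentences(sentences: List[str],
                          score: Callable[[str], float],
                          top_k: int = 3) -> List[str]:
    """Stage 1: keep the top_k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    keep = sorted(ranked[:top_k])  # restore original sentence order
    return [sentences[i] for i in keep]


def two_stage_summarize(sentences: List[str],
                        score: Callable[[str], float],
                        generate: Callable[[str], str],
                        top_k: int = 3) -> str:
    """Stage 2 only ever sees the text that stage 1 selected."""
    extractive_summary = extract_key_sentences(sentences, score, top_k)
    return generate(" ".join(extractive_summary))


# Hypothetical stand-ins for the two trained models.
KEYWORDS = ("claims", "finds", "orders")


def toy_score(sentence: str) -> float:
    """Assumed scorer: counts keyword occurrences instead of running a model."""
    return float(sum(sentence.count(k) for k in KEYWORDS))


def toy_generate(text: str) -> str:
    """Assumed generator: returns its input unchanged instead of rewriting it."""
    return text


# Invented toy document, not taken from the paper's data set.
doc = [
    "The court accepted the case on 1 March.",
    "The plaintiff claims repayment of the loan.",
    "The hearing was held in public.",
    "The court finds that the loan was not repaid.",
    "The court orders the defendant to repay the loan.",
]

summary = two_stage_summarize(doc, toy_score, toy_generate, top_k=3)
narrow_summary = two_stage_summarize(doc, toy_score, toy_generate, top_k=1)
```

With `top_k=1`, the sentence containing the court's order never reaches the second stage, so no generator could recover it. This is the cumulative-error limitation the abstract notes: the quality of the final abstract is bounded by what the extractive stage passes on.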

Key words: Pre-trained Language Model; Automatic Summary; Judgment Documents; Abstractive Summarization; Extractive Summarization
Received: 11 November 2020      Published: 27 May 2021
CLC Number: TP391
Fund: This work is supported by the National Social Science Foundation of China (17ATQ001)
Corresponding Authors: Ou Shiyan     E-mail: oushiyan@nju.edu.cn

Cite this article:

Wang Yizhen,Ou Shiyan,Chen Jinju. Automatic Abstracting Civil Judgment Documents with Two-Stage Procedure. Data Analysis and Knowledge Discovery, 2021, 5(5): 104-114.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1109     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I5/104

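| Component | Function |
| --- | --- |
| Basic case information | Records basic information about the case, including the document title, case number, document type, judgment date, and trial procedure |
| Party information | Records information about the parties, including brief profiles of the plaintiff and defendant, their types and number, and their representing law firms |
| Trial proceedings | Records the course of the trial, including the cause of action, the names of the plaintiff and defendant, and the case filing date |
| Plaintiff's claims | Records the plaintiff's claims and the grounds for each claim |
| Defendant's defense | Records the defenses the defendant raises to rebut the plaintiff's claims and grounds |
| Facts found by the court | Records the results of the court's investigation of the facts and evidence in the case; the facts found include detailed evidence and the course of events |
| Court's opinion | Records the reasoning of the judgment, i.e. the court's reasoned assessment of the case, including the points in dispute between the parties, the logic of the reasoning, and the laws and regulations cited |
| Judgment result | Records the detailed judgment, including information on the plaintiff's rights and obligations |
| Other information | Records the judges, the court clerk, the judgment date, and similar information |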
Explanation of the Discourse Structure and Function of First-instance Civil Judgment Documents
Framework of Sentence-level Structural Functions in First-instance Civil Judgment Documents
Two-Stage Automatic Summarization Model for Judgment Documents
The Extractive Summary Model for Judgment Documents
The Abstractive Summary Model for Judgment Documents
The Annotation Process for First-instance Civil Judgment Documents
| Operating system | GPU | Python | CUDA | TensorFlow-GPU | Rouge | Keras |
| --- | --- | --- | --- | --- | --- | --- |
| Ubuntu 18.04 | TITAN RTX | 3.6.9 | 10.0 | 1.14.0 | 1.5.5 | 2.3.1 |
The Experimental Environments
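| Category | Method | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- | --- |
| Baseline | LEAD-3 | 25.31 | 13.35 | 24.20 |
| Extractive | NeuSum | 44.36 | 17.79 | 41.96 |
| Extractive | BERT+Classifier | 45.60 | 19.99 | 43.57 |
| Extractive | BERT+Transformer | 47.75 | 30.81 | 46.28 |
| Abstractive | Transformer-Abstractive | 44.15 | 28.62 | 43.14 |
| Abstractive | Pointer-Generator Networks | 47.78 | 32.82 | 47.13 |
| Abstractive | Bottom-Up Abstractive | 48.01 | 25.17 | 46.47 |
| Proposed model | TSSM-Extractive | 49.69 | 31.94 | 48.85 |
| Proposed model | TSSM | 50.31 | 36.60 | 48.86 |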
Control Experimental Results
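The reported gains over LEAD-3 can be rechecked directly from the scores in the results table, and the baseline and metric themselves are simple enough to sketch. The code below is a minimal illustration under stated assumptions: `rouge_n_recall` is a simplified recall-only ROUGE-N over pre-tokenized input, not the Rouge package used in the experiments, and the function names are hypothetical.

```python
from collections import Counter

# (ROUGE-1, ROUGE-2, ROUGE-L) scores copied from the results table.
SCORES = {
    "LEAD-3": (25.31, 13.35, 24.20),
    "TSSM-Extractive": (49.69, 31.94, 48.85),
    "TSSM": (50.31, 36.60, 48.86),
}


def gains(model, baseline="LEAD-3"):
    """Absolute score difference of `model` over `baseline`, to 2 decimals."""
    return tuple(round(m - b, 2)
                 for m, b in zip(SCORES[model], SCORES[baseline]))


def lead_3(sentences):
    """LEAD-3 baseline: take the first three sentences as the summary."""
    return sentences[:3]


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall: clipped n-gram overlap / reference n-grams.

    Assumption: inputs are already token lists; this is not the official
    evaluation script.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return overlap / total if total else 0.0


over_baseline = gains("TSSM")                       # TSSM vs. LEAD-3
over_stage_one = gains("TSSM", "TSSM-Extractive")   # effect of stage 2 alone
```

Here `over_baseline` reproduces the 25.00, 23.25, and 24.66 point gains stated in the abstract. Comparing TSSM with TSSM-Extractive shows that the second stage adds 0.62, 4.66, and 0.01 points, so most of its benefit in this table appears in ROUGE-2.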