New Technology of Library and Information Service  2007, Vol. 2 Issue (11): 63-66    DOI: 10.11925/infotech.1003-3513.2007.11.13
Research on the Copy Detection Based on the Similarity of Sentences
Qin Xinguo
(Dean’s Office of Nanjing Audit College, Nanjing 210029, China)
Abstract  

This paper proposes a new document copy detection algorithm based on sentence similarity. To improve detection accuracy, the algorithm considers not only the document as a whole but also its structure. Experiments comparing the new algorithm with other typical algorithms show that it is feasible.

Key words: Document copy detection, Sentence similarity, Fingerprints
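
The abstract does not spell out how sentence similarity is computed or how it is combined into a document-level decision. As a rough illustration of the general idea only, the sketch below assumes Jaccard overlap of word sets as the sentence similarity and reports the fraction of sentences in a suspect document that closely match some sentence in a source document. The function names, the 0.6 threshold, and the regex-based splitting are all hypothetical choices, not the paper's method. Chinese text would also need a proper word segmenter in place of the simple `\w+` tokenizer used here.

```python
import re


def split_sentences(text):
    """Split on common Latin and CJK sentence terminators (a crude assumption)."""
    parts = re.split(r"[.!?。！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokens(sentence):
    """Lowercased word set; a stand-in for real tokenization or segmentation."""
    return set(re.findall(r"\w+", sentence.lower()))


def sentence_similarity(a, b):
    """Jaccard overlap of the two sentences' word sets, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def copy_ratio(suspect, source, threshold=0.6):
    """Fraction of suspect sentences whose best match in source reaches threshold."""
    suspect_sents = split_sentences(suspect)
    source_sents = split_sentences(source)
    if not suspect_sents or not source_sents:
        return 0.0
    hits = sum(
        1
        for s in suspect_sents
        if max(sentence_similarity(s, r) for r in source_sents) >= threshold
    )
    return hits / len(suspect_sents)


if __name__ == "__main__":
    source = "The cat sat on the mat. Dogs bark loudly at night."
    suspect = "The cat sat on a mat. I like tea."
    print(copy_ratio(suspect, source))
```

Working at the sentence level lets a detector flag partial copying even when the surrounding text is original, which a single whole-document score tends to dilute. Accounting for document structure, as the abstract mentions, would require additional information (for example paragraph or section boundaries) that this sketch does not model.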
Received: 18 September 2007      Published: 25 November 2007
CLC Number: TP391

Corresponding Authors: Qin Xinguo     E-mail: qxg19811025@163.com
About author: Qin Xinguo

Cite this article:

Qin Xinguo. Research on the Copy Detection Based on the Similarity of Sentences. New Technology of Library and Information Service, 2007, 2(11): 63-66.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.11.13     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I11/63

