New Technology of Library and Information Service  2009, Vol. Issue (10): 50-55    DOI: 10.11925/infotech.1003-3513.2009.10.09
Algorithm of the Text Copy Detection Based on Text Structure Tree
Wang Sen  Wang Yu
(School of Management, Dalian University of Technology, Dalian 116024, China)
To address the growing problem of academic plagiarism, this paper proposes a text copy detection algorithm based on a text structure tree. A paper is represented as a three-layer structure tree: the root node is the whole text, each branch node is a sentence bag, and each leaf node is a sentence. Sentence similarity is computed from a synthetic similarity measure and a function, and the similarity of a leaf node is taken as its maximal sentence similarity. The similarity of each upper layer is then derived from the similarity of the layer directly below it. Finally, papers from the China Journal Full-Text Database are used for testing, and the experimental results show that the algorithm is feasible and efficient.
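The three-layer scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the sentence similarity function or the rule for aggregating lower-layer scores, so everything concrete here is an assumption made for illustration: `sentence_similarity` uses a plain Jaccard overlap of word sets as a stand-in, and `bag_similarity` and `text_similarity` use an unweighted mean. All function names are hypothetical and are not taken from the paper.

```python
from typing import List

# A text is modelled as a list of sentence bags; each bag is a list of sentences.
Bag = List[str]
Text = List[Bag]


def sentence_similarity(a: str, b: str) -> float:
    """Stand-in measure (assumption): Jaccard overlap of lowercase word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def leaf_similarity(sentence: str, other_sentences: List[str]) -> float:
    """Leaf layer: the maximal similarity between this sentence and any
    sentence of the other text, as stated in the abstract."""
    return max(
        (sentence_similarity(sentence, s) for s in other_sentences),
        default=0.0,
    )


def bag_similarity(bag: Bag, other_sentences: List[str]) -> float:
    """Branch layer: derived from the leaf layer below it.
    The unweighted mean is an assumption, not the paper's formula."""
    if not bag:
        return 0.0
    return sum(leaf_similarity(s, other_sentences) for s in bag) / len(bag)


def text_similarity(text: Text, other: Text) -> float:
    """Root layer: derived from the branch layer below it (mean, assumed)."""
    if not text:
        return 0.0
    other_sentences = [s for bag in other for s in bag]
    return sum(bag_similarity(bag, other_sentences) for bag in text) / len(text)
```

Under these assumptions, comparing a text with itself yields 1.0 and comparing two texts with no words in common yields 0.0; a real detector would substitute the paper's own sentence similarity and weighting.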

Key words: Copy detection; Sentence similarity; Sentence bag; Structure tree
Received: 28 August 2009      Published: 25 October 2009


Corresponding author: Wang Yu
About the authors: Wang Sen, Wang Yu

Cite this article:

Wang Sen, Wang Yu. Algorithm of the Text Copy Detection Based on Text Structure Tree. New Technology of Library and Information Service, 2009, (10): 50-55.

