New Technology of Library and Information Service  2016, Vol. 32 Issue (11): 27-33    DOI: 10.11925/infotech.1003-3513.2016.11.04
Using Non-standard Text Features to Identify Authors
Guo Xu, Qi Ruihua
School of Software, Dalian University of Foreign Languages, Dalian 116044, China
[Objective] This paper aims to identify authors using features extracted from non-standard online texts. [Methods] First, we used a non-standard text similarity, M, defined by the Jaccard coefficient. Second, we used the frequency of non-standard text in the corpus. [Results] The recognition accuracies of the two features were 85.1% and 80.2%, respectively. When the two features were added to the traditional recognition mechanism, the precision of the system increased by 5.8% and 4%, respectively. [Limitations] We did not study the online texts at the syntactic and structural levels. [Conclusions] The proposed method can effectively extract non-standard text features and thereby improve the accuracy of author identification.
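The two features named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of M or of the frequency feature, so everything below is an assumption for illustration: a token is treated as non-standard if it is missing from a hypothetical `standard_vocab`, the similarity is the plain Jaccard coefficient over two texts' sets of non-standard tokens, and the frequency is the share of non-standard tokens in a text. The function names and sample texts are invented for this example and are not taken from the paper.

```python
def nonstandard_tokens(tokens, standard_vocab):
    """Return the tokens that are absent from the standard vocabulary.

    Assumption: "non-standard" simply means "not in standard_vocab".
    """
    return [t for t in tokens if t not in standard_vocab]


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections, as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


def nonstandard_similarity(tokens_a, tokens_b, standard_vocab):
    """Jaccard similarity between the non-standard token sets of two texts."""
    return jaccard(
        nonstandard_tokens(tokens_a, standard_vocab),
        nonstandard_tokens(tokens_b, standard_vocab),
    )


def nonstandard_frequency(tokens, standard_vocab):
    """Share of tokens in a text that are non-standard."""
    if not tokens:
        return 0.0
    return len(nonstandard_tokens(tokens, standard_vocab)) / len(tokens)


# Hypothetical vocabulary and texts, for illustration only.
standard_vocab = {"i", "am", "so", "happy", "today", "see", "you", "later"}
text_a = "i am sooo happy 2day lol".split()
text_b = "c u later lol 2day".split()

similarity = nonstandard_similarity(text_a, text_b, standard_vocab)
freq_a = nonstandard_frequency(text_a, standard_vocab)
freq_b = nonstandard_frequency(text_b, standard_vocab)
```

In an author identification setting, values like `similarity`, `freq_a`, and `freq_b` would be appended to a conventional stylometric feature vector before classification; how the paper combines them with its baseline is not stated in the abstract.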

Key words: Author identification; Non-standard text; Network text; Text similarity
Received: 12 July 2016      Published: 20 December 2016

Guo Xu, Qi Ruihua. Using Non-standard Text Features to Identify Authors. New Technology of Library and Information Service, 2016, 32(11): 27-33.

