Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (6): 47-55    DOI: 10.11925/infotech.2096-3467.2017.06.05
Identifying Phishing Websites with Multiple Online Data Sources
Zhongyi Hu, Chaoqun Wang, Jiang Wu
School of Information Management, Wuhan University, Wuhan 430072, China
The Center for Electronic Commerce Research and Development, Wuhan University, Wuhan 430072, China
[Objective] This study aims to identify phishing websites more effectively by using online evaluation data and abnormal URL features. [Methods] First, we used eight machine learning techniques to compare how well various online evaluation data and abnormal URL features identify phishing websites. Then, we proposed a new method to improve identification accuracy. [Results] The online evaluation data performed better than the abnormal URL features, and combining the two data sets improved identification performance. [Limitations] We did not account for the imbalance between the numbers of phishing websites and legitimate ones. [Conclusions] Online evaluation data and abnormal URL features can help identify phishing websites effectively, which points to a direction for future studies.

Key words: Data Mining; Phishing Websites Identification; Machine Learning
Received: 10 April 2017      Published: 25 August 2017

