Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (5): 59-69    DOI: 10.11925/infotech.2096-3467.2017.1119
Original Article
Extracting Text Features with Improved Fruit Fly Optimization Algorithm
Tingxin Wen1, Yangzi Li1, Jingshuang Sun2
1Institute of Systems Engineering, Liaoning Technical University, Huludao 125105, China
2 College of Business Administration, Liaoning Technical University, Huludao 125105, China
Abstract  

[Objective] This paper aims to reduce the dimensionality of the text feature vector space and thereby improve the accuracy of text classification. [Methods] We propose IFOATFSO, a text feature selection model based on an improved fruit fly optimization algorithm. The model uses the variance of classification accuracy to monitor its degree of convergence. It also combines a crossover operator, roulette wheel selection based on a simulated annealing mechanism, and the genetic algorithm to deepen the global search and improve population diversity. [Results] The IFOATFSO model, which optimizes CHI-based feature selection, reduced the feature dimension and improved text classification accuracy by up to 10.5%. [Limitations] The model's performance in extracting English text features needs improvement. [Conclusions] The IFOATFSO model improves text classification.
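
The abstract names three mechanisms (an accuracy-variance convergence monitor, roulette wheel selection with a simulated annealing mechanism, and a crossover operator) but gives no formulas for them. The following is a minimal Python sketch of how such pieces might look, not the authors' implementation: the Boltzmann-style weighting, the uniform crossover, and every function name here are assumptions made for illustration.

```python
import math
import random
from statistics import pvariance


def accuracy_variance(accuracies):
    """Population variance of the swarm's classification accuracies.

    A value near zero suggests the candidates have converged
    (assumed interpretation of 'classification accuracy variance').
    """
    return pvariance(accuracies)


def annealed_roulette_pick(accuracies, temperature, rng):
    """Roulette wheel pick with weights exp((acc - best) / temperature).

    Assumed Boltzmann-style scaling: a high temperature flattens the
    wheel (more exploration), a low temperature favours the best candidate.
    """
    best = max(accuracies)
    weights = [math.exp((a - best) / temperature) for a in accuracies]
    return rng.choices(range(len(accuracies)), weights=weights, k=1)[0]


def uniform_crossover(mask_a, mask_b, rng):
    """Build a child feature mask by taking each bit from either parent."""
    return [a if rng.random() < 0.5 else b for a, b in zip(mask_a, mask_b)]


if __name__ == "__main__":
    rng = random.Random(0)
    accs = [0.71, 0.83, 0.78]  # hypothetical accuracies of three candidates
    print("variance:", accuracy_variance(accs))
    print("picked index:", annealed_roulette_pick(accs, temperature=0.05, rng=rng))
    print("child mask:", uniform_crossover([1, 0, 1, 1], [0, 0, 1, 0], rng))
```

In a sketch like this, each candidate would be a 0/1 mask over CHI-ranked features, and a low accuracy variance could be the signal to apply crossover and reselection so that the population regains diversity.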

Key words: Text Feature Selection; Fruit Fly Optimization Algorithm; Classification Accuracy Variance
Received: 08 November 2017      Published: 20 June 2018

Cite this article:

Tingxin Wen,Yangzi Li,Jingshuang Sun. Extracting Text Features with Improved Fruit Fly Optimization Algorithm. Data Analysis and Knowledge Discovery, 2018, 2(5): 59-69.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1119     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I5/59

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn