Abstract: This paper introduces the Gaussian Template and the Sharpen Gaussian Template from computer image processing, summarizes the main ideas behind text feature weight adjustment, and then proposes a text feature weight adjustment method based on the Sharpen Gaussian Template. The method is evaluated with macro-averaged F-measures on the Sogou Lab Data corpus, using a KNN classifier and a Class-center classifier. The experimental results show that the KNN classifier performs better with this method than with the traditional method, whereas the Class-center classifier shows no significant improvement.
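The macro-averaged F-measure used for evaluation can be illustrated with a minimal sketch. It assumes the common definition in which F1 is computed per class and then averaged with equal weight per class; the abstract does not state which variant the authors use (some works instead compute F from macro-averaged precision and recall). The function name `macro_f1` and the example labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: per-class F1 is computed first and then averaged,
    which may differ from the exact variant used in the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative

    scores = []
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical two-class example (labels are illustrative only).
y_true = ["sports", "sports", "finance", "finance"]
y_pred = ["sports", "finance", "finance", "finance"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the average, a small class that is classified poorly lowers the score as much as a large one, which makes the measure suitable for comparing classifiers across all categories of a corpus.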
路永和, 何新宇. 锐化高斯模板在文本特征项权重调整方法中的应用[J]. 现代图书情报技术, 2012, (12): 39-44.
Lu Yonghe, He Xinyu. An Application of Sharpen Gaussian Template in a Text Feature Weight Adjustment Methodology. New Technology of Library and Information Service, 2012, (12): 39-44.