Multi-Label Classification of Chinese Books with LSTM Model
Deng Sanhong, Fu Yuyangzi, Wang Hao
School of Information Management, Nanjing University, Nanjing 210023, China; Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210023, China
[Objective] This paper proposes a new method for automatically cataloguing Chinese books based on an LSTM model, aiming to address the problems of both single-label and multi-label classification. [Methods] First, we used deep learning algorithms to build a classification system based on character embeddings. Then, we trained the LSTM model on strings formed from book titles and keywords. Finally, we constructed multiple binary classifiers and evaluated them on bibliographic data from three universities. [Results] The proposed model performed well and had practical value. [Limitations] We analyzed only five categories of Chinese bibliographies, and the granularity of classification was coarse. [Conclusions] The proposed LSTM-based Chinese book classification system requires little data preprocessing, can learn incrementally, and could be transferred to other fields.
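The data preparation implied by the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sample records, the category labels, and the helper names (`to_text`, `build_char_vocab`, `encode`, `binary_targets`) are all hypothetical, and the LSTM itself is omitted. The sketch shows how a title-plus-keywords string might become a fixed-length sequence of character indices (the input a character embedding layer would expect) and how a multi-label problem can be split into one binary target vector per category.

```python
# Hypothetical sample records: (title, keywords, set of category labels).
# The texts and the labels "A", "B", "C" are illustrative, not the paper's data.
RECORDS = [
    ("机器学习", ["算法", "统计"], {"A", "B"}),
    ("图书馆学概论", ["图书馆"], {"C"}),
    ("统计学习方法", ["统计", "机器学习"], {"A"}),
]

PAD, UNK = 0, 1  # reserved indices for padding and unseen characters


def to_text(title, keywords):
    """Join title and keywords into one string (assumed input format)."""
    return title + "".join(keywords)


def build_char_vocab(texts):
    """Map each distinct character to an integer index, starting after PAD/UNK."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(text, vocab, max_len):
    """Convert a string to a fixed-length list of character indices."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def binary_targets(label_sets, categories):
    """Build one 0/1 target list per category, one binary classifier each."""
    return {
        c: [1 if c in labels else 0 for labels in label_sets]
        for c in categories
    }


texts = [to_text(title, keywords) for title, keywords, _ in RECORDS]
vocab = build_char_vocab(texts)
encoded = [encode(text, vocab, max_len=12) for text in texts]
targets = binary_targets([labels for _, _, labels in RECORDS], ["A", "B", "C"])
```

In a setup like this, each `targets[c]` would train a separate binary classifier over the same `encoded` inputs, and a book's predicted label set would be the categories whose classifier returns a positive decision. Working at the character level avoids a word segmentation step, which is one plausible reading of the character embedding technique the abstract mentions.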
Deng Sanhong, Fu Yuyangzi, Wang Hao. Multi-Label Classification of Chinese Books with LSTM Model[J]. Data Analysis and Knowledge Discovery, 2017, 1(7): 52-60.