Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (3): 30-38    DOI: 10.11925/infotech.2096-3467.2017.0822
Identifying Interdisciplinary Social Science Research Based on Article Classification
Liu Liu1,2, Dongbo Wang2,3
1(School of Information Management, Nanjing University, Nanjing 210023, China)
2(Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210023, China)
3(College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China)
Abstract  

[Objective] This study quantitatively examines interdisciplinary research in the social sciences using automatic text classification, a machine learning technique. [Methods] We used the KNN algorithm to classify social science papers indexed by CNKI and then proposed a new method to calculate their degree of interdisciplinarity. [Results] Classification results differed significantly across disciplines, and they correlated significantly with the interdisciplinarity of the papers. [Limitations] More quantitative research is needed to extend the present study. [Conclusions] Machine learning can effectively identify interdisciplinary studies in the social sciences.
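
The abstract does not spell out the classifier setup or the interdisciplinarity formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a bag-of-words KNN with cosine similarity and, as a stand-in measure, the normalized Shannon entropy of the discipline labels among a paper's nearest neighbors. The toy English corpus, the function names, and the choice of k = 3 are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization; real Chinese text would need a word segmenter.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_label_distribution(query, train, k=3):
    # Share of each discipline label among the k most similar training papers.
    q = Counter(tokenize(query))
    ranked = sorted(
        train,
        key=lambda item: cosine(q, Counter(tokenize(item[0]))),
        reverse=True,
    )
    labels = Counter(label for _, label in ranked[:k])
    return {label: n / k for label, n in labels.items()}


def interdisciplinarity(dist, n_classes):
    # ASSUMPTION: normalized Shannon entropy as a stand-in score in [0, 1].
    # 0 means all neighbors share one discipline; 1 means an even spread.
    if n_classes < 2:
        return 0.0
    entropy = sum(p * math.log(1 / p) for p in dist.values() if p > 0)
    return entropy / math.log(n_classes)


# Hypothetical toy corpus of (text, discipline) pairs.
train = [
    ("labor market wage inequality", "economics"),
    ("monetary policy inflation market", "economics"),
    ("wage growth inflation policy", "economics"),
    ("classroom teaching student learning", "education"),
    ("student learning curriculum assessment", "education"),
    ("teacher training classroom assessment", "education"),
]

single = knn_label_distribution("inflation wage policy", train)
mixed = knn_label_distribution("student wage policy learning", train)
print(single, interdisciplinarity(single, n_classes=2))
print(mixed, interdisciplinarity(mixed, n_classes=2))
```

In this sketch, a query whose neighbors all come from one discipline scores zero, while a query whose neighbors are split between disciplines scores closer to one. Any such score would have to be checked against the definition given in the full paper.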

Key words: KNN; Text Classification; Interdisciplinarity
Received: 15 August 2017      Published: 03 April 2018

Cite this article:

Liu Liu, Dongbo Wang. Identifying Interdisciplinary Social Science Research Based on Article Classification. Data Analysis and Knowledge Discovery, 2018, 2(3): 30-38.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0822     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I3/30
