Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (3): 30-38    DOI: 10.11925/infotech.2096-3467.2017.0822
Identifying Interdisciplinary Social Science Research Based on Article Classification
Liu Liu1,2, Wang Dongbo2,3
1(School of Information Management, Nanjing University, Nanjing 210023, China)
2(Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210023, China)
3(College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China)
Abstract  

[Objective] This study quantitatively examines interdisciplinary social science research using automatic text classification, a machine learning technique. [Methods] We used the KNN algorithm to classify social science papers indexed by CNKI and then proposed a new method to calculate their degree of interdisciplinarity. [Results] Classification results differed significantly across disciplines. We also found a significant correlation between the classification results and the interdisciplinarity of papers. [Limitations] More quantitative research is needed to expand the present study. [Conclusions] Machine learning could effectively identify interdisciplinary social science studies.

Key words: KNN; Text Classification; Interdisciplinarity
Received: 15 August 2017      Published: 03 April 2018
ZTFLH:  G350  
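
The abstract names KNN as the classifier but this page does not describe the features, distance measure, or value of k that the authors used. The following is a minimal sketch of KNN text classification under assumed choices: TF-IDF weights, cosine similarity, k = 3, and a tiny invented English corpus that stands in for segmented paper text. None of these details come from the paper.

```python
import math
from collections import Counter

def weigh(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def fit_tfidf(docs):
    """Compute smoothed IDF values and TF-IDF vectors for tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    return [weigh(doc, idf) for doc in docs], idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(tokens, vectors, labels, idf, k=3):
    """Label a document by majority vote among its k most similar neighbours."""
    query = weigh(tokens, idf)
    ranked = sorted(zip(vectors, labels),
                    key=lambda pair: cosine(query, pair[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus (already tokenized); not data from the study.
train_docs = [
    ["library", "catalog", "retrieval", "index"],
    ["citation", "library", "retrieval"],
    ["market", "price", "inflation"],
    ["market", "trade", "price", "tariff"],
]
train_labels = ["LIS", "LIS", "Economics", "Economics"]

train_vecs, idf = fit_tfidf(train_docs)
prediction = knn_predict(["retrieval", "index", "library"],
                         train_vecs, train_labels, idf)
print(prediction)
```

Chinese text would need word segmentation before a step like this, and a real corpus would call for feature selection and a tuned k.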

Cite this article:

Liu Liu, Wang Dongbo. Identifying Interdisciplinary Social Science Research Based on Article Classification. Data Analysis and Knowledge Discovery, 2018, 2(3): 30-38.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0822     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I3/30

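Discipline                         Precision  Recall  F1
Sports Science                     0.99       0.98    0.99
Library and Information Science    0.98       0.97    0.98
Education                          0.96       0.98    0.97
Psychology                         0.97       0.97    0.97
Law                                0.96       0.96    0.96
Ethnology                          0.97       0.94    0.95
Human Geography                    0.95       0.94    0.94
Economics                          0.90       0.96    0.93
Environmental Science              0.93       0.92    0.92
Management Science                 0.92       0.85    0.89
Macro-average                      0.95       0.95    0.95
Micro-average                      0.94       0.94    0.94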
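
The macro-average row can be checked directly from the per-discipline rows, because a macro-average is the unweighted mean of each metric across classes. The sketch below copies the ten rows of the table above; the English discipline names and the function name are illustrative choices.

```python
# (precision, recall, F1) per discipline, copied from the table above.
rows = {
    "Sports Science": (0.99, 0.98, 0.99),
    "Library and Information Science": (0.98, 0.97, 0.98),
    "Education": (0.96, 0.98, 0.97),
    "Psychology": (0.97, 0.97, 0.97),
    "Law": (0.96, 0.96, 0.96),
    "Ethnology": (0.97, 0.94, 0.95),
    "Human Geography": (0.95, 0.94, 0.94),
    "Economics": (0.90, 0.96, 0.93),
    "Environmental Science": (0.93, 0.92, 0.92),
    "Management Science": (0.92, 0.85, 0.89),
}

def macro_average(rows):
    """Unweighted mean of each metric across classes, rounded to two places."""
    n = len(rows)
    return tuple(round(sum(r[i] for r in rows.values()) / n, 2) for i in range(3))

macro = macro_average(rows)
print(macro)
```

The micro-average row cannot be recomputed this way, since it pools per-class counts of true positives, false positives, and false negatives that the page does not list. If each paper receives exactly one label, micro-averaged precision, recall, and F1 all equal overall accuracy, which is consistent with the three identical values in that row.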
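Discipline                         Precision  Recall  F1
Sports Science                     0.94       0.90    0.92
Library and Information Science    0.86       0.84    0.85
Education                          0.82       0.85    0.83
Psychology                         0.81       0.81    0.81
Law                                0.81       0.79    0.80
Ethnology                          0.78       0.76    0.77
Economics                          0.64       0.76    0.69
Human Geography                    0.65       0.58    0.62
Environmental Science              0.65       0.51    0.57
Management Science                 0.52       0.48    0.50
Macro-average                      0.75       0.73    0.74
Micro-average                      0.73       0.73    0.73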
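
F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R), so each F1 value in the table above can be recomputed from the other two columns. The sketch below does this. Because the published precision and recall are already rounded to two decimals, a recomputed value can differ slightly from the reported one, so the comparison uses a tolerance of 0.01, which is an assumption and not a threshold from the paper.

```python
# (precision, recall, reported F1) per discipline, copied from the table above.
rows = {
    "Sports Science": (0.94, 0.90, 0.92),
    "Library and Information Science": (0.86, 0.84, 0.85),
    "Education": (0.82, 0.85, 0.83),
    "Psychology": (0.81, 0.81, 0.81),
    "Law": (0.81, 0.79, 0.80),
    "Ethnology": (0.78, 0.76, 0.77),
    "Economics": (0.64, 0.76, 0.69),
    "Human Geography": (0.65, 0.58, 0.62),
    "Environmental Science": (0.65, 0.51, 0.57),
    "Management Science": (0.52, 0.48, 0.50),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Absolute gap between recomputed and reported F1 for each discipline.
gaps = {name: abs(f1_score(p, r) - reported)
        for name, (p, r, reported) in rows.items()}
worst = max(gaps, key=gaps.get)
print(f"largest gap: {worst} ({gaps[worst]:.3f})")
```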
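Discipline                         Degree of interdisciplinarity  Recall of automatic classification
Sports Science                     0.08                           0.92
Education                          0.14                           0.86
Library and Information Science    0.15                           0.85
Ethnology                          0.21                           0.79
Human Geography                    0.22                           0.78
Psychology                         0.22                           0.78
Environmental Science              0.36                           0.64
Law                                0.36                           0.64
Economics                          0.46                           0.54
Management Science                 0.52                           0.48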
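
In every row of the table above, the degree of interdisciplinarity and the recall of automatic classification sum to 1. The page does not reproduce the paper's formula, so reading the degree as 1 minus recall is only an inference from these numbers. The sketch below checks that relation against the ten rows; the function name is hypothetical.

```python
# (degree of interdisciplinarity, recall) per discipline, copied from the table above.
table = {
    "Sports Science": (0.08, 0.92),
    "Education": (0.14, 0.86),
    "Library and Information Science": (0.15, 0.85),
    "Ethnology": (0.21, 0.79),
    "Human Geography": (0.22, 0.78),
    "Psychology": (0.22, 0.78),
    "Environmental Science": (0.36, 0.64),
    "Law": (0.36, 0.64),
    "Economics": (0.46, 0.54),
    "Management Science": (0.52, 0.48),
}

def degree_from_recall(recall):
    """Assumed relation inferred from the table: degree = 1 - recall."""
    return round(1.0 - recall, 2)

# True only if every row matches the assumed relation.
consistent = all(degree_from_recall(recall) == degree
                 for degree, recall in table.values())
print(consistent)
```

Under this reading, a discipline whose papers the classifier often assigns elsewhere, such as Management Science, scores as more interdisciplinary than one whose papers are almost always recognized, such as Sports Science.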