Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 129-138    DOI: 10.11925/infotech.2096-3467.2019.0967
Research on Academic Evaluation Based on Fine-Grain Citation Sentimental Quantification
Jiang Lin1,2, Zhang Qilin3
1School of Economics and Management, Nantong University, Nantong 226019, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing University, Nanjing 210023, China
3Southwest University Library, Chongqing 400715, China
Abstract  

[Objective] This paper applies sentiment analysis to mine and quantify the sentiment expressed in citation content, providing a sounder theoretical basis and data support for assessing the intrinsic value of academic literature. [Methods] Using journal papers retrieved from CNKI as a sample, we performed fine-grained sentiment analysis and sentiment quantification on the citation content of the citing papers to assess the intrinsic academic value of the cited papers, and on that basis proposed a new academic evaluation method. [Results] In the experiments, the dispersion coefficient of the citation-sentiment-based method was 0.12 higher than that of the traditional method based on cited frequency, and the Spearman correlation coefficient reached 0.981. [Limitations] Because no full-text citation database exists in China, experimental data are difficult to obtain, and the sample used in the experiment is small. [Conclusions] The academic evaluation method based on fine-grained citation sentiment quantification discriminates between papers more sharply and measures the intrinsic academic value of the literature more effectively.

Key words: Citation Content; Fine-Grained Sentiment Analysis; Sentimental Quantification; Academic Evaluation
Received: 26 August 2019      Published: 21 April 2020
ZTFLH:  G354  
Corresponding Authors: Jiang Lin     E-mail: Jianglin@ntu.edu.cn

Cite this article:

Jiang Lin,Zhang Qilin. Research on Academic Evaluation Based on Fine-Grain Citation Sentimental Quantification. Data Analysis and Knowledge Discovery, 2020, 4(6): 129-138.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0967     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I6/129

Schematic Diagram of Research
Experiment No. | Precision | Recall | F1
1 | 93.10% | 84.38% | 88.53%
2 | 87.10% | 90.00% | 88.53%
3 | 86.20% | 89.29% | 87.72%
4 | 90.90% | 85.71% | 88.23%
5 | 84.85% | 87.50% | 86.15%
Average | 88.43% | 87.38% | 87.73%
The Result of Sentimental Citation Recognition
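The F1 values in the table above are consistent with the usual harmonic mean of precision and recall. A minimal Python sketch, assuming that standard definition; the helper name `f1_score` is illustrative and does not come from the paper:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs for the five runs, copied from the table.
runs = [
    (0.9310, 0.8438),
    (0.8710, 0.9000),
    (0.8620, 0.8929),
    (0.9090, 0.8571),
    (0.8485, 0.8750),
]
f1_values = [f1_score(p, r) for p, r in runs]
```

Each entry of `f1_values` can then be compared with the F1 column row by row.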
Citation sentiment | Count | Share
Review-type citation | 283 | 68.03%
Sentiment-bearing citation | 133 | 31.97%
Total | 416 | 100.00%
The Result of Sentimental Citation Classification
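 | 召回率 (recall) | 降低 (decrease) | 准确率 (precision) | (unlabeled) | (unlabeled) | 大大提高 (greatly improve)
召回率 (recall) | 0 | 2 | 0 | 0 | 2 | 0
降低 (decrease) | 2 | 0 | 0 | 0 | 1 | 0
准确率 (precision) | 0 | 0 | 0 | 2 | 0 | 2
(unlabeled) | 0 | 0 | 2 | 0 | 0 | 1
(unlabeled) | 2 | 1 | 0 | 0 | 0 | 0
大大提高 (greatly improve) | 0 | 0 | 2 | 1 | 0 | 0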
Word Co-occurrence Matrix
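 | 召回率 (recall) | 降低 (decrease) | 准确率 (precision) | (unlabeled) | (unlabeled) | 大大提高 (greatly improve)
召回率 (recall) | 1.00 | 0.32 | 0.00 | 0.00 | 0.32 | 0.00
降低 (decrease) | 0.32 | 1.00 | 0.00 | 0.00 | 0.80 | 0.00
准确率 (precision) | 0.00 | 0.00 | 1.00 | 0.32 | 0.00 | 0.32
(unlabeled) | 0.00 | 0.00 | 0.32 | 1.00 | 0.00 | 0.80
(unlabeled) | 0.32 | 0.80 | 0.00 | 0.00 | 1.00 | 0.00
大大提高 (greatly improve) | 0.00 | 0.00 | 0.32 | 0.80 | 0.00 | 1.00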
Word Similarity Distance Matrix
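The similarity values above can be reproduced from the co-occurrence counts by taking the cosine similarity between rows of the co-occurrence matrix. This page does not state the formula, so the sketch below is an observation about the numbers rather than a statement of the authors' documented method; the names `cooccurrence`, `cosine`, and `similarity` are illustrative.

```python
import math

# Co-occurrence counts copied from the word co-occurrence matrix (6 x 6).
cooccurrence = [
    [0, 2, 0, 0, 2, 0],
    [2, 0, 0, 0, 1, 0],
    [0, 0, 0, 2, 0, 2],
    [0, 0, 2, 0, 0, 1],
    [2, 1, 0, 0, 0, 0],
    [0, 0, 2, 1, 0, 0],
]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Pairwise row similarities, rounded to two decimals as in the published matrix.
similarity = [[round(cosine(r, c), 2) for c in cooccurrence] for r in cooccurrence]
```

For instance, rows 2 and 5 share a dot product of 4 and each has squared norm 5, giving 4/5 = 0.80.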
Schematic Diagram of Word Polarity
Level | Examples
Extreme | 太 极为 极其 极度 最 最为 过 过于 分外
High | 很 挺 非常 特别 相当 十分 好不 颇 甚为 颇为 异常 深为 满 蛮 够 大为 何等 多么 格外 何其 尤其 无比 不胜 更 更加 更为 更其 越 越发 越加 备加 愈 愈发 愈加
Medium | 不太 不大 不甚 不够 较 比较 较为 还 相对
Low | 有点 有些 稍 稍稍 稍微 稍许 略微 略为 些许 多少
The Levels of Adverbs Hierarchy
The Trend of the Methods Based on Citation Sentiment and Cited Frequency
Indicator | Spearman correlation coefficient | Dispersion coefficient
Cited frequency | 0.981** | 1.319054
Citation sentiment evaluation value | | 1.439410
Citation Count and Citation Sentiment Index
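The two dispersion coefficients in the table differ by 1.439410 - 1.319054 = 0.120356, which matches the 0.12 reported in the abstract. As a rough illustration of how the two indicators are commonly computed, here is a sketch that assumes the dispersion coefficient is the population standard deviation divided by the mean and that Spearman's coefficient is the Pearson correlation of average ranks. The page does not spell out either formula, and the sample data below are hypothetical, not the paper's.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population std dev is an assumption)."""
    return statistics.pstdev(values) / statistics.mean(values)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data, NOT the paper's sample.
cited_frequency = [12, 5, 30, 8, 1]
sentiment_value = [3.2, 1.1, 9.5, 0.9, -0.4]
rho = spearman(cited_frequency, sentiment_value)
cv_frequency = coefficient_of_variation(cited_frequency)
cv_sentiment = coefficient_of_variation(sentiment_value)
```

A larger dispersion coefficient means the scores are spread more widely relative to their mean, which is the sense in which the abstract speaks of a higher degree of discrimination.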
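Cited document No. | Citing document | Citation content | Sentiment quantification result
1 | Research and Implementation of an Ontology-Based Chinese Information Extraction System | Because its extraction is based on an ontology, this method does not depend on document structure. In theory, as long as the domain ontology is powerful enough, it can achieve very high extraction precision and recall in information extraction for that domain. | (-0.5)×(-0.87)+1.5×0.76=1.575
3 | Image Retrieval Based on Deep Learning | Compared with ordinary multilayer neural networks, the deep belief network (DBN) uses its basic building block, the RBM, to give the network a fairly good initial value, which prevents the whole network from falling into a local minimum; it is also structurally simple and easy to extend. | 0.75×0.84+0.74+0.73=2.1
 | Research on Extracting the Basic Conceptual Structure of Sentences in Domain Texts | Using deep learning to process text and extract textual information and the implicit relations between texts can markedly improve analysis efficiency and uncover hidden yet valuable information. | 0.85=0.85
9 | Research and Application of News Extraction Technology Based on a Domain Lexicon | This kind of extraction mostly relies on manually formulated rules; it is hard for a computer to discover the rules automatically, and with today's endlessly varied internet slang the regularities are even harder to find, so it is very difficult. | 1.5×(-0.67)+1.5×(-0.67)+1.5×(-0.71)=-3.075
 | A Web Page Extraction Method Supporting Visual Configuration of DOM Templates | Manual configuration places high demands on expertise, requiring knowledge of web page structure, regular expressions, and so on; and because the configuration process is complex and requires manual input, it is inefficient and error-prone. | (-0.67)+(-0.63)+(-0.73)+(-0.57)=-2.6
26 | A Multi-Factor Method for Extracting Information on Science and Technology Experts Based on Web Data Mining | However, that paper does not distinguish the two different roles of the Table tag, so for web pages with complex structure and considerable noise it leaves behind a good deal of noisy information. | 0.75×(-0.62)=-0.465
31 | Automatic Multi-Document Summarization Based on a Hybrid Machine Learning Model | For example, Zhang Han and Zhao Yuhong proposed a semantic-graph-based model for medical multi-document summarization, which can effectively identify the core content of a text. | 0.68=0.68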
The Examples of Citation Content
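Each quantification result in the table is a sum of polarity values, some of them scaled by a multiplier. The multipliers presumably reflect degree adverbs and negation, but this page does not list the weights, so the sketch below simply takes the (multiplier, polarity) pairs as given in the worked results. The function name `quantify` and the dictionary keys are illustrative.

```python
def quantify(terms):
    """Sum weight * polarity over the (weight, polarity) pairs of one citation."""
    return sum(weight * polarity for weight, polarity in terms)


# (weight, polarity) pairs read off the worked results in the table;
# a bare polarity with no multiplier is written with weight 1.0.
examples = {
    "cited 1": [(-0.5, -0.87), (1.5, 0.76)],
    "cited 3, first citation": [(0.75, 0.84), (1.0, 0.74), (1.0, 0.73)],
    "cited 9, first citation": [(1.5, -0.67), (1.5, -0.67), (1.5, -0.71)],
    "cited 26": [(0.75, -0.62)],
}
scores = {name: round(quantify(terms), 3) for name, terms in examples.items()}
```

The rounded scores agree with the table: 1.575, 2.1, -3.075, and -0.465.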