Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 129-138    DOI: 10.11925/infotech.2096-3467.2019.0967
Research on Academic Evaluation Based on Fine-Grain Citation Sentimental Quantification
Jiang Lin1,2, Zhang Qilin3
1School of Economics and Management, Nantong University, Nantong 226019, China
2Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing University, Nanjing 210023, China
3Southwest University Library, Chongqing 400715, China
[Objective] This paper uses sentiment analysis to mine and quantify the citation sentiment expressed in citation content, providing a more rigorous theoretical basis and data support for discovering the intrinsic value of academic literature. [Methods] Taking journal papers retrieved from CNKI as an example, we performed fine-grained sentiment analysis and sentiment quantification on the citation content of the citing literature to assess the intrinsic academic value of the cited literature, and proposed a new academic evaluation method. [Results] In the experiments, the dispersion coefficient of the citation-sentiment-based method was 0.12 higher than that of the traditional method based on cited frequency, and the Spearman correlation coefficient between the two reached 0.981. [Limitations] Because no full-text citation database is available in China, experimental data are difficult to obtain, and the sample size in the experiment is small. [Conclusions] The academic evaluation method based on fine-grained citation sentiment quantification discriminates better between papers and measures the intrinsic academic value of the literature more effectively.

Key words: Citation Content; Fine-Grained Sentiment Analysis; Sentimental Quantification; Academic Evaluation
Received: 26 August 2019      Published: 21 April 2020
ZTFLH:  G354  
Corresponding Author: Jiang Lin

Cite this article:

Jiang Lin,Zhang Qilin. Research on Academic Evaluation Based on Fine-Grain Citation Sentimental Quantification. Data Analysis and Knowledge Discovery, 2020, 4(6): 129-138.

Schematic Diagram of Research
Experiment No.   Precision   Recall    F1
1                93.10%      84.38%    88.53%
2                87.10%      90.00%    88.53%
3                86.20%      89.29%    87.72%
4                90.90%      85.71%    88.23%
5                84.85%      87.50%    86.15%
Average          88.43%      87.38%    87.73%
The Result of Sentimental Citation Recognition
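The F1 column in the table above can be checked against the other two columns. The following is a minimal sketch, assuming the first metric column is precision (it is the reading under which the per-run F1 values reproduce) and that F1 is the usual harmonic mean of precision and recall. The function name `f1_score` and the variable names are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs for the five runs reported in the table.
runs = [
    (0.9310, 0.8438),
    (0.8710, 0.9000),
    (0.8620, 0.8929),
    (0.9090, 0.8571),
    (0.8485, 0.8750),
]

# Per-run F1 as a percentage, rounded to two decimals like the table.
f1_values = [round(f1_score(p, r) * 100, 2) for p, r in runs]

for run_no, value in enumerate(f1_values, start=1):
    print(f"run {run_no}: F1 = {value:.2f}%")
```

Recomputing the column this way is a quick consistency check on reported precision, recall, and F1 figures.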
Citation sentiment       Count   Share
Review-type citation     283     68.03%
Sentimental citation     133     31.97%
Total                    416     100.00%
The Result of Sentimental Citation Classification
                  Recall   Decrease   Precision   [...]   [...]   Greatly improve
Recall            0        2          0           0       2       0
Decrease          2        0          0           0       1       0
Precision         0        0          0           2       0       2
[...]             0        0          2           0       0       1
[...]             2        1          0           0       0       0
Greatly improve   0        0          2           1       0       0
Word Co-occurrence Matrix
                  Recall   Decrease   Precision   [...]   [...]   Greatly improve
Recall            1.00     0.32       0.00        0.00    0.32    0.00
Decrease          0.32     1.00       0.00        0.00    0.80    0.00
Precision         0.00     0.00       1.00        0.32    0.00    0.32
[...]             0.00     0.00       0.32        1.00    0.00    0.80
[...]             0.32     0.80       0.00        0.00    1.00    0.00
Greatly improve   0.00     0.00       0.32        0.80    0.00    1.00
Word Similarity Distance Matrix
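The similarity values above are numerically consistent with the cosine similarity between rows of the word co-occurrence matrix. This excerpt does not name the measure, so the sketch below treats cosine similarity as an assumption rather than as the authors' stated method. Rows are addressed by position because two of the word labels are missing from the tables; the names `cooccurrence`, `cosine`, and `similarity` are illustrative.

```python
import math

# Word co-occurrence counts, in the row order of the table
# (rows 4 and 5 have no surviving label in the source).
cooccurrence = [
    [0, 2, 0, 0, 2, 0],
    [2, 0, 0, 0, 1, 0],
    [0, 0, 0, 2, 0, 2],
    [0, 0, 2, 0, 0, 1],
    [2, 1, 0, 0, 0, 0],
    [0, 0, 2, 1, 0, 0],
]


def cosine(u, v):
    """Cosine of the angle between two count vectors; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Pairwise similarity between all rows, rounded to two decimals.
similarity = [
    [round(cosine(row_i, row_j), 2) for row_j in cooccurrence]
    for row_i in cooccurrence
]

for row in similarity:
    print(" ".join(f"{value:.2f}" for value in row))
```

Under this assumption, two words count as similar when they co-occur with the same other words, even if they never co-occur with each other.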
Schematic Diagram of Word Polarity
Level     Examples
Extreme   太 极为 极其 极度 最 最为 过 过于 分外
High      很 挺 非常 特别 相当 十分 好不 颇 甚为 颇为 异常
          深为 满 蛮 够 大为 何等 多么 格外 何其 尤其 无比
          不胜 更 更加 更为 更其 越 越发 越加 备加 愈 愈发 愈加
Medium    不太 不大 不甚 不够 较 比较 较为 还 相对
Low       有点 有些 稍 稍稍 稍微 稍许 略微 略为 些许 多少
The Levels of Adverbs Hierarchy
The Trend of the Methods Based on Citation Sentiment and Cited Frequency
Indicator                             Spearman correlation coefficient   Dispersion coefficient
Cited frequency                       0.981**                            1.319054
Citation sentiment evaluation value                                      1.439410
Citation Count and Citation Sentiment Index
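The two statistics in this table can be computed as sketched below. The sketch makes several assumptions that are not confirmed by the source: the dispersion coefficient is taken to be the coefficient of variation (standard deviation divided by mean), the sample standard deviation is used, and Spearman's coefficient is computed as the Pearson correlation of average ranks. The input lists are hypothetical placeholders, not the paper's data, and all function names are illustrative.

```python
import math
import statistics


def dispersion_coefficient(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five cited papers (NOT the paper's data).
cited_frequency = [1, 2, 2, 5, 12]
sentiment_value = [0.7, 1.6, 0.9, 4.1, 10.3]

print("dispersion (cited frequency):", dispersion_coefficient(cited_frequency))
print("dispersion (sentiment value):", dispersion_coefficient(sentiment_value))
print("Spearman:", spearman(cited_frequency, sentiment_value))
```

A higher dispersion coefficient means the indicator spreads the papers further apart relative to its mean. A Spearman coefficient near 1 means the two indicators rank the papers in nearly the same order.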
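Columns: Cited document No.; Citing document; Citation content; Sentiment quantification result

Cited document No. 1
  Citing document: Research and Implementation of an Ontology-Based Chinese Information Extraction System (基于Ontology的中文信息抽取系统的研究与实现)
  Citation content: Because it performs Ontology-based extraction, this method does not depend on the structure of the document. In theory, as long as the domain Ontology is powerful enough, it can achieve very high extraction precision and recall for information extraction in that domain.
  Sentiment quantification result: (-0.5)×(-0.87)+1.5×0.76=1.575

Cited document No. 3
  Citing document: Image Retrieval Based on Deep Learning (基于深度学习的图像检索)
  Citation content: Compared with an ordinary multi-layer neural network, the deep belief network (DBN) uses its basic building block, the RBM, to give the network a relatively good initial value, which prevents the whole network from falling into a local minimum; its structure is also simple and easy to extend.
  Sentiment quantification result: 0.75×0.84+0.74+0.73=2.1

  Citing document: Research on Extracting the Basic Conceptual Structure of Sentences in Domain Texts (领域文本句子基本概念结构抽取研究)
  Citation content: Using deep learning to process text and extract textual information and the implicit relations between texts can markedly improve the efficiency of analysis and uncover hidden but valuable information.
  Sentiment quantification result: 0.85=0.85

Cited document No. 9
  Citing document: Research and Application of News Extraction Technology Based on a Domain Lexicon (基于领域词库的新闻提取技术的研究及应用)
  Citation content: Most extraction approaches of this kind rely on manually written rules, and it is very hard for a computer to discover the rules automatically; with today's endlessly varied Internet slang it is even harder to find regularities, so the task is extremely difficult.
  Sentiment quantification result: 1.5×(-0.67)+1.5×(-0.67)+1.5×(-0.71)=-3.075

  Citing document: A Web Page Extraction Method Supporting Visual Configuration of DOM Templates (支持DOM模板可视化配置的网页抽取方法)
  Citation content: Manual configuration places high demands on expertise and requires knowledge of web page structure, regular expressions, and so on; because the configuration process is complex and requires manual input, it is also inefficient and error-prone.
  Sentiment quantification result: (-0.67)+(-0.63)+(-0.73)+(-0.57)=-2.6

Cited document No. 26
  Citing document: A Multi-Factor Method for Extracting Science and Technology Expert Information Based on Web Data Mining (基于Web数据挖掘的多因素科技专家信息提取方法)
  Citation content: However, that paper does not distinguish between the two different roles of the Table tag, and for web pages with complex structure and a lot of noise it leaves considerable noise information behind.
  Sentiment quantification result: 0.75×(-0.62)=-0.465

Cited document No. 31
  Citing document: Multi-Document Automatic Summarization Based on a Hybrid Machine Learning Model (基于混合机器学习模型的多文档自动摘要)
  Citation content: For example, Zhang Han and Zhao Yuhong proposed a semantic-graph-based model for medical multi-document summarization that can effectively identify the core content of a text.
  Sentiment quantification result: 0.68=0.68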
The Examples of Citation Content
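Each quantification result in the examples above has the same arithmetic shape: a sum over sentiment words, where each word's polarity value is optionally scaled by a modifier weight. The sketch below reproduces that arithmetic. The weights 1.5, 0.75, and -0.5 are copied from the expressions in the table; this excerpt does not say which adverb level or negation each one corresponds to. The function name `quantify`, the `(weight, polarity)` pair representation, and the dictionary keys are assumptions made for illustration.

```python
def quantify(terms):
    """Sum weight * polarity over (weight, polarity) pairs.

    A weight of 1.0 stands for a sentiment word with no modifier.
    """
    return sum(weight * polarity for weight, polarity in terms)


# (weight, polarity) pairs transcribed from the table's expressions.
# Keys are the cited document number plus a letter per citing document.
examples = {
    "1":  [(-0.5, -0.87), (1.5, 0.76)],
    "3a": [(0.75, 0.84), (1.0, 0.74), (1.0, 0.73)],
    "3b": [(1.0, 0.85)],
    "9a": [(1.5, -0.67), (1.5, -0.67), (1.5, -0.71)],
    "9b": [(1.0, -0.67), (1.0, -0.63), (1.0, -0.73), (1.0, -0.57)],
    "26": [(0.75, -0.62)],
    "31": [(1.0, 0.68)],
}

scores = {key: round(quantify(terms), 3) for key, terms in examples.items()}

for key, score in scores.items():
    print(f"citation {key}: {score}")
```

Positive totals indicate an approving citation and negative totals a critical one. Modifier weights above 1 amplify the polarity of the sentiment word they attach to, and a negative weight flips its sign.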
