National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] This paper constructs a sentiment lexicon for science, technology and innovation (STI) policy texts, aiming to identify and quantify the attitudes that policy makers embed in them. It addresses a gap in existing studies, which ignore the semantic intensity of words. [Methods] First, we summarized the characteristics of policy texts and proposed a method for constructing a degree lexicon. The method selects seed words based on expert knowledge, expands the set of domain degree words with the PMI algorithm, and screens the expanded words with Tongyici Cilin. Finally, we combined the TextRank algorithm with the new lexicon and validated the approach experimentally. [Results] The constructed degree lexicon yielded better results in policy text analysis than a traditional text mining algorithm used alone. [Limitations] The weights in our lexicon need to be refined. [Conclusions] The degree words in STI policy texts are abundant, standardized and stable. The new lexicon makes effective use of degree words and captures more semantic features of policy texts.
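The PMI-based expansion step mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pmi_scores`, the toy English tokens, and the choice of counting co-occurrence within a single sentence are all assumptions made for illustration. Each candidate word is scored by its highest pointwise mutual information with any seed word, PMI(w, s) = log2(p(w, s) / (p(w) p(s))), with probabilities estimated from sentence counts.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences, seed_words):
    """Score candidate words by their highest PMI with any seed word.

    `sentences` is a list of token lists. Co-occurrence is counted once
    per sentence (an assumption; the paper may use another window).
    Words that never co-occur with a seed word are left out of the result.
    """
    n = len(sentences)
    word_count = Counter()   # number of sentences containing each word
    pair_count = Counter()   # number of sentences containing both words
    for tokens in sentences:
        unique = set(tokens)
        word_count.update(unique)
        # Pairs are stored with the two words in sorted order.
        for a, b in combinations(sorted(unique), 2):
            pair_count[(a, b)] += 1

    scores = {}
    for word in word_count:
        if word in seed_words:
            continue
        best = None
        for seed in seed_words:
            joint = pair_count.get(tuple(sorted((word, seed))), 0)
            if joint == 0:
                continue  # PMI is undefined for pairs that never co-occur
            pmi = math.log2(joint * n / (word_count[word] * word_count[seed]))
            if best is None or pmi > best:
                best = pmi
        if best is not None:
            scores[word] = best
    return scores


# Hypothetical toy corpus; real input would be segmented policy sentences.
corpus = [
    ["vigorously", "promote", "innovation"],
    ["vigorously", "promote", "research"],
    ["promote", "talent"],
    ["steadily", "adjust", "budget"],
]
for word, score in sorted(pmi_scores(corpus, {"vigorously"}).items()):
    print(f"{word}: {score:.3f}")
```

In a pipeline like the one the abstract describes, high-scoring candidates would presumably then be screened (for example against a thesaurus such as Tongyici Cilin) before being added to the lexicon; that screening step is not shown here.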