Mining Policy Text Relevance with Syntactic Structure and Semantic Information
Wu Kaibiao, Lang Yuxiang, Dong Yu
National Science Library, Chinese Academy of Sciences, Beijing 100190, China; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
[Objective] This paper proposes a new method for analyzing the relevance between policy texts, aiming to capture deeper semantic information. [Methods] First, we built a new algorithm that combines dependency parsing with a word embedding model. Then, we analyzed the semantic relevance of policy texts at two levels: sentence information and word meaning. The method exploits the linguistic characteristics of policy texts to establish extraction rules over the dependency syntax. [Results] On a test dataset with a relatively low degree of policy text association, the new algorithm's F1 value reached 0.857, which was 22.78% higher than that of an algorithm fusing TF-IDF and cosine similarity. The method also characterizes policy text relevance at the level of subtle word differences. [Limitations] For semantic information mining, more research is needed to train word vector models for specific policy domains and further improve their accuracy. For sentence information mining, the accuracy of existing dependency parsing tools could be improved. [Conclusions] The proposed algorithm can effectively reveal associations between policy texts and offers new perspectives and tools for quantitative research on policy texts.
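To make the comparison in the Results concrete, the following is a minimal sketch of the kind of baseline the abstract refers to (TF-IDF weighting followed by cosine similarity), together with the F1 measure used for evaluation. It is not the authors' implementation: the function names, the smoothed IDF formula, and the toy tokenized documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption) keeps weights positive for common terms.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def f1_score(predicted, actual):
    """F1 over parallel lists of boolean 'related' judgements."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical, already-tokenized policy snippets (illustrative only).
docs = [
    ["promote", "artificial", "intelligence", "industry", "development"],
    ["accelerate", "artificial", "intelligence", "industry", "innovation"],
    ["improve", "rural", "medical", "insurance", "coverage"],
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # shares three terms
sim_unrelated = cosine(vecs[0], vecs[2])  # shares no terms
```

A baseline of this kind scores two texts only by overlapping surface terms, so texts that express related policy content in different wording receive a low score. That is the gap the abstract's combination of dependency syntax and word embeddings is intended to address.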