Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (12): 13-22    DOI: 10.11925/infotech.2096-3467.2022.0087
Identifying Topic-Problem Instances Based on Syntactic Dependency Enhancement
Wang Lu1,2, Le Xiaoqiu1,2
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract  

[Objective] This paper aims to identify the defects, deficiencies, and difficulties of existing research on a given topic. [Methods] First, we recast topic-problem instance pair extraction as candidate phrase classification. Second, we extracted candidate phrases from the problem sentences and constructed a syntactic dependency tree. Third, we built a syntactic-dependency-enhanced classification model based on a BiGCN and Transformer interaction module. Fourth, we used this model to identify the problem instances among the candidate phrases corresponding to a given topic. [Results] The proposed model effectively identified the problem instances and topic-problem instances. Its F1 value reached 83.7%, which is 2.8 percentage points higher than that of the baseline model. [Limitations] We did not examine the referential relationships between sentences, so some problem instances may be missed and recall reduced. [Conclusions] The proposed model could effectively identify topic and problem instances.
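
As a rough illustration of what graph convolution over a dependency tree involves, the sketch below runs one parameter-free propagation step along each edge direction and concatenates the two results. It is not the authors' model: the abstract does not give the BiGCN formulation, so the mean aggregation, the absence of learned weights, the function names (adjacency, propagate, bigcn_layer), and the hand-written toy parse are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def adjacency(n: int, edges: List[Tuple[int, int]], reverse: bool = False) -> Matrix:
    """Row-normalized adjacency with self-loops for directed (head, dependent) edges."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep in edges:
        src, dst = (dep, head) if reverse else (head, dep)
        a[dst][src] = 1.0  # node dst aggregates features from node src
    return [[v / sum(row) for v in row] for row in a]


def propagate(a: Matrix, x: Matrix) -> Matrix:
    """One step in which every node averages the features of itself and its neighbours."""
    dim = len(x[0])
    return [[sum(a[i][k] * x[k][d] for k in range(len(x))) for d in range(dim)]
            for i in range(len(a))]


def bigcn_layer(n: int, edges: List[Tuple[int, int]], x: Matrix) -> Matrix:
    """Concatenate head-to-dependent and dependent-to-head propagation (assumed design)."""
    fwd = propagate(adjacency(n, edges), x)
    bwd = propagate(adjacency(n, edges, reverse=True), x)
    return [f + b for f, b in zip(fwd, bwd)]


# Toy parse of "model is expensive" (hand-written assumption, not parser output):
# "expensive" (index 2) heads both "model" (0) and "is" (1).
edges = [(2, 0), (2, 1)]
x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot token features
out = bigcn_layer(3, edges, x)
```

Each row of out holds six values: the first three come from the head-to-dependent direction and the last three from the reverse direction, so a token sees both its governor and its dependents.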

Keywords: Problem Extraction; Syntactic Dependency; Transformer; Graph Neural Network
Received: 29 January 2022      Published: 03 February 2023
ZTFLH:  TP391  
Corresponding Author: Le Xiaoqiu, ORCID: 0000-0002-7114-5544, E-mail: lexq@mail.las.ac.cn

Cite this article:

Wang Lu, Le Xiaoqiu. Identifying Topic-Problem Instances Based on Syntactic Dependency Enhancement. Data Analysis and Knowledge Discovery, 2022, 6(12): 13-22.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0087     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I12/13

Processing Flow
Topic | Candidate phrase | Label
low-fidelity model | not very accurate | 1
low-fidelity model | may be quick to evaluate | 0
low-fidelity model | whereas a high-fidelity model may be computationally expensive to evaluate | 0
low-fidelity model | provides an accurate estimate of the true performance | 0
high-fidelity model | not very accurate | 0
high-fidelity model | a low-fidelity model | 0
high-fidelity model | may be quick to evaluate | 0
high-fidelity model | may be computationally expensive to evaluate | 1
high-fidelity model | provides an accurate estimate of the true performance | 0
Examples of Candidate Phrase Extraction and Labeling
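
The labeling scheme in the table above can be sketched as follows: every (topic, candidate phrase) combination becomes one binary classification instance, and the topic-problem pairs are the instances labeled 1. This is a minimal sketch, not the authors' code. The names Instance, build_instances, and positive_pairs are hypothetical, and the gold labels copied from the table stand in for the predictions a trained classifier would supply.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Instance:
    topic: str
    phrase: str
    label: int  # 1 = the phrase states a problem of this topic, 0 = it does not


def build_instances(candidates: Dict[str, List[str]],
                    gold: Set[Tuple[str, str]]) -> List[Instance]:
    """Pair every topic with each of its candidate phrases and attach a binary label."""
    return [Instance(topic, phrase, int((topic, phrase) in gold))
            for topic, phrases in candidates.items()
            for phrase in phrases]


def positive_pairs(instances: List[Instance]) -> List[Tuple[str, str]]:
    """Recover topic-problem pairs from the instances labeled 1."""
    return [(i.topic, i.phrase) for i in instances if i.label == 1]


# Candidate phrases and labels copied from the table above.
candidates = {
    "low-fidelity model": [
        "not very accurate",
        "may be quick to evaluate",
        "whereas a high-fidelity model may be computationally expensive to evaluate",
        "provides an accurate estimate of the true performance",
    ],
    "high-fidelity model": [
        "not very accurate",
        "a low-fidelity model",
        "may be quick to evaluate",
        "may be computationally expensive to evaluate",
        "provides an accurate estimate of the true performance",
    ],
}
gold = {
    ("low-fidelity model", "not very accurate"),
    ("high-fidelity model", "may be computationally expensive to evaluate"),
}

instances = build_instances(candidates, gold)
pairs = positive_pairs(instances)
```

Note that the same phrase ("not very accurate") receives different labels under the two topics, which is why the topic has to be part of each instance.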
Schematic Diagram of Multi-Word Topic Merging
The Model Framework
Interaction Module between Flat Representation and Dependency Representation
Model | Precision/% | Recall/% | F1/% | Sentence-level accuracy/%
BiLSTM | 82.3 | 79.9 | 80.9 | 75.9
This paper | 84.6 | 82.8 | 83.7 | 80.9
Results of Comparative Experiments
Model | Precision/% | Recall/% | F1/% | Sentence-level accuracy/%
Transformer | 82.7 | 81.2 | 81.9 | 77.9
BiGCN | 83.3 | 81.3 | 82.3 | 78.4
Transformer + BiGCN | 83.4 | 80.9 | 82.1 | 78.2
This paper | 84.6 | 82.8 | 83.7 | 80.9
Results of Ablation Experiments
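
The F1 values in the ablation table can be checked against the other two score columns. The sketch below assumes that the first two columns are precision and recall and that F1 is their harmonic mean; the page does not state how the scores were averaged, so this is a consistency check under that assumption rather than the authors' evaluation script.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the ablation table;
# "This paper" is the row for the proposed model.
rows = {
    "Transformer": (82.7, 81.2, 81.9),
    "BiGCN": (83.3, 81.3, 82.3),
    "Transformer + BiGCN": (83.4, 80.9, 82.1),
    "This paper": (84.6, 82.8, 83.7),
}

recomputed = {name: round(f1(p, r), 1) for name, (p, r, _) in rows.items()}
consistent = all(recomputed[name] == reported
                 for name, (_, _, reported) in rows.items())
```

Under this assumption the recomputed values match the reported F1 column for all four rows.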
Problem instance | Model | Identification result
1. Frequent changes in the relationship of members towards a community make the task of community detection even more challenging. | Transformer | /
 | BiGCN | {<community detection; frequent changes in the relationship of members towards a community>}
 | Transformer + BiGCN | {<community detection; frequent changes in the relationship of members towards a community>}
 | This paper | {<community detection; frequent changes in the relationship of members towards a community>}
2. Most of the existing community detection approaches ignore node attributes information, which leads to poor results. | Transformer | {<community detection; ignore node attributes information>}
 | BiGCN | {<community detection; ignore node attributes information, leads to poor results>}
 | Transformer + BiGCN | {<community detection; ignore node attributes information>}
 | This paper | {<community detection; ignore node attributes information, leads to poor results>}
3. The most basic and significant issue in complex network analysis is community detection, which is a branch of machine learning. | Transformer | /
 | BiGCN | {<community detection; is a branch of machine learning>}
 | Transformer + BiGCN | {<community detection; is a branch of machine learning>}
 | This paper | /
Example of Experimental Results