Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (5): 95-103    DOI: 10.11925/infotech.2096-3467.2021.0023
Generating AND-OR Logical Expressions for Semantic Features of Categorical Documents
Xu Zheng, Le Xiaoqiu
Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract

[Objective] This paper represents each category unit of a categorical document as an AND-OR logical expression over its semantic features, providing data for semantic matching and retrieval of categories. [Methods] Using AND-OR logical semantic annotations of category unit descriptions, we built a seq2seq generation model based on UniLM. The model learns part-of-speech features and explicit AND-OR logical text features, which are used to improve the ranking strategy of Beam Search, so that it generates an AND-OR logical expression of the semantic features within a category unit. We then extended the external semantics of each category unit by integrating context-level semantics. [Results] We evaluated the method on manually annotated International Patent Classification data. It achieved an evaluation score of 87.2 points, 11.5 points higher than the baseline model (BiLSTM-Attention). [Limitations] The model’s performance on other datasets remains to be examined. [Conclusions] The proposed semantic representation method effectively generates AND-OR logical expressions for patent data, integrating the internal semantic features of category units with context-level semantic features.
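To make the target representation concrete, the following is a minimal Python sketch of how an AND-OR logical expression over semantic features could be stored and matched against a set of query features. It is an illustration only, not the authors' implementation: the class names `Term`, `And`, and `Or`, the functions `matches` and `render`, and the example category unit are all hypothetical and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Term:
    """A leaf node: one semantic feature (a word or phrase)."""
    text: str


@dataclass(frozen=True)
class And:
    """Satisfied only when every operand is satisfied."""
    operands: tuple


@dataclass(frozen=True)
class Or:
    """Satisfied when at least one operand is satisfied."""
    operands: tuple


Expr = Union[Term, And, Or]


def matches(expr: Expr, features: set) -> bool:
    """Return True if the feature set satisfies the AND-OR expression."""
    if isinstance(expr, Term):
        return expr.text in features
    if isinstance(expr, And):
        return all(matches(op, features) for op in expr.operands)
    if isinstance(expr, Or):
        return any(matches(op, features) for op in expr.operands)
    raise TypeError(f"unsupported node type: {type(expr).__name__}")


def render(expr: Expr) -> str:
    """Serialize the expression as a parenthesized string."""
    if isinstance(expr, Term):
        return expr.text
    joiner = " AND " if isinstance(expr, And) else " OR "
    return "(" + joiner.join(render(op) for op in expr.operands) + ")"


# Hypothetical category unit description (an assumption for illustration,
# not an entry from the annotated patent classification data).
EXAMPLE = And((Or((Term("harvesting"), Term("mowing"))), Term("machine")))

if __name__ == "__main__":
    print(render(EXAMPLE))
    print(matches(EXAMPLE, {"mowing", "machine"}))
    print(matches(EXAMPLE, {"mowing"}))
```

In this sketch, a generation model such as the one described would be expected to emit the serialized string form, while matching and retrieval would operate on the parsed tree. The tree-based design is one plausible choice among several.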

Received: 10 January 2021      Published: 27 May 2021
Chinese Library Classification (CLC) number: TP391
Corresponding Author: Le Xiaoqiu      E-mail: lexq@mail.las.ac.cn
