%A Xu Zheng
%A Le Xiaoqiu
%T Generating AND-OR Logical Expressions for Semantic Features of Categorical Documents
%0 Journal Article
%D 2021
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2021.0023
%P 95-103
%V 5
%N 5
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_5066.shtml}
%8 2021-05-25
%X

[Objective] This paper represents each category unit of a categorical document as an AND-OR logical expression of semantic features, providing data for category semantic matching and retrieval. [Methods] Based on AND-OR logical semantic annotations of category unit descriptions, we built a seq2seq generation model using UniLM. The model learns speech features and explicit AND-OR logical text features in order to improve the ranking strategy of Beam Search. The method generates an AND-OR logical expression of the semantic features within each category unit and extends the unit's external semantics by integrating context-level semantics. [Results] We evaluated the method on manually annotated International Patent Classification data. It achieved an evaluation score of 87.2, which is 11.5 points higher than the baseline model (BiLSTM-Attention). [Limitations] More research is needed to examine the model's performance on other datasets. [Conclusions] The proposed semantic representation method can effectively generate AND-OR logical expressions for patent data, integrating the internal semantic features of each category unit with semantic features at the contextual level.
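To make the target representation concrete, the following is a minimal sketch of how an AND-OR logical expression over semantic features might be stored and used for matching. It is an illustration only, not the authors' implementation: the class names (`Feature`, `And`, `Or`), the helpers `matches` and `render`, and the example category description are all assumptions introduced here, and the example is invented rather than taken from the International Patent Classification.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Feature:
    """A single semantic feature (hypothetical leaf node)."""
    term: str


@dataclass(frozen=True)
class And:
    """All operands must hold (hypothetical AND node)."""
    operands: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    """At least one operand must hold (hypothetical OR node)."""
    operands: Tuple["Expr", ...]


Expr = Union[Feature, And, Or]


def matches(expr: Expr, terms: FrozenSet[str]) -> bool:
    """Return True if the set of query terms satisfies the expression."""
    if isinstance(expr, Feature):
        return expr.term in terms
    if isinstance(expr, And):
        return all(matches(op, terms) for op in expr.operands)
    if isinstance(expr, Or):
        return any(matches(op, terms) for op in expr.operands)
    raise TypeError(f"unsupported node: {expr!r}")


def render(expr: Expr) -> str:
    """Serialize the expression as a parenthesized AND/OR string."""
    if isinstance(expr, Feature):
        return expr.term
    joiner = " AND " if isinstance(expr, And) else " OR "
    return "(" + joiner.join(render(op) for op in expr.operands) + ")"


# Invented description: "cutting devices for fruit or vegetables".
EXAMPLE = And((
    Feature("cutting device"),
    Or((Feature("fruit"), Feature("vegetable"))),
))
```

With this structure, `matches(EXAMPLE, frozenset({"cutting device", "fruit"}))` holds because the AND node requires the device feature plus at least one branch of the OR node, while a query that contains only `fruit` and `vegetable` does not match. A generation model such as the one described would emit the serialized form, which `render` approximates.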