%A Wang Xueying, Wang Hao, Zhang Zixuan
%T Recognizing Semantics of Continuous Strings in Chinese Patent Documents
%0 Journal Article
%D 2018
%J Data Analysis and Knowledge Discovery
%R 10.11925/infotech.2096-3467.2017.1065
%P 11-22
%V 2
%N 5
%U {https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/abstract/article_4503.shtml}
%8 2018-05-25
%X

[Objective] This paper aims to extract semantic information from continuous strings in Chinese patent documents in the field of iron and steel metallurgy. [Methods] First, we collected strings whose semantics had already been identified to serve as the learning corpus. Then, using this corpus, we examined basic features as well as the characteristics of Chinese characters and strings to establish the best model. Finally, we used this model to recognize the semantics of other strings. [Results] The proposed model could effectively extract the semantics of continuous strings. [Limitations] We did not add the identified characters to the training corpus. [Conclusions] The new model could identify the semantics of continuous strings in Chinese patent documents, and it could also be used to study continuous strings in English literature.