Data Analysis and Knowledge Discovery
Research on Unsupervised Cross-language Patent Recommendation Based on Representation Learning
Zhang Jingzhu, Zhu Lipeng, Liu Jingjie
(School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094)
(Jiangsu Provincial Social Public Safety Science and Technology Collaborative Innovation Center, Nanjing 210094)
Abstract  

[Objective] This study aims to reduce the need to construct bilingual dictionaries and large-scale corpora, and to improve the effectiveness and domain adaptability of cross-language patent recommendation, from the perspective of the semantic representation of patent text. [Methods] First, we design an unsupervised cross-language word vector mapping method that maps independent Chinese and English word vectors into a unified semantic vector space through a linear transformation, establishing a semantic mapping between Chinese and English words. Second, we build a semantic representation of patent text from the cross-language word vectors using smooth inverse frequency (SIF) reweighting, so that Chinese and English patent texts are represented in the same vector space. Finally, we use a vector similarity measure to compute the semantic similarity between patent texts in different languages. [Results] Experiments on patents related to "wireless communication" show that the method achieves comprehensive and accurate Chinese-English cross-language patent recommendation. The top-1 and top-5 recommendation accuracy reached 55.63% and 77.82%, respectively, exceeding weakly supervised cross-language recommendation by 0.66% and 1.45% and machine-translation-based cross-language recommendation by 4.29% and 3.9%. [Limitations] Only Chinese and English patents in a specific field were examined, so the fields and languages covered need to be expanded. [Conclusions] The method can be extended to the research and application of patent recommendation in other domains and languages.
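The pipeline summarized in the abstract (linear mapping into a shared space, SIF-weighted text vectors, similarity ranking, top-k accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation. The mapping matrix `W`, the two-dimensional toy vectors, the word probabilities, the patent ids, and all function names are hypothetical values chosen for illustration. The unsupervised procedure that learns `W` is not shown, cosine similarity is assumed as the vector similarity measure, and the common-component removal step of the full SIF method is omitted for brevity.

```python
import math

def apply_map(W, x):
    """Apply a linear map W (a list of rows) to a word vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def sif_embed(tokens, vectors, word_prob, a=1e-3):
    """Average word vectors with SIF weights a / (a + p(w)).

    Simplification: the common-component removal of full SIF is omitted.
    """
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        if tok not in vectors:
            continue  # skip out-of-vocabulary tokens
        weight = a / (a + word_prob.get(tok, 0.0))
        for i, value in enumerate(vectors[tok]):
            total[i] += weight * value
        count += 1
    return [t / count for t in total] if count else total

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def recommend(query_vec, candidates, k):
    """Return the ids of the k candidates most similar to the query."""
    ranked = sorted(candidates,
                    key=lambda cid: cosine(query_vec, candidates[cid]),
                    reverse=True)
    return ranked[:k]

def top_k_accuracy(queries, candidates, gold, k):
    """Fraction of queries whose gold match appears in the top k."""
    hits = sum(1 for qid, qvec in queries.items()
               if gold[qid] in recommend(qvec, candidates, k))
    return hits / len(queries)

# Hypothetical toy data: W stands in for a learned mapping matrix.
W = [[0.0, 1.0], [1.0, 0.0]]
zh_source = {"天线": [0.0, 1.0], "信号": [1.0, 0.0]}
zh_vectors = {w: apply_map(W, v) for w, v in zh_source.items()}
en_vectors = {"antenna": [1.0, 0.0], "signal": [0.0, 1.0], "the": [0.7, 0.7]}
word_prob = {"the": 0.05, "antenna": 0.001, "signal": 0.001,
             "天线": 0.001, "信号": 0.001}

# English candidate patents and Chinese query patents (token lists).
candidates = {
    "EN1": sif_embed(["the", "antenna"], en_vectors, word_prob),
    "EN2": sif_embed(["the", "signal"], en_vectors, word_prob),
}
queries = {
    "ZH1": sif_embed(["天线"], zh_vectors, word_prob),
    "ZH2": sif_embed(["信号"], zh_vectors, word_prob),
}
gold = {"ZH1": "EN1", "ZH2": "EN2"}

print(recommend(queries["ZH1"], candidates, k=1))
print(top_k_accuracy(queries, candidates, gold, k=1))
```

In this toy setup the frequent word "the" receives a much smaller SIF weight than the rarer content words, so each text vector is dominated by its content word and the Chinese queries rank their English counterparts first.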

Key words: cross-language; patent recommendation; representation learning; text semantics
Published: 28 July 2020
ZTFLH:  G254  

Cite this article:

Zhang Jingzhu, Zhu Lipeng, Liu Jingjie. Research on Unsupervised Cross-language Patent Recommendation Based on Representation Learning. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0272     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1
