Data Analysis and Knowledge Discovery
A Method for Author Name Disambiguation Based on Heterogeneous Information Network
Deng Qiping,Chen Weijing,Ji Ling,Zhang Yue
(Library, University of Electronic Science and Technology of China, Chengdu 611731, China)
[Objective] This paper aims to make full use of entity relationship data in academic literature to solve the problem of author name disambiguation. [Methods] First, we extracted multiple types of nodes and relationships from the literature to construct a heterogeneous information network (HIN). Then, we applied representation learning to obtain latent vectors of authors and used clustering analysis to get a preliminary partition. Finally, we merged clusters based on strong rule matching to obtain the disambiguation results. [Results] Experiments on a dataset we constructed from Web of Science show that our method performs well. The mean K-Metric value was 0.842, a 63.18% increase over the baseline method; without strong rule matching, the increase was still 34.69%. [Limitations] Our method requires citation information, so its application scenarios are limited. [Conclusions] Building on an HIN, using richer entity relations to learn feature vectors of author nodes can improve the performance of author name disambiguation.
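
The abstract reports results as a K-Metric value without defining it. The sketch below assumes the definition commonly used in author name disambiguation work: the geometric mean of average cluster purity (ACP) and average author purity (AAP). The function name `k_metric` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import sqrt


def k_metric(true_labels, pred_labels):
    """Return (ACP, AAP, K) for a clustering of publication records.

    Assumed definition (not stated in the abstract):
      ACP = (1/N) * sum over predicted clusters i and true authors j of n_ij^2 / n_i
      AAP = (1/N) * sum over true authors j and predicted clusters i of n_ij^2 / n_j
      K   = sqrt(ACP * AAP)
    where n_ij counts records of true author j placed in predicted cluster i.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one record is required")

    # n_ij: records shared by predicted cluster i and true author j
    pair_counts = Counter(zip(pred_labels, true_labels))
    cluster_sizes = Counter(pred_labels)  # n_i
    author_sizes = Counter(true_labels)   # n_j

    acp = sum(c * c / cluster_sizes[p] for (p, _), c in pair_counts.items()) / n
    aap = sum(c * c / author_sizes[t] for (_, t), c in pair_counts.items()) / n
    return acp, aap, sqrt(acp * aap)


# Toy example: four records, two real authors, one record misplaced.
true_authors = ["a", "a", "b", "b"]
predicted = [0, 0, 0, 1]
acp, aap, k = k_metric(true_authors, predicted)
```

Under this definition, K equals 1.0 only when the predicted clusters match the true authors exactly. Merging too many records lowers ACP, and splitting one author across clusters lowers AAP.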

Keywords: Author Name Disambiguation; Relational Data; Heterogeneous Information Network; Network Representation Learning
Published: 25 November 2021
Deng Qiping, Chen Weijing, Ji Ling, Zhang Yue. A Method for Author Name Disambiguation Based on Heterogeneous Information Network. Data Analysis and Knowledge Discovery

