Data Analysis and Knowledge Discovery
A Method for Author Name Disambiguation Based on Heterogeneous Information Network
Deng Qiping,Chen Weijing,Ji Ling,Zhang Yue
(Library, University of Electronic Science and Technology of China, Chengdu 611731, China)
Abstract  

[Objective] This paper aims to make full use of entity relationship data in academic literature to solve the problem of author name disambiguation. [Methods] First, we extracted multiple types of nodes and relationships from the literature to construct a heterogeneous information network (HIN). Then we applied representation learning to obtain latent vectors of authors and used clustering analysis to obtain a preliminary partition. Finally, we merged several clusters based on strong rule matching to obtain the disambiguation results. [Results] Experiments on a dataset that we constructed from Web of Science show that our method performs well. The mean K-Metric was 0.842, an increase of 63.18% over the baseline method, and it still increased by 34.69% when strong rule matching was not applied. [Limitations] Our method requires citation information, so its application scenarios are limited. [Conclusions] Using the richer entity relations in an HIN to learn feature vectors of author nodes can improve the performance of author name disambiguation.
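
The final merging step and the evaluation metric can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract specifies neither the strong rules nor the K-Metric formula. The sketch assumes a hypothetical strong rule (two papers carrying the same identifier, such as an author e-mail, belong to the same person) and the commonly used definition of the K-Metric as the geometric mean of average cluster purity (ACP) and average author purity (AAP). The function and parameter names are illustrative only.

```python
from collections import Counter
from math import sqrt


def merge_clusters(labels, strong_keys):
    """Merge preliminary clusters whose papers share a strong-rule key.

    labels[i]      -- preliminary cluster id of paper i
    strong_keys[i] -- hypothetical strong identifier of paper i
                      (e.g. an e-mail address), or None if unavailable
    """
    parent = {c: c for c in set(labels)}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    first_cluster = {}  # strong key -> first cluster seen with that key
    for c, key in zip(labels, strong_keys):
        if key is None:
            continue
        if key in first_cluster:
            parent[find(c)] = find(first_cluster[key])
        else:
            first_cluster[key] = c
    return [find(c) for c in labels]


def k_metric(true_labels, pred_labels):
    """K = sqrt(ACP * AAP), assuming the common purity-based definition."""
    n = len(true_labels)
    pair = Counter(zip(pred_labels, true_labels))  # papers per (cluster, author)
    cluster_size = Counter(pred_labels)
    author_size = Counter(true_labels)
    acp = sum(v * v / cluster_size[p] for (p, _), v in pair.items()) / n
    aap = sum(v * v / author_size[t] for (_, t), v in pair.items()) / n
    return sqrt(acp * aap)


# Toy data: four papers, three by author "x" and one by author "y".
truth = ["x", "x", "x", "y"]
preliminary = [0, 0, 1, 2]        # clustering split author "x" in two
keys = ["a", None, "a", None]     # papers 0 and 2 share a strong key
merged = merge_clusters(preliminary, keys)
k_before = k_metric(truth, preliminary)
k_after = k_metric(truth, merged)
```

In this toy case the shared key joins the two clusters that belong to author "x", so the K-Metric rises after merging. Union-find is used here only because it keeps merges transitive; any equivalent grouping would serve.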

Key words: Author Name Disambiguation; Relational Data; Heterogeneous Information Network; Network Representation Learning
Published: 25 November 2021
ZTFLH:  TP391,G350  

Cite this article:

Deng Qiping, Chen Weijing, Ji Ling, Zhang Yue. A Method for Author Name Disambiguation Based on Heterogeneous Information Network. Data Analysis and Knowledge Discovery, 0, (): 1-.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0805     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y0/V/I/1
