Developments of Named Entity Disambiguation
Wen Pingmei1, Ye Zhiwei1, Ding Wenjian1, Liu Ying2, Xu Jian1
1School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China
2Sun Yat-Sen University Library, Guangzhou 510275, China

Abstract [Objective] This paper reviews research and resources in the field of named entity disambiguation (NED), with a focus on NED methods. [Coverage] We retrieved 57 representative papers and electronic resources from CNKI, the Wanfang Data Knowledge Service Platform, and EBSCO. [Methods] First, we summarized NED principles and methods from five perspectives: entity prominence, context similarity, entity relationships, deep learning, and special identification resources. Then, we surveyed useful knowledge bases, open source tools, and international evaluation conferences on NED. [Results] Traditional, classic methods are easy to use, while newer ones (e.g., deep learning) significantly improve NED results. Effective models often integrate several methods to achieve the best results. [Limitations] Comparing methods on the basis of the literature involves some subjectivity. [Conclusions] NED methods are still developing and could be further improved with artificial intelligence and domain resources.

Received: 04 May 2020
Published: 14 October 2020

Corresponding Author: Liu Ying
E-mail: pusly@mail.sysu.edu.cn