[Objective] This paper reviews research and resources in the field of named entity disambiguation (NED), with a focus on NED methods. [Coverage] We retrieved 57 representative papers and electronic resources from CNKI, the Wanfang Data Knowledge Service Platform, and EBSCO. [Methods] First, we summarized NED principles and methods from five perspectives: entity prominence, context similarity, entity relationships, deep learning, and special identification resources. Then, we surveyed useful knowledge bases, open-source tools, and international evaluation conferences for NED. [Results] Traditional, classic methods are easy to use, while newer ones (e.g., deep learning) significantly improve NED results. Effective models often integrate several methods to achieve the best results. [Limitations] Comparing different methods on the basis of the literature involves some subjectivity. [Conclusions] NED methods are still developing and could be further improved with artificial intelligence and domain resources.