This paper addresses Chinese personal name disambiguation using a hierarchical clustering algorithm and identifies, through experiments, several features that are effective for the task. The authors use term frequency (TF) to compute feature weights, and obtain better results after adding hand-crafted rules for extracting personal names from documents. On a test corpus containing 191 ambiguous names, the method achieves an average F-value (α = 0.5) of 88.15%.
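As a rough illustration of the approach summarized above, the following minimal sketch clusters documents that mention the same name using TF-weighted feature vectors and bottom-up hierarchical clustering, and computes an F-value with the α parameter. The abstract does not specify the similarity measure, the linkage criterion, the stopping threshold, or the precision and recall definitions, so cosine similarity, average linkage, the `threshold` parameter, and all function names here are assumptions made for the example, not the authors' implementation.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Raw term-frequency weights: each feature's weight is its count."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse TF vectors (assumed measure)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(docs, threshold):
    """Agglomerative clustering with average linkage (an assumption).

    Starts with one cluster per document and repeatedly merges the most
    similar pair of clusters until no pair reaches `threshold`.
    """
    vectors = [tf_vector(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def avg_sim(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best_sim:
                    best_sim, best_pair = s, (x, y)
        if best_sim < threshold:
            break  # remaining clusters are treated as different people
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


def f_value(p, r, alpha=0.5):
    """F = 1 / (alpha / P + (1 - alpha) / R); alpha = 0.5 gives the harmonic mean."""
    if p == 0 or r == 0:
        return 0.0
    return 1.0 / (alpha / p + (1 - alpha) / r)
```

Each document is represented here as a list of already extracted feature tokens; in practice the features would come from word segmentation and the name-extraction rules, which this sketch does not attempt to reproduce. Each resulting cluster is a list of document indices assumed to refer to the same person.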
Zhang Shunrui (章顺瑞), You Hongliang (游宏梁). Chinese People Name Disambiguation by Hierarchical Clustering. New Technology of Library and Information Service (现代图书情报技术), 2010, 26(11): 64-68.