Chinese People Name Disambiguation by Hierarchical Clustering
Zhang Shunrui, You Hongliang
China Defense Science & Technology Information Center, Beijing 100142, China
Abstract: This paper addresses the task of Chinese personal name disambiguation using a hierarchical clustering algorithm and identifies, through experiments, several features that work well for the task. The authors use term frequency (TF) to compute feature weights, and obtain better results after applying hand-crafted rules for extracting personal names from documents. On a test corpus containing 191 ambiguous names, the method achieves an average F-value (α = 0.5) of 88.15%.
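The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature set, the similarity measure, the linkage criterion, or the stopping rule, so cosine similarity, average linkage, and the `threshold` value below are all assumptions chosen for illustration, and the function names are hypothetical. The `f_alpha` helper uses the standard weighted F-measure, which reduces to the harmonic mean of precision and recall at α = 0.5; how precision and recall are defined for clusters in the paper is not given here.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Raw term-frequency weights for one document's feature tokens."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse TF vectors (assumed measure)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Agglomerative clustering of documents that mention the same name.

    Starts with one cluster per document and repeatedly merges the two
    clusters with the highest average pairwise similarity, stopping when
    that similarity falls below `threshold` (an illustrative value).
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [tf_vector(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def link(c1, c2):
        # Average linkage: mean similarity over all cross-cluster pairs.
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (x, y) = max(
            (link(clusters[x], clusters[y]), (x, y))
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def f_alpha(precision, recall, alpha=0.5):
    """Weighted F-measure: 1 / (alpha / P + (1 - alpha) / R)."""
    if precision == 0 or recall == 0:
        return 0.0
    return 1.0 / (alpha / precision + (1 - alpha) / recall)


# Toy feature tokens for four documents mentioning the same name.
docs = [
    ["professor", "university", "physics"],
    ["university", "physics", "lecture"],
    ["football", "striker", "club"],
    ["club", "football", "coach"],
]
clusters = cluster(docs)
```

In this toy input the first two documents share academic terms and the last two share sports terms, so the sketch groups them into two clusters, one per presumed person. In practice the token lists would come from a word segmenter and whichever features the experiments favor.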
Received: 29 September 2010
Published: 04 January 2011