[Objective] To improve data reliability in large-scale academic assessment and to improve on the performance of word-similarity-based and frequency-based techniques in institution name normalization. [Methods] A new rule-based algorithm aided by low-value word similarity is proposed. A series of rules and statistical methods are applied jointly to map multiple institution name variants onto a single institution entity, thereby normalizing the names. [Results] The experimental results show that the F-value of the rule-based algorithm (55.50%) is higher than those of the other two strategies. [Limitations] The algorithm's ability to identify institution names with low word similarity remains insufficient. [Conclusions] Overall, the proposed rule-based algorithm performs better than the other two techniques, although its recall needs to be improved.
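The general idea of mapping several name variants onto one institution entity can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not list the rules, so the abbreviation table, the stopword list, the 0.85 similarity threshold, and the function names `canonical_key`, `normalize`, and `f_value` are all assumptions made for illustration. Rules are applied first (abbreviation expansion, stopword removal, word-order normalization), and a string-similarity check is used only as a fallback.

```python
import re
from difflib import SequenceMatcher

# Hypothetical rule tables; the paper's actual rule set is not given in the abstract.
ABBREVIATIONS = {
    "univ": "university",
    "inst": "institute",
    "technol": "technology",
    "sci": "science",
    "acad": "academy",
}
STOPWORDS = {"of", "the", "and", "for"}


def canonical_key(name):
    """Apply the illustrative rules and return an order-independent key."""
    tokens = re.findall(r"[a-z]+", name.lower())
    expanded = [ABBREVIATIONS.get(t, t) for t in tokens]
    return " ".join(sorted(t for t in expanded if t not in STOPWORDS))


def normalize(names, threshold=0.85):
    """Group raw name variants by canonical key.

    If the rules do not produce an exact key match, fall back to
    character-level similarity against the existing keys (assumed threshold).
    """
    entities = {}  # canonical key -> list of raw name variants
    for name in names:
        key = canonical_key(name)
        if key not in entities:
            for existing in entities:
                if SequenceMatcher(None, key, existing).ratio() >= threshold:
                    key = existing
                    break
        entities.setdefault(key, []).append(name)
    return entities


def f_value(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


variants = [
    "Harbin Inst Technol",
    "Harbin Institute of Technology",
    "Harbin Institute of Tecnology",  # misspelling, caught by the fallback
    "Univ Sci & Technol China",
]
groups = normalize(variants)
```

In this sketch the first two variants share a key through the rules alone, the misspelled third variant joins them through the similarity fallback, and the fourth forms its own entity. A fallback like this cannot merge variants whose keys look very different, which is consistent with the limitation noted in the abstract.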
杨波, 杨军威, 阎素兰. 基于规则的机构名规范化研究[J]. 现代图书情报技术, 2015, 31(6): 57-63.
Yang Bo, Yang Junwei, Yan Sulan. Research on Rule-based Normalization of Institution Name. New Technology of Library and Information Service, 2015, 31(6): 57-63.