New Technology of Library and Information Service  2007, Vol. 2 Issue (7): 22-26    DOI: 10.11925/infotech.1003-3513.2007.07.06
Study on String-based Matching of Information Integration
Sun Haixia, Cheng Ying
(Department of Information Management, Nanjing University, Nanjing 210093, China)

Matching is one of the most important techniques in information integration. This paper describes string-based matching algorithms, mainly distance-based, token-based, and N-gram methods, and outlines their deficiencies and directions for further research.
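
The three families named in the abstract can be illustrated with a minimal sketch. The abstract does not say which specific measures the paper covers, so the choices below are assumptions: Levenshtein edit distance stands in for the distance-based family, Jaccard similarity over whitespace-separated words for the token-based family, and the Dice coefficient over character n-grams for the N-gram family. All function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Distance-based: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def token_jaccard(a: str, b: str) -> float:
    """Token-based: Jaccard overlap of lower-cased word sets (assumed
    tokenisation: split on whitespace)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def ngrams(s: str, n: int = 3) -> set:
    """Set of character n-grams of s; empty if s is shorter than n."""
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """N-gram: Dice coefficient over the two character n-gram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


if __name__ == "__main__":
    pair = ("Information Integration", "Integration of Information")
    print(edit_similarity(*pair))
    print(token_jaccard(*pair))
    print(ngram_similarity(*pair))
```

In this sketch the distance-based score is sensitive to word order, while the token-based score ignores it, which is one reason such measures are often combined when matching field names or bibliographic records.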

Key words: Matching; Information integration; String-based matching
Received: 01 June 2007      Published: 25 July 2007


Corresponding author: Sun Haixia
About authors: Sun Haixia, Cheng Ying

Cite this article:

Sun Haixia, Cheng Ying. Study on String-based Matching of Information Integration. New Technology of Library and Information Service, 2007, 2(7): 22-26.


