Study on String-based Matching of Information Integration
Sun Haixia, Cheng Ying
(Department of Information Management, Nanjing University, Nanjing 210093, China)
Abstract: Matching is one of the most important techniques in information integration. This paper describes string-based matching algorithms, mainly distance-based, token-based, and N-gram methods. Their deficiencies and directions for further research are also outlined.
Received: 01 June 2007
Published: 25 July 2007
Corresponding Authors:
Sun Haixia
E-mail: sunyiqin1984@yahoo.com.cn
About the authors: Sun Haixia, Cheng Ying