New Technology of Library and Information Service  2016, Vol. 32 Issue (9): 88-94    DOI: 10.11925/infotech.1003-3513.2016.09.11
Original Article
Using Bidirectional Pattern Matching Model to Pre-Process Yearbook Data
Shi Liting1, Zhang Qian2, Zhong Yongheng1, Hu Sisi1, Li Zhenzhen1
1Wuhan Library, Chinese Academy of Sciences, Wuhan 430071, China
2The 9th Designing of China Aerospace Science Industry Corporation, Wuhan 430040, China
Abstract  

[Objective] This study aims to store yearbook records as structured data that can be updated regularly. [Context] The yearbook data pre-processing system is a client/server (C/S) tool platform for collecting, auditing, and uploading data. It was developed in VC++ and generates content for the yearbook database. [Methods] We first modified the classic Wu-Manber (WM) algorithm to build a new bidirectional pattern matching model. With the help of word segmentation, the new model extracts metadata from the original records. We then reduced the number of pattern sets through a data storing procedure and matched the records bidirectionally to ensure the effectiveness and efficiency of the system. [Results] The proposed algorithm achieved a high matching rate and accuracy. [Conclusions] The bidirectional matching algorithm meets the needs of yearbook data entry and improves the efficiency of the data pre-processing system.
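
For readers unfamiliar with the WM algorithm named in the Methods, the following is a minimal Python sketch of the classic Wu-Manber multi-pattern search with a shift table and a hash table. It is an illustration only: the abstract does not describe the authors' bidirectional modifications, so they are not reproduced here, and the function names (`build_tables`, `wm_search`), the block size of 2, and the sample field names are assumptions made for this example.

```python
def build_tables(patterns, block=2):
    """Build the shift and hash tables of classic Wu-Manber (illustrative sketch)."""
    m = min(len(p) for p in patterns)          # only the first m chars of each pattern are indexed
    if m < block:
        raise ValueError("every pattern must be at least `block` characters long")
    default_shift = m - block + 1              # safe shift for a block found in no pattern
    shift, buckets = {}, {}
    for p in patterns:
        for end in range(block, m + 1):
            key = p[end - block:end]
            # distance from this block to the end of the m-character prefix
            shift[key] = min(shift.get(key, default_shift), m - end)
        # patterns grouped by the last block of their m-character prefix
        buckets.setdefault(p[m - block:m], []).append(p)
    return m, default_shift, shift, buckets


def wm_search(text, patterns, block=2):
    """Return (start_index, pattern) for every occurrence of any pattern in text."""
    m, default_shift, shift, buckets = build_tables(patterns, block)
    hits = []
    pos = m                                    # exclusive end of the current window
    while pos <= len(text):
        key = text[pos - block:pos]
        step = shift.get(key, default_shift)
        if step > 0:
            pos += step                        # no pattern prefix can end here; skip ahead
            continue
        start = pos - m
        for p in buckets.get(key, []):         # verify only the candidates in this bucket
            if text.startswith(p, start):
                hits.append((start, p))
        pos += 1
    return hits


# Hypothetical yearbook-style record and field names, for illustration only.
record = "GDP of Wuhan in 2015; population of Wuhan"
fields = ["GDP", "population", "Wuhan"]
matches = wm_search(record, fields)
```

The shift table lets the scan skip windows that cannot end a pattern prefix, which is why a smaller pattern set (as the Methods mention) tends to yield larger average shifts.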

Key words: Bidirectional pattern matching; Yearbook data; WM algorithm
Received: 09 March 2016      Published: 19 October 2016

Cite this article:

Shi Liting, Zhang Qian, Zhong Yongheng, Hu Sisi, Li Zhenzhen. Using Bidirectional Pattern Matching Model to Pre-Process Yearbook Data. New Technology of Library and Information Service, 2016, 32(9): 88-94.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2016.09.11     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2016/V32/I9/88

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn