New Technology of Library and Information Service  2009, Vol. Issue (9): 70-75    DOI: 10.11925/infotech.1003-3513.2009.09.12
Sham Battle Information Extraction Based on Pattern Matching
Jia Meiying1,3, Yang Bingru, Zheng Dequan2,3, Cao Hongqiang, Yang Jing, Zhang Lian2
1(School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China)
2(MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China)
3(Beijing Graphic Institution, Beijing 100029, China)

This paper addresses sham battle intelligence information and extracts it with a method based on pattern matching. Each step of the extraction uses a different technique: a hierarchical text categorization method filters the target texts; seed pattern bootstrapping, combined with a domain dictionary, recognizes the sham battle blocks; and a corpus-based method learns and acquires the event patterns. The experimental results show that the method is effective in this specialized application field and is usable in real projects.
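The seed pattern bootstrapping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionary entries, the trigger word, the example sentences, and the function names `build_pattern`, `extract`, and `bootstrap` are all hypothetical and chosen only for illustration. The sketch assumes an event pattern has the form "unit, trigger, unit", where the units come from a small domain dictionary.

```python
import re

# Hypothetical domain dictionary (an assumption, not taken from the paper).
UNIT_TERMS = {"Red Team", "Blue Team"}

# Hypothetical toy corpus.
SENTENCES = [
    "Red Team attacked Blue Team at dawn.",
    "Red Team ambushed Blue Team near the bridge.",
    "Blue Team repelled Red Team by noon.",
]


def build_pattern(trigger):
    """Compile the pattern '<unit> <trigger> <unit>' for one trigger word."""
    units = "|".join(re.escape(u) for u in sorted(UNIT_TERMS))
    return re.compile(
        rf"(?P<agent>{units}) {re.escape(trigger)} (?P<target>{units})"
    )


def extract(sentences, triggers):
    """Return (agent, trigger, target) tuples matched by the current patterns."""
    events = []
    for sentence in sentences:
        for trigger in sorted(triggers):
            match = build_pattern(trigger).search(sentence)
            if match:
                events.append((match.group("agent"), trigger, match.group("target")))
    return events


def bootstrap(sentences, seed_triggers, rounds=2):
    """Grow the trigger set from the seeds.

    Each round takes the (agent, target) pairs found so far and adopts any
    single word that links the same pair in another sentence as a new trigger.
    """
    triggers = set(seed_triggers)
    for _ in range(rounds):
        pairs = {(agent, target) for agent, _, target in extract(sentences, triggers)}
        found = set()
        for sentence in sentences:
            for agent, target in pairs:
                link = re.search(
                    rf"{re.escape(agent)} (\w+) {re.escape(target)}", sentence
                )
                if link:
                    found.add(link.group(1))
        if found <= triggers:  # nothing new was learned, so stop early
            break
        triggers |= found
    return triggers


learned = bootstrap(SENTENCES, {"attacked"})
print(sorted(learned))
print(extract(SENTENCES, learned))
```

Starting from the single seed "attacked", the sketch also learns "ambushed", because both words link the same pair of units. It does not learn "repelled", whose arguments appear in the reverse order. A real system would need ways to score or filter candidate patterns that this toy version omits.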

Key words: Information Extraction; Sham battle; Pattern Matching; Semi-automatic Studying; Event Information Extraction
Received: 12 June 2009      Published: 25 September 2009


Corresponding Author: Jia Meiying
About authors: Jia Meiying, Yang Bingru, Zheng Dequan, Cao Hongqiang, Yang Jing, Zhang Lian

Cite this article:

Jia Meiying, Yang Bingru, Zheng Dequan, Cao Hongqiang, Yang Jing, Zhang Lian. Sham Battle Information Extraction Based on Pattern Matching. New Technology of Library and Information Service, 2009, (9): 70-75.


