Sham Battle Information Extraction Based on Pattern Matching
Jia Meiying1,3 Yang Bingru1 Zheng Dequan2,3 Cao Hongqiang3 Yang Jing2 Zhang Lian2
1(School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China)
2(MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China)
3(Beijing Graphic Institution, Beijing 100029, China)
This paper addresses the extraction of sham battle (military exercise) intelligence information using a method based on pattern matching. A different technique is applied at each step of information extraction: a hierarchical text categorization method filters the target texts; seed pattern bootstrapping, together with a domain dictionary, recognizes the sham battle blocks; and a corpus-based method learns and acquires the event patterns. The experimental results show that this method is effective in this specialized application domain and is usable in real projects.
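The seed pattern bootstrapping step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `bootstrap`, the toy English sentences, the seed regular expression, and the rule that accepts a new context only when it contains a dictionary term are all assumptions made for illustration.

```python
import re


def bootstrap(sentences, seed_patterns, domain_terms, rounds=3):
    """Grow a pattern set from seeds; return (patterns, extracted names)."""
    patterns = set(seed_patterns)
    names = set()
    for _ in range(rounds):
        # Step 1: apply the current patterns to harvest candidate names.
        for p in patterns:
            for s in sentences:
                names.update(m.group(1) for m in re.finditer(p, s))
        # Step 2: propose new patterns from the two words before each name.
        new_patterns = set()
        for name in names:
            ctx = re.compile(r"(\w+) (\w+) " + re.escape(name) + r"\b")
            for s in sentences:
                for m in ctx.finditer(s):
                    w1, w2 = m.group(1), m.group(2)
                    # Assumed filter: keep a context only if the domain
                    # dictionary contains one of its words.
                    if w1.lower() in domain_terms or w2.lower() in domain_terms:
                        new_patterns.add(
                            re.escape(w1) + " " + re.escape(w2) + r" (\w+)"
                        )
        # Stop early once no unseen pattern is proposed.
        if new_patterns <= patterns:
            break
        patterns |= new_patterns
    return patterns, names


# Hypothetical toy corpus, seed pattern, and domain dictionary.
sentences = [
    "The navy began the exercise code-named Falcon on Monday.",
    "Troops joined the drill Falcon near the coast.",
    "Observers watched the drill Harbor from a ship.",
]
seeds = [r"code-named (\w+)"]
domain_terms = {"drill", "exercise", "maneuver"}

patterns, names = bootstrap(sentences, seeds, domain_terms)
print(sorted(names))  # ['Falcon', 'Harbor']
```

In this toy run the seed finds "Falcon", the context "the drill" is learned from the second sentence because "drill" is in the dictionary, and the learned pattern then recovers "Harbor", which the seed alone would miss.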
Jia Meiying, Yang Bingru, Zheng Dequan, Cao Hongqiang, Yang Jing, Zhang Lian. Sham Battle Information Extraction Based on Pattern Matching[J]. New Technology of Library and Information Service (现代图书情报技术), 2009, (9): 70-75.