Sham Battle Information Extraction Based on Pattern Matching
Jia Meiying1,3 Yang Bingru1 Zheng Dequan2,3 Cao Hongqiang3 Yang Jing2 Zhang Lian2
1(School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China)
2(MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China)
3(Beijing Graphic Institution, Beijing 100029, China)
Abstract This paper addresses the extraction of sham battle intelligence information using a method based on pattern matching. A different technique is applied at each step of the extraction: a hierarchical text categorization method filters the target texts; seed pattern bootstrapping, together with a domain dictionary, recognizes the sham battle blocks; and a corpus-based method learns and acquires the event patterns. The experimental results show that the method is effective in this specialized application domain and is usable in real projects.
Received: 12 June 2009
Published: 25 September 2009
Corresponding Authors:
Jia Meiying
E-mail: jmy100@tom.com
About authors: Jia Meiying, Yang Bingru, Zheng Dequan, Cao Hongqiang, Yang Jing, Zhang Lian