This paper introduces finite-state transducer technology and, drawing on the design approach of the Penn Treebank, applies it to simplifying complicated English sentences. Rule analysis is combined with the results of POS tagging, recognition of discourse connectives and punctuation, vocabulary mapping, and chunking to simplify each complicated sentence. The final results are expressed in the form of propositions.
申春艳,王惠临. 基于规则的英语复句关联词自动标注技术*[J]. 现代图书情报技术, 2008, 24(3): 40-44.
Shen Chunyan, Wang Huilin. Rule-based Automatic Annotating for the Discourse of English Complicated Sentences[J]. New Technology of Library and Information Service, 2008, 24(3): 40-44.
[1] Wei-Chuan L, Tzusheng P, Bing-Huang L, et al. Parsing Long English Sentences with Pattern Rules[C]. In: Proc. of COLING-90, 1990: 410-412.
[2] Alexander L G. Longman English Grammar[M]. Lei Hang, Gan Meihua, Tian Luyi, et al. (trans.). Beijing: Foreign Language Teaching and Research Press, 1991: 21.
[3] Marcus M, Santorini B, Marcinkiewicz M A. Building a Large Annotated Corpus of English: the Penn Treebank[J]. Computational Linguistics, 1993, 19(2): 313-330.
[4] Palmer M, Gildea D, Kingsbury P. The Proposition Bank: An Annotated Corpus of Semantic Roles[J]. Computational Linguistics, 2005, 31(1): 71-106.
[5] Miltsakaki E. Annotating Discourse Connectives and Their Arguments[J]. Association for Computational Linguistics, 2004, 147(2): 9-16.
[6] Abney S. Partial Parsing via Finite-state Cascades[J]. Language and Information, 1996, 2(4): 337-344.