New Technology of Library and Information Service  2013, Vol. 29 Issue (9): 23-29    DOI: 10.11925/infotech.1003-3513.2013.09.04
Decoding Optimization in Tree Transducer based Translation Model
Shi Chongde, Qiao Xiaodong, Wang Huilin
Institute of Scientific & Technical Information of China, Beijing 100038, China
Abstract  This paper proposes two methods to improve the efficiency of rule binarization and decoding in the tree transducer based translation model. The authors convert synchronous transducer rules into four kinds of binary rules to reduce the number of temporary items, and propose the RR-CKY decoding algorithm, which avoids generating some of the redundant items during decoding. Experiments show that both methods reduce the number of temporary items and speed up decoding, and that they also improve machine translation quality.
Key words: Machine translation; Tree transducer based translation model; Parsing; RR-CKY algorithm
Received: 19 June 2013      Published: 27 September 2013
CLC Number: TP391.2
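The binarization step mentioned in the abstract can be illustrated with a minimal sketch. It shows a generic left-to-right binarization of a single monolingual n-ary rule, not the authors' conversion into four kinds of binary rules, which the abstract does not specify. The function name `binarize` and the `V[...]` naming of virtual nonterminals are assumptions made for illustration. Each virtual nonterminal stands for a partially matched right-hand side; the "temporary items" in the abstract are presumably chart items of this kind, which is also an assumption.

```python
from typing import List, Tuple

# A rule is (left-hand side, right-hand side symbols).
Rule = Tuple[str, Tuple[str, ...]]


def binarize(lhs: str, rhs: List[str]) -> List[Rule]:
    """Left-to-right binarization of one n-ary rule (illustrative only).

    Rules with at most two right-hand-side symbols are returned unchanged.
    Longer rules are split into a chain of binary rules linked by
    hypothetical virtual nonterminals such as "V[B+C]".
    """
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]

    rules: List[Rule] = []
    left = rhs[0]
    # Fold in one symbol at a time, stopping before the last symbol.
    for i in range(1, len(rhs) - 1):
        virtual = "V[" + "+".join(rhs[: i + 1]) + "]"
        rules.append((virtual, (left, rhs[i])))
        left = virtual
    # The final binary rule restores the original left-hand side.
    rules.append((lhs, (left, rhs[-1])))
    return rules


if __name__ == "__main__":
    for head, body in binarize("A", ["B", "C", "D", "E"]):
        print(head, "->", " ".join(body))
```

Under this scheme a rule with four right-hand-side symbols becomes three binary rules and introduces two virtual nonterminals. Every virtual nonterminal can give rise to intermediate chart items in CKY-style decoding, which is why the choice of binarization affects decoding cost.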

Cite this article:

Shi Chongde, Qiao Xiaodong, Wang Huilin. Decoding Optimization in Tree Transducer based Translation Model. New Technology of Library and Information Service, 2013, 29(9): 23-29.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2013.09.04     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2013/V29/I9/23
