[1] |
常宝宝, 俞士汶. 语料库技术及其应用[J]. 外语研究, 2009 (5): 43-51.
|
[1] |
(Chang Baobao, Yu Shiwen. Corpus Technology and Its Application[J]. Foreign Languages Research, 2009 (5): 43-51.)
|
[2] |
梁茂成. 什么是语料库语言学[M]. 上海: 上海外语教育出版社, 2016.
|
[2] |
(Liang Maocheng. What is Corpus Linguistics?[M]. Shanghai: Shanghai Foreign Language Education Press, 2016.)
|
[3] |
王克非. 语料库翻译学探索[M]. 上海: 上海交通大学出版社, 2012.
|
[3] |
(Wang Kefei. Exploring Corpus-based Translation Studies[M]. Shanghai: Shanghai Jiao Tong University Press, 2012.)
|
[4] |
Zanettin F. Translation-Driven Corpora: Corpus Resources for Descriptive and Applied Translation Studies[M]. London: Routledge, 2014.
|
[5] |
Koehn P. Neural Machine Translation[M]. Cambridge: Cambridge University Press, 2020.
|
[6] |
李晓倩, 胡开宝. 《习近平谈治国理政》多语平行语料库的建设与应用[J]. 外语电化教学, 2021 (3): 83-88, 13.
|
[6] |
(Li Xiaoqian, Hu Kaibao. The Multilingual Parallel Corpus of Xi Jinping: The Governance of China: Compilation and Applications[J]. Technology Enhanced Foreign Language Education, 2021 (3): 83-88, 13.)
|
[7] |
梁继文, 江川, 王东波. 基于多特征融合的先秦典籍汉英句子对齐研究[J]. 数据分析与知识发现, 2020, 4(9): 123-132.
|
[7] |
(Liang Jiwen, Jiang Chuan, Wang Dongbo. Chinese-English Sentence Alignment of Ancient Literature Based on Multi-feature Fusion[J]. Data Analysis and Knowledge Discovery, 2020, 4(9): 123-132.)
|
[8] |
王克非. 以汉语为中心语的多语汉外平行语料库集群的研制与应用[J]. 外语教学, 2022, 43(6): 1-7.
|
[8] |
(Wang Kefei. Development and Application of a Multilingual Sino-Foreign Parallel Corpora Group with Chinese as the Pivot Language[J]. Foreign Language Education, 2022, 43(6): 1-7.)
|
[9] |
Goyal N, Gao C, Chaudhary V, et al. The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation[J]. Transactions of the Association for Computational Linguistics, 2022, 10: 522-538.
|
[10] |
Simard M. Building and Using Parallel Text for Translation[M]// The Routledge Handbook of Translation and Technology. London: Routledge, 2019: 78-90.
|
[11] |
Frankenberg-Garcia A. A Corpus Study of Splitting and Joining Sentences in Translation[J]. Corpora, 2019, 14(1): 1-30.
|
[12] |
黄佳跃, 熊德意. 句对齐研究综述[J]. 中文信息学报, 2021, 35(8): 16-27.
|
[12] |
(Huang Jiayue, Xiong Deyi. A Survey of Sentence Alignment[J]. Journal of Chinese Information Processing, 2021, 35(8): 16-27.)
|
[13] |
刘文斌, 何彦青, 吴振峰, 等. 基于BERT和多相似度融合的句子对齐方法研究[J]. 数据分析与知识发现, 2021, 5(7): 48-58.
|
[13] |
(Liu Wenbin, He Yanqing, Wu Zhenfeng, et al. Sentence Alignment Method Based on BERT and Multi-similarity Fusion[J]. Data Analysis and Knowledge Discovery, 2021, 5(7): 48-58.)
|
[14] |
Gale W A, Church K W. A Program for Aligning Sentences in Bilingual Corpora[J]. Computational Linguistics, 1993, 19(1): 75-102.
|
[15] |
Indurkhya N, Damerau F J. Handbook of Natural Language Processing[M]. The 2nd Edition. Boca Raton: CRC Press, 2010: 367-408.
|
[16] |
熊文新. 英汉环保领域平行语料的句对齐与再对齐[J]. 现代图书情报技术, 2013 (6): 36-41.
|
[16] |
(Xiong Wenxin. Sentence Alignment and Re-Alignment for Environmental Protection Texts in English-Chinese Parallel Corpus[J]. New Technology of Library and Information Service, 2013(6): 36-41.)
|
[17] |
Varga D, Halácsy P, Kornai A, et al. Parallel Corpora for Medium Density Languages[M]// Recent Advances in Natural Language Processing IV. Amsterdam: John Benjamins Publishing Company, 2007: 247-258.
|
[18] |
Sennrich R, Volk M. MT-Based Sentence Alignment for OCR-Generated Parallel Texts[C]// Proceedings of the 9th Conference of the Association for Machine Translation in the Americas:Research Papers. 2010.
|
[19] |
Ziemski M, Junczys-Dowmunt M, Pouliquen B. The United Nations Parallel Corpus v1.0[C]// Proceedings of the 10th International Conference on Language Resources and Evaluation. 2016: 3530-3534.
|
[20] |
Esplà-Gomis M, Forcada M L, Ramírez-Sánchez G, et al. ParaCrawl: Web-Scale Parallel Corpora for the Languages of the EU[C]// Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks. 2019: 118-119.
|
[21] |
Thompson B, Koehn P. Vecalign: Improved Sentence Alignment in Linear Time and Space[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 1342-1348.
|
[22] |
Artetxe M, Schwenk H. Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond[J]. Transactions of the Association for Computational Linguistics, 2019, 7: 597-610.
|
[23] |
Johnson J, Douze M, Jégou H. Billion-Scale Similarity Search with GPUs[J]. IEEE Transactions on Big Data, 2021, 7(3): 535-547.
|
[24] |
Zamani H, Faili H, Shakery A. Sentence Alignment Using Local and Global Information[J]. Computer Speech & Language, 2016, 39: 88-107.
|
[25] |
肖桐, 朱靖波. 机器翻译:基础与模型[M]. 北京: 电子工业出版社, 2021.
|
[25] |
(Xiao Tong, Zhu Jingbo. Machine Translation: Foundations and Models[M]. Beijing: Publishing House of Electronics Industry, 2021.)
|
[26] |
Kocmi T, Federmann C, Grundkiewicz R, et al. To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation[C]// Proceedings of the 6th Conference on Machine Translation. 2021: 478-494.
|
[27] |
Freitag M, Rei R, Mathur N, et al. Results of WMT22 Metrics Shared Task: Stop Using BLEU - Neural Metrics Are Better and More Robust[C]// Proceedings of the 7th Conference on Machine Translation. 2022: 46-68.
|
[28] |
Popović M. chrF: Character n-gram F-Score for Automatic MT Evaluation[C]// Proceedings of the 10th Workshop on Statistical Machine Translation. 2015: 392-395.
|
[29] |
Popović M. chrF++: Words Helping Character n-Grams[C]// Proceedings of the 2nd Conference on Machine Translation. 2017: 612-618.
|
[30] |
Rei R, Stewart C, Farinha A C, et al. COMET: A Neural Framework for MT Evaluation[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020: 2685-2702.
|
[31] |
Lample G, Conneau A. Cross-Lingual Language Model Pretraining[OL]. arXiv Preprint, arXiv: 1901.07291.
|
[32] |
Vondřička P. Aligning Parallel Texts with InterText[C]// Proceedings of the 9th International Conference on Language Resources and Evaluation. 2014: 1875-1879.
|
[33] |
Klein G, Hernandez F, Nguyen V, et al. The OpenNMT Neural Machine Translation Toolkit: 2020 Edition[C]// Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 1:Research Track). 2020: 102-109.
|
[34] |
Feng F X Y, Yang Y F, Cer D, et al. Language-Agnostic BERT Sentence Embedding[C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers). 2022: 878-891.
|
[35] |
Wolf T, Debut L, Sanh V, et al. Transformers: State-of-the-Art Natural Language Processing[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing:System Demonstrations. 2020: 38-45.
|
[36] |
Xu Y, Max A, Yvon F. Sentence Alignment for Literary Texts: The State-of-the-Art and Beyond[J]. Linguistic Issues in Language Technology, 2015, 12(6): 1-29.
|
[37] |
Graën J. Exploiting Alignment in Multiparallel Corpora for Applications in Linguistics and Language Learning[D]. Zurich: University of Zurich, 2018.
|
[38] |
Tiedemann J. Bitext Alignment[M]. San Rafael, CA: Morgan & Claypool, 2011.
|
[39] |
Khayrallah H, Koehn P. On the Impact of Various Types of Noise on Neural Machine Translation[C]// Proceedings of the 2nd Workshop on Neural Machine Translation and Generation. 2018: 74-83.
|
[40] |
Herold C, Rosendahl J, Vanvinckenroye J, et al. Detecting Various Types of Noise for Neural Machine Translation[C]// Findings of the Association for Computational Linguistics:ACL 2022. 2022: 2542-2551.
|