New Technology of Library and Information Service  2012, Vol. 28 Issue (2): 28-33    DOI: 10.11925/infotech.1003-3513.2012.02.05
Chinese-English Comparable Corpus Construction for Bilingual Terminology Extraction
Kang Xiaoli1, Zhang Chengzhi2
1. Library of Nanchang University, Nanchang 330031, China;
2. Department of Information Management, Nanjing University of Science and Technology, Nanjing 210094, China
Abstract  This paper designs a process for building a domain-specific comparable corpus for bilingual terminology extraction. First, a bilingual sample corpus in the target domain is collected, and keywords are extracted from it using a word co-occurrence method. Then, these keywords are submitted as queries to a scholarly search engine, and the returned results form the candidate comparable corpus. Finally, the domain-specific comparable corpus is obtained by filtering out noise documents through quantitative evaluation.
Key words: Comparable corpus; Corpus construction; Bilingual terminology extraction
Received: 04 January 2012      Published: 23 March 2012
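
The pipeline outlined in the abstract (co-occurrence keyword extraction, then filtering of candidate documents) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the co-occurrence measure or the quantitative evaluation criterion, so the degree-based scoring, the keyword-coverage score, the 0.5 threshold, the stopword list, and all function names below are assumptions chosen for illustration. The search engine query step is omitted, and only whitespace-delimited English text is handled (Chinese text would first need word segmentation).

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "for", "to", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cooccurrence_keywords(sentences, top_k=5):
    """Rank words by how often they co-occur with other words in a sentence.

    Assumption: the score of a word is its co-occurrence degree, i.e. the
    number of (word, other word) pairs it takes part in across all sentences.
    """
    degree = Counter()
    for sentence in sentences:
        words = sorted(set(tokenize(sentence)))
        for a, b in combinations(words, 2):
            degree[a] += 1
            degree[b] += 1
    return [word for word, _ in degree.most_common(top_k)]


def keyword_coverage(document, keywords):
    """Fraction of the keywords that appear in the document."""
    if not keywords:
        return 0.0
    words = set(tokenize(document))
    return sum(1 for k in keywords if k in words) / len(keywords)


def filter_candidates(candidates, keywords, threshold=0.5):
    """Keep candidate documents whose keyword coverage reaches the threshold."""
    return [d for d in candidates if keyword_coverage(d, keywords) >= threshold]


if __name__ == "__main__":
    sample = [
        "corpus construction uses comparable corpus",
        "comparable corpus supports terminology extraction",
    ]
    keywords = cooccurrence_keywords(sample, top_k=2)
    candidates = [
        "a comparable corpus for translation",
        "cooking recipes for dinner",
    ]
    print(keywords)
    print(filter_candidates(candidates, keywords))
```

In this sketch a document is treated as noise when it contains fewer than half of the extracted keywords; a real evaluation would likely use a more refined similarity measure between the candidate and the sample corpus.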



