The Research and Analysis on Automatic Extraction of Science and Technology Literature Terms

doi:10.11925/infotech.1003-3513.2014.01.08

New Technology of Library and Information Service

2014, Vol. 30, Issue (1): 51-55

KNOWLEDGE ORGANIZATION AND KNOWLEDGE MANAGEMENT

Zeng Wen, Xu Shuo, Zhang Yunliang, Zhai Juanhua

Institute of Scientific & Technical Information of China, Beijing 100038, China

Abstract [Objective] Extracting science and technology terms is a basic research problem for improving the efficiency of organizing and retrieving information in science and technology literature. [Methods] The paper proposes an automatic extraction method based on the characteristics of science and technology terms and on statistical computation. The method combines the linguistic characteristics of terms with statistical information about them, such as the combination strength between words and the positions at which a term appears in the literature. [Results] Experimental results show that the average accuracy of term extraction can reach 51.2%. [Limitations] The statistical algorithm and the data processing both need further improvement, in the algorithm itself and in the quality of the data. [Conclusions] The proposed method is effective.
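The abstract names two statistical signals, the combination strength between words and the position of a term in the document, but gives no formulas. The following is a minimal sketch of how such signals might be combined, not the authors' algorithm. It assumes pointwise mutual information (PMI) as a stand-in for combination strength, and the `POSITION_WEIGHT` values, the `score_candidates` function, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical position weights: the source only says that position in the
# literature is used, not how it is weighted.
POSITION_WEIGHT = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def score_candidates(segments, min_count=2):
    """Rank adjacent word pairs by PMI multiplied by a position weight.

    segments: list of (position_label, list_of_tokens) pairs.
    Returns a list of ((left, right), score) sorted by descending score.
    """
    unigrams = Counter()
    bigrams = Counter()
    best_weight = {}
    for label, tokens in segments:
        weight = POSITION_WEIGHT.get(label, 1.0)
        unigrams.update(tokens)
        for pair in zip(tokens, tokens[1:]):
            bigrams[pair] += 1
            # Keep the most prominent position in which the pair was seen.
            best_weight[pair] = max(best_weight.get(pair, 0.0), weight)

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for pair, count in bigrams.items():
        if count < min_count:
            continue  # too rare for a stable estimate
        p_pair = count / total_bigrams
        p_left = unigrams[pair[0]] / total_unigrams
        p_right = unigrams[pair[1]] / total_unigrams
        # PMI is the assumed measure of combination strength.
        pmi = math.log2(p_pair / (p_left * p_right))
        scores[pair] = pmi * best_weight[pair]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: pre-tokenized text labelled by where it appears.
segments = [
    ("title", ["automatic", "term", "extraction"]),
    ("abstract", ["term", "extraction", "uses", "statistics"]),
    ("body", ["we", "evaluate", "term", "extraction", "on", "abstracts"]),
]
ranked = score_candidates(segments)
```

On this toy input only the pair ("term", "extraction") occurs often enough to pass the frequency threshold, and it receives the title weight because it appears there at least once. A fuller system would also apply the linguistic filters the abstract mentions, which this sketch leaves out.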

Key words: Technical term; Term characteristic; Statistical calculation; Automatic extraction

Received: 14 February 2014 Published: 14 February 2014

TP391

Cite this article:

Zeng Wen, Xu Shuo, Zhang Yunliang, Zhai Juanhua. The Research and Analysis on Automatic Extraction of Science and Technology Literature Terms. New Technology of Library and Information Service, 2014, 30(1): 51-55.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2014.01.08 OR https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2014/V30/I1/51
