New Technology of Library and Information Service  2013, (6): 68-75    DOI: 10.11925/infotech.1003-3513.2013.06.11
Section: Information Analysis and Research
Extracting Information Method Term from Chinese Academic Literature
Hua Bolin
School of Information Management, Nanjing University, Nanjing 210093, China; Institute of Scientific & Technical Information of China, Beijing 100038, China
Abstract: This paper uses rules to identify method-describing sentences in academic literature, then extracts method terms from those sentences by combining a lexicon with rules, and merges synonymous terms to form a term base of information methods. The experiment uses the full text of papers published in 2012 in the Journal of the China Society for Scientific and Technical Information. A statistical analysis of the extracted terms is used to build a system of information methods, and the results indicate that the approach is effective for term extraction.
Keywords: Knowledge extraction; Academic literature; Information method; Terminology extraction
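The three steps named in the abstract (rule-based selection of method sentences, lexicon-plus-rule term extraction, and synonym merging) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the cue words, the seed lexicon, the suffix rule, the synonym table, and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical cue words that mark a "method sentence" (assumed, not the paper's rules).
METHOD_CUES = re.compile(r"(?:采用|运用|利用|基于|通过)")

# Hypothetical seed lexicon of known method terms.
LEXICON = ["共词分析", "引文分析", "聚类分析", "内容分析法"]

# Hypothetical rule: a cue word followed by a short run of Chinese characters
# that ends in a method-like suffix; the run plus suffix is the candidate term.
SUFFIX_RULE = re.compile(
    r"(?:采用|运用|利用|基于|通过)"
    r"([\u4e00-\u9fff]{2,8}?(?:方法|分析法|分析|算法|模型))"
)

# Hypothetical synonym table mapping variants to one preferred form.
SYNONYMS = {"内容分析": "内容分析法", "共词分析法": "共词分析"}


def split_sentences(text):
    """Split text on Chinese sentence-ending punctuation."""
    return [s for s in re.split(r"[。！？；]", text) if s.strip()]


def is_method_sentence(sentence):
    """Step 1: keep only sentences that contain a method cue word."""
    return METHOD_CUES.search(sentence) is not None


def extract_terms(sentence):
    """Step 2: combine lexicon lookup with the suffix rule."""
    terms = [term for term in LEXICON if term in sentence]
    terms += SUFFIX_RULE.findall(sentence)
    return terms


def normalize(term):
    """Step 3: merge synonymous variants into the preferred form."""
    return SYNONYMS.get(term, term)


def build_term_base(text):
    """Count each normalized term once per method sentence."""
    counts = Counter()
    for sentence in split_sentences(text):
        if is_method_sentence(sentence):
            counts.update({normalize(t) for t in extract_terms(sentence)})
    return counts


if __name__ == "__main__":
    sample = "本文采用共词分析法对文献进行研究。实验结果较好。作者运用内容分析对样本编码。"
    print(build_term_base(sample))
```

In this sketch the lexicon and the suffix rule can return variants of the same term from one sentence, which is why normalization happens before counting. A real system would need far richer rules and a curated lexicon than the handful shown here.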
Received: 2013-05-05
CLC number: G35
Corresponding author: Hua Bolin     E-mail: huabolin@istic.ac.cn
Cite this article:
Hua Bolin. Extracting Information Method Term from Chinese Academic Literature[J]. New Technology of Library and Information Service, 2013, (6): 68-75. DOI: 10.11925/infotech.1003-3513.2013.06.11.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2013.06.11
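Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing; Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn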