New Technology of Library and Information Service  2007, Vol. 2 Issue (7): 50-53    DOI: 10.11925/infotech.1003-3513.2007.07.12
Nested Vector Segmentation Technique in Knowledge Extraction
Hua Bolin   Zhao Liang
(Institute of Scientific and Technical Information of China, Beijing 100038, China)
Abstract  

The well-known maximum matching method is implemented for the knowledge extraction process, and conclusions are drawn about the critical techniques of vector segmentation. Because scanning the text only once has shortcomings, nested vector segmentation is designed and implemented. Experiments show that applying nested vector segmentation to knowledge extraction improves precision and recall, fundamentally resolves the word-in-word problem, and makes the subsequent syntactic analysis more convenient.
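
The idea summarized above can be illustrated with a minimal sketch. It assumes a forward variant of maximum matching and interprets "nested" segmentation as a second scan inside each matched token that records shorter lexicon words contained in it (the word-in-word case). The function names, the toy lexicon, and the `max_len` parameter are hypothetical and are not taken from the paper, whose actual algorithm may differ.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest lexicon word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def nested_segments(token, lexicon, max_len=4):
    """Second scan (assumed): list shorter lexicon words inside token as (offset, word)."""
    inner = []
    for start in range(len(token)):
        for size in range(min(max_len, len(token) - start), 1, -1):
            word = token[start:start + size]
            if word != token and word in lexicon:
                inner.append((start, word))
    return inner


def nested_vector_segmentation(text, lexicon, max_len=4):
    """Pair every top-level token with the lexicon words nested inside it."""
    return [
        (token, nested_segments(token, lexicon, max_len))
        for token in forward_max_match(text, lexicon, max_len)
    ]


# Toy lexicon (hypothetical): "知识抽取" contains the shorter words "知识" and "抽取".
lexicon = {"知识", "抽取", "知识抽取", "技术"}
result = nested_vector_segmentation("知识抽取技术", lexicon)
```

In this sketch a single scan would return only the long token "知识抽取", while the nested scan also keeps "知识" and "抽取", so a later stage can choose whichever granularity it needs.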

Key words: Knowledge extraction; Maximum matching method; Lexical analysis; Segmenting technique; Nested vector segmentation
Received: 11 May 2007      Published: 25 July 2007
ZTFLH: TP391; G356
Corresponding Authors: Hua Bolin     E-mail: huabolin@istic.ac.cn
About author: Hua Bolin, Zhao Liang

Cite this article:

Hua Bolin,Zhao Liang. Nested Vector Segmentation Technique in Knowledge Extraction. New Technology of Library and Information Service, 2007, 2(7): 50-53.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2007.07.12     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2007/V2/I7/50
