New Technology of Library and Information Service  2010, Vol. 26 Issue (10): 59-64    DOI: 10.11925/infotech.1003-3513.2010.10.10
Application on Information Extraction from Factual Information Based on Conditional Random Fields Method
Wu Shuai
China Defense Science & Technology Information Center, Beijing 100142, China

This paper proposes a method based on Conditional Random Fields (CRFs) for extracting information from unstructured factual text, and analyzes the associated parameter estimation and feature selection methods. During extraction, the text is first divided into blocks using format cues such as separators and special identifiers, and the designated blocks are then processed with CRFs. The method is applied in the Global Weapon Knowledge Base System (GWKBS), and experimental results show that it achieves better precision and recall.
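The two-stage procedure described above (block the text by format cues, then label the designated block with a linear-chain CRF) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the `== Name ==` header convention, the label set, the feature templates, and the hand-set weights in `STATE_W` and `TRANS_W` are all assumptions made for the example. In a real system the weights would come from parameter estimation on annotated data, and the features from feature selection.

```python
import re
from typing import Dict, List

# Assumed block layout: a line such as "== Specs ==" opens a new block.
HEADER = re.compile(r"^==\s*(.+?)\s*==$")

def split_blocks(text: str) -> Dict[str, str]:
    """Stage 1: cut the text into named blocks at header lines."""
    blocks: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            blocks[current] = []
        elif current is not None and line:
            blocks[current].append(line)
    return {name: " ".join(lines) for name, lines in blocks.items()}

def token_features(tokens: List[str], i: int) -> List[str]:
    """Observation features for token i (illustrative templates only)."""
    tok = tokens[i]
    feats = ["bias", "lower=" + tok.lower()]
    if tok[0].isdigit():
        feats.append("starts_with_digit")
    if i > 0:
        feats.append("prev=" + tokens[i - 1].lower())
    return feats

LABELS = ["O", "VALUE", "UNIT"]  # hypothetical label set
# Hand-set toy weights standing in for trained CRF parameters.
STATE_W = {("bias", "O"): 0.5,
           ("starts_with_digit", "VALUE"): 2.0,
           ("lower=km", "UNIT"): 2.0}
TRANS_W = {("VALUE", "UNIT"): 1.0}

def state_score(feats: List[str], label: str) -> float:
    return sum(STATE_W.get((f, label), 0.0) for f in feats)

def viterbi(tokens: List[str]) -> List[str]:
    """Stage 2: highest-scoring label sequence under the toy weights."""
    if not tokens:
        return []
    best = [{y: state_score(token_features(tokens, 0), y) for y in LABELS}]
    back = []
    for i in range(1, len(tokens)):
        feats = token_features(tokens, i)
        row, ptr = {}, {}
        for y in LABELS:
            # Best previous label for reaching y at position i.
            prev = max(LABELS, key=lambda p: best[-1][p] + TRANS_W.get((p, y), 0.0))
            row[y] = best[-1][prev] + TRANS_W.get((prev, y), 0.0) + state_score(feats, y)
            ptr[y] = prev
        best.append(row)
        back.append(ptr)
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

For example, `split_blocks` can be called on a record containing an `== Specs ==` block, and `viterbi` can then be run on the whitespace-split tokens of that block only. Restricting the CRF to the designated block keeps the label set and feature templates small, which is presumably the motivation for blocking first.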

Key words: Information extraction; Conditional random fields; Parameter estimation; Feature selection
Received: 11 March 2010      Published: 04 January 2011



Cite this article:

Wu Shuai. Application on Information Extraction from Factual Information Based on Conditional Random Fields Method. New Technology of Library and Information Service, 2010, 26(10): 59-64.


