New Technology of Library and Information Service  2011, Vol. 27 Issue (10): 29-33    DOI: 10.11925/infotech.1003-3513.2011.10.06
Research on Complex Time Information Extraction Based on CRF Model
Lu Wanhui1,2, Ma Jianxia1
1. Lanzhou Branch of National Science Library, Chinese Academy of Sciences, Lanzhou 730000, China;
2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
Abstract  Because network information is time-sequenced and polymorphic, this paper presents a model for extracting complex time information based on Conditional Random Fields (CRF). An experiment verifies the feasibility of the model and compares the results obtained with word (context) features alone against those obtained with combined word and part-of-speech (POS) features. The experiment shows that adding the POS feature considerably improves the results.
Key words: Complex time information extraction; CRF; Feature selection
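The two feature sets compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the one-token context window, the `<PAD>` marker, the sample tokens and the POS tags are all assumptions made for illustration. It only shows how a word-only (context) feature set differs from a word-plus-POS feature set before either is handed to a CRF trainer.

```python
from typing import Dict, List, Tuple

# A token is a (word, pos_tag) pair. The tags used below are hypothetical
# placeholders, not the tag set used in the paper.
Token = Tuple[str, str]


def token_features(sent: List[Token], i: int, use_pos: bool,
                   window: int = 1) -> Dict[str, str]:
    """Build features for token i from its context window.

    With use_pos=False only the surrounding words are used (the "word"
    feature set); with use_pos=True the POS tag of each context token is
    added as well (the "word-POS" feature set).
    """
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sent):
            word, pos = sent[j]
        else:
            # Positions outside the sentence get a padding marker.
            word, pos = "<PAD>", "<PAD>"
        feats[f"w[{offset}]"] = word
        if use_pos:
            feats[f"pos[{offset}]"] = pos
    return feats


def sentence_features(sent: List[Token], use_pos: bool) -> List[Dict[str, str]]:
    """Return one feature dictionary per token in the sentence."""
    return [token_features(sent, i, use_pos) for i in range(len(sent))]


# Illustrative sentence with made-up POS tags.
sentence: List[Token] = [("on", "p"), ("12", "m"), ("August", "nt"), ("2011", "nt")]

word_only = sentence_features(sentence, use_pos=False)
word_pos = sentence_features(sentence, use_pos=True)
```

Feature dictionaries of this shape are a common input format for CRF toolkits. The comparison reported in the abstract corresponds to training once on the `word_only` features and once on the `word_pos` features, then comparing the extraction results.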
Received: 12 August 2011      Published: 03 December 2011
CLC number: TP391.1

Cite this article:

Lu Wanhui, Ma Jianxia. Research on Complex Time Information Extraction Based on CRF Model. New Technology of Library and Information Service, 2011, 27(10): 29-33.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2011.10.06     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2011/V27/I10/29
