Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (10): 128-141     https://doi.org/10.11925/infotech.2096-3467.2022.0261
Research Article
Improvement of Data Augment Algorithm for Named Entity Recognition with Small Samples
Liu Xingli (1,2), Fan Junjie (2), Ma Haiqun (1)
(1) Research Center of Information Resources Management, Heilongjiang University, Harbin 150080, China
(2) School of Computer and Information Engineering, Heilongjiang University of Science and Technology, Harbin 150020, China
Abstract

[Objective] This paper proposes improved data augmentation strategies for named entity recognition (NER) with small samples. [Methods] Taking domain-specific NER as an example, we propose multi-dimensional improvements to the easy data augmentation (EDA) algorithm: entity replacement based on a mixture of multiple domain dictionaries, part-of-speech replacement based on a domain semantic classification dictionary, random deletion with a semantic protection mechanism, random insertion with part-of-speech protection, and combinations of these four improved methods. An NER model is trained with each strategy. [Results] On the small-sample domain dataset, each single improved strategy outperformed its original EDA counterpart, with F values increasing by 3.2, 4.6, 4.5 and 2.5 percentage points respectively. In contrast, combinations of two or more strategies yielded weaker gains in F value. In extended experiments on small-sample subsets of the People's Daily and Weibo datasets, the single-strategy improvements were also effective: the entity replacement strategy based on mixed multi-domain dictionaries raised the F value by up to 6.7 percentage points on the two datasets. [Limitations] In the multi-strategy combination experiments, the augmentation parameters α and N become harder to tune, which affects the NER performance of the combined strategies. [Conclusions] The proposed improvements to the EDA algorithm effectively improve the performance of NER models trained on small samples.

Keywords: Data Augmentation; EDA; Small Sample; Named Entity Recognition
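
The dictionary-based entity replacement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes character-level tokens with BIO tags, treats α as the probability of replacing each entity mention, and treats N as the number of augmented copies per sentence. The lexicon contents and all function names are hypothetical.

```python
import random

def entity_spans(tags):
    """Return (start, end, type) spans from a BIO tag sequence; end is exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if inside:
            continue
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans

def replace_entities(tokens, tags, lexicon, alpha, rng):
    """Swap each entity mention, with probability alpha (assumed meaning),
    for a same-type entry drawn from the lexicon, and rebuild the BIO tags."""
    out_tokens, out_tags, cursor = [], [], 0
    for start, end, etype in entity_spans(tags):
        out_tokens += tokens[cursor:start]
        out_tags += tags[cursor:start]
        candidates = lexicon.get(etype, [])
        if candidates and rng.random() < alpha:
            new = list(rng.choice(candidates))  # split the entry into characters
            out_tokens += new
            out_tags += ["B-" + etype] + ["I-" + etype] * (len(new) - 1)
        else:
            out_tokens += tokens[start:end]
            out_tags += tags[start:end]
        cursor = end
    out_tokens += tokens[cursor:]
    out_tags += tags[cursor:]
    return out_tokens, out_tags

def augment(tokens, tags, lexicon, alpha=0.4, n=6, seed=0):
    """Produce n augmented (tokens, tags) pairs for one sentence."""
    rng = random.Random(seed)
    return [replace_entities(tokens, tags, lexicon, alpha, rng) for _ in range(n)]

# Hypothetical lexicon, standing in for entries merged from several domain dictionaries.
lexicon = {"CIT": ["北京", "哈尔滨"]}
tokens = list("他抵达莫斯科")
tags = ["O", "O", "O", "B-CIT", "I-CIT", "I-CIT"]
for new_tokens, new_tags in augment(tokens, tags, lexicon, alpha=1.0, n=2):
    print("".join(new_tokens), new_tags)
```

Because only the entity span is swapped and re-tagged, the surrounding sentence structure is unchanged, which is the trade-off the paper attributes to this strategy.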
Received: 2022-03-27      Published: 2022-11-09
CLC Number (ZTFLH):  TP393 G250
Funding: Major Program of the National Social Science Fund of China (21&ZD336); Key Program of the National Social Science Fund of China (20ATQ004)
Corresponding author: Liu Xingli, ORCID: 0000-0001-6126-9837      E-mail: liuxingli@usth.edu.cn
Cite this article:
Liu Xingli, Fan Junjie, Ma Haiqun. Improvement of Data Augment Algorithm for Named Entity Recognition with Small Samples. Data Analysis and Knowledge Discovery, 2022, 6(10): 128-141.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0261      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I10/128
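Fig.1  Framework for small-sample named entity recognition based on the improved EDA algorithm
Fig.2  Design of the improved MDDM entity replacement strategy
Fig.3  Example of the improved strategies
Fig.4  Example of a corpus dependency tree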
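Strategy | Advantages | Disadvantages
Entity replacement based on a mixture of multiple domain dictionaries | Greatly expands the target entities with multi-source domain data without altering sentence structure, enriching both the quality and the quantity of samples | Sentence patterns are not adjusted, so sentence structure does not vary and generalization is limited
Part-of-speech replacement based on a domain semantic classification dictionary | A domain part-of-speech dictionary keeps each replacement word semantically compatible with its context when non-target words are replaced; similarity computation makes replacement words more precise and closer in meaning to the original | Adds no new corpus for target entities; the small-sample domain part-of-speech dictionary is not rich enough, so candidate words are not precise enough
Random deletion based on a semantic protection mechanism | Protects the semantics of targets using dependency parsing, avoiding the semantic loss caused by conventional random deletion; changes the original sentence pattern, improving model generalization and robustness | Adds no target entities and does not enrich the corpus; semantic protection may be excessive, limiting sentence variation
Random insertion based on part-of-speech protection | Introduces reasonable noise while protecting the contextual semantic features that matter for NER; adjusts sentence structure, improving model generalization and robustness | Adds no target entities and does not enrich the corpus; the noise may affect model performance
Table 1  Strengths and weaknesses of the single-strategy EDA improvements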
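
The protected deletion and insertion strategies compared in Table 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the dependency-based semantic protection is replaced by an explicit `protected` index list supplied by the caller, entity tokens are always protected, tokens are characters with BIO tags, and α is read as a per-token deletion probability or an insertion ratio. All names and the filler list are hypothetical.

```python
import random

def protected_random_deletion(tokens, tags, alpha, rng, protected=()):
    """Delete each unprotected token with probability alpha.

    Tokens carrying an entity tag (anything other than "O") and indices listed
    in `protected` are always kept, so entity labels stay aligned.
    """
    keep_always = set(protected)
    kept = [
        (tok, tag)
        for i, (tok, tag) in enumerate(zip(tokens, tags))
        if tag != "O" or i in keep_always or rng.random() >= alpha
    ]
    if not kept:  # never return an empty sentence
        return list(tokens), list(tags)
    new_tokens, new_tags = zip(*kept)
    return list(new_tokens), list(new_tags)

def protected_random_insertion(tokens, tags, fillers, alpha, rng):
    """Insert max(1, int(alpha * len(tokens))) filler tokens tagged "O",
    never splitting an entity span. `fillers` must be non-empty."""
    tokens, tags = list(tokens), list(tags)
    n_insert = max(1, int(alpha * len(tokens)))
    for _ in range(n_insert):
        # Position p is safe when the token that would follow the insert
        # does not continue an entity (its tag is not "I-...").
        safe = [p for p in range(len(tokens) + 1)
                if p == len(tokens) or not tags[p].startswith("I-")]
        p = rng.choice(safe)
        tokens.insert(p, rng.choice(fillers))
        tags.insert(p, "O")
    return tokens, tags

rng = random.Random(0)
tokens = list("他昨天抵达莫斯科")
tags = ["O"] * 5 + ["B-CIT", "I-CIT", "I-CIT"]
print(protected_random_deletion(tokens, tags, 0.6, rng, protected=(3, 4)))
print(protected_random_insertion(tokens, tags, ["的", "了"], 0.4, rng))
```

In a fuller version, the `protected` indices would come from a dependency parse, and the fillers from a part-of-speech filtered vocabulary; both are left to the caller here.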
Fig.5  RoBERTa-WWM-BiLSTM-CRF framework
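Entity type | Type name | Count | Example entity
NAT | Country | 2,665 | 英格兰 (England)
PLA | Aircraft | 231 | F-22战斗机 (F-22 fighter)
VES | Ship | 158 | 诺森伯兰号 (Northumberland)
WAT | Sea area | 125 | 波罗的海 (Baltic Sea)
STA |  | 95 | 俄勒冈州 (Oregon)
CIT | City | 121 | 莫斯科 (Moscow)
MIS | Missile | 70 | 烈火导弹 (Agni missile)
RAD | Radar | 76 | 北极超视距雷达 (Arctic over-the-horizon radar)
ARM | Military unit | 99 | 美海军第七舰队 (US Navy Seventh Fleet)
REG | Region | 96 | 纳卡地区 (Nagorno-Karabakh region)
BAS | Base | 86 | 欣登基地 (欣登 Base)
ISL | Island | 82 | 西沙群岛 (Xisha Islands)
AIR | Aircraft carrier | 80 | 伊丽莎白女王号 (Queen Elizabeth)
AIP | Airport | 60 | 喀布尔机场 (Kabul Airport)
POR | Port | 50 | 那霸港 (Naha Port)
Table 2  Domain entity annotation scheme and sample counts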
Item | Configuration
Operating system | Ubuntu
GPU | NVIDIA GeForce RTX 2080 Ti
Python | 3.6.0
TensorFlow | 2.2.0
Memory | 62GB
GPU memory | 11GB
Disk | 200GB
Table 3  Experimental configuration
Parameter | Value
Batch_size | 128
Seq_max_len | 256
Dropout | 0.4
learning rate | 8e-3
LSTM unit | 128
epoch | 5
optimizer | RAdam
Random embedding size | 300
Word2vec embedding size | 300
Word2vec window | 5
Word2vec iter | 5
Table 4  Experimental parameter settings
Fig.6  Experiments on the MDDM entity replacement strategy under different growth coefficients N and α
DA Techniques | P_w-avg (%) | R_w-avg (%) | F_w-avg (%)
EDA_(SR), α=0.4, N=6 | 78.3 | 80.6 | 79.3
EDA_self(MDDM), α=0.4, N=6 | 78.8 | 86.2 | 82.2
EDA_external(MDDM), α=0.4, N=6 | 80.6 | 84.4 | 82.4
Table 5  Experimental evaluation of the MDDM entity replacement strategy
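
The column headers P_w-avg, R_w-avg and F_w-avg are not defined on this page. Assuming "w-avg" denotes the usual support-weighted average over entity types (each type's precision, recall and F1 weighted by its number of gold mentions), the computation could be sketched as below. The function name and the counts in the example are hypothetical, not taken from the paper.

```python
def weighted_prf(counts):
    """Support-weighted precision, recall and F1.

    counts maps entity type -> (tp, fp, fn). Each type's scores are weighted
    by its support, i.e. its number of gold mentions (tp + fn).
    """
    total = sum(tp + fn for tp, fp, fn in counts.values())
    if total == 0:
        return 0.0, 0.0, 0.0
    p_w = r_w = f_w = 0.0
    for tp, fp, fn in counts.values():
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = (tp + fn) / total
        p_w += weight * p
        r_w += weight * r
        f_w += weight * f
    return p_w, r_w, f_w

# Illustrative counts only: a frequent type and a rare type.
print(weighted_prf({"NAT": (8, 2, 2), "POR": (1, 1, 1)}))
```

With this weighting, frequent types such as NAT dominate the reported score, which matters for a dataset as imbalanced as the one in Table 2.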
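Fig.7  Experiments on the DSCD part-of-speech replacement strategy under different growth coefficients N and α
EDA Techniques | P_w-avg (%) | R_w-avg (%) | F_w-avg (%)
EDA_(RS), α=0.4, N=10 | 78.89 | 79.75 | 79.04
EDA_(DSCD), α=0.4, N=10 | 81.8 | 85.7 | 83.6
Table 6  Experimental evaluation of the DSCD part-of-speech replacement strategy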
Fig.8  Experiments on the SPM random deletion strategy under different growth coefficients N and α
DA Techniques | P_w-avg (%) | R_w-avg (%) | F_w-avg (%)
EDA_(RD), α=0.6, N=10 | 74.4 | 81.4 | 77.5
EDA_(SPM), α=0.6, N=10 | 80.7 | 84.4 | 82.0
Table 7  Experimental evaluation of the SPM random deletion strategy
Fig.9  Evaluation of the RSI strategy under different growth coefficients N and α
DA Techniques | P_w-avg (%) | R_w-avg (%) | F_w-avg (%)
EDA_(RI), α=0.4, N=10 | 80.1 | 81.8 | 80.8
EDA_(RSI), α=0.4, N=10 | 80.9 | 86.1 | 83.3
Table 8  Comparative experimental evaluation of the RSI strategy
DA Techniques | P_w-avg (%) | R_w-avg (%) | F_w-avg (%)
EDA_(MDDM), α=0.4 | 78.8 | 86.2 | 82.2
EDA_(RPS), α=0.2 | 79.1 | 84.1 | 81.4
EDA_(SPM), α=0.2 | 79.4 | 82.7 | 80.9
EDA_(RSI), α=0.2 | 79.7 | 83.9 | 81.6
EDA_(MDDM_RSI), α_MDDM=0.4, α_RSI=0.6 | 79.0 | 86.6 | 82.4
EDA_(MDDM_SPM), α_SRD=0.4, α_SPM=0.3 | 77.8 | 84.3 | 80.7
EDA_(MDDM_DSCD), α_MDDM=0.4, α_RPS=0.2 | 79.9 | 85.1 | 82.3
EDA_(DSCD_SPM), α_DSCD=0.2, α_SPM=0.3 | 77.9 | 83.6 | 80.4
EDA_(RSI_RPS_SPM), α_RSI=0.6, α_RPS=0.2, α_SPM=0.3 | 78.0 | 84.0 | 80.6
EDA_(MDDM_RPS_SPM_RSI), α_MDDM=0.4, α_RSI=0.6, α_RPS=0.2, α_SPM=0.3 | 80.2 | 83.2 | 81.5
Table 9  Experimental evaluation of multi-strategy combinations
Dataset | LOC | ORG | PER
PeopleDailyNER_S | 206 | 124 | 96
PeopleDailyNER_M | 324 | 183 | 170
PeopleDailyNER_L | 661 | 350 | 349
PeopleDailyNER_F | 16,571 | 9,722 | 8,144
Table 10  PeopleDaily small-sample NER datasets
Dataset | PER.NOM | PER.NAM | LOC.NOM | LOC.NAM | ORG.NOM | ORG.NAM | GPE.NAM
WeiBo_S | 99 | 103 | 4 | 13 | 6 | 24 | 40
WeiBo_M | 198 | 291 | 14 | 33 | 4 | 73 | 76
WeiBo_L | 313 | 358 | 20 | 49 | 8 | 105 | 132
WeiBo_F | 416 | 522 | 28 | 62 | 17 | 148 | 177
Table 11  WeiBo small-sample NER datasets
Method | PeopleDaily S(%) | PeopleDaily M(%) | PeopleDaily L(%) | PeopleDaily F(%) | WeiBo S(%) | WeiBo M(%) | WeiBo L(%) | WeiBo F(%)
No augmentation | 69.7 | 72.0 | 78.5 | 90.9 | 32.3 | 37.4 | 41.8 | 43.6
SR | 70.0 | 71.4 | 78.7 | 88.7 | 29.0 | 40.1 | 43.6 | 43.8
MDDM, α=0.4 | 72.4 | 73.4 | 78.9 | 90.6 | 33.45 | 44.1 | 43.9 | 43.2
RS | 69.9 | 67.6 | 76.3 | 87.8 | 33.2 | 37.7 | 43.2 | 41.8
RPS, α=0.4 | 71.4 | 74.1 | 79.8 | 91.3 | 32.0 | 36.9 | 40.3 | 40.3
RD | 69.2 | 70.5 | 79.6 | 90.2 | 31.9 | 40.6 | 41.7 | 45.3
SPM, α=0.3 | 70.1 | 72.3 | 74.3 | 90.5 | 35.3 | 40.6 | 44.2 | 45.5
RI | 64.5 | 66.6 | 73.2 | 90.9 | 35.4 | 42.1 | 39.7 | 46.1
RSI, α=0.4 | 71.4 | 72.2 | 79.7 | 91.3 | 34.9 | 39.8 | 42.8 | 46.2
EDA_(MDDM_DSCD), α=0.4, 0.2 | 73.2 | 72.5 | 79.5 | 90.6 | 37.7 | 42.5 | 38.7 | 42.7
EDA_(MDDM_RPS_SPM_RSI) | 70.3 | 72.2 | 79.2 | 91.2 | 32.8 | 37.7 | 43.2 | 43.9
Table 12  Experimental evaluation on datasets of different sizes
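
As a quick check of how the figures in Table 12 relate to the abstract, the short sketch below computes the gain of the MDDM row over the "No augmentation" row for each dataset split. The values are copied from the table; the variable names are illustrative, and reading the abstract's "up to 6.7 percentage points" as a comparison against the unaugmented baseline is an assumption.

```python
columns = [
    "PeopleDaily_S", "PeopleDaily_M", "PeopleDaily_L", "PeopleDaily_F",
    "WeiBo_S", "WeiBo_M", "WeiBo_L", "WeiBo_F",
]
# F values (%) copied from Table 12.
no_augmentation = [69.7, 72.0, 78.5, 90.9, 32.3, 37.4, 41.8, 43.6]
mddm = [72.4, 73.4, 78.9, 90.6, 33.45, 44.1, 43.9, 43.2]

# Gain in percentage points per split, rounded to suppress float noise.
gains = {
    name: round(after - before, 2)
    for name, before, after in zip(columns, no_augmentation, mddm)
}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
print("largest gain:", best, gains[best])  # largest gain: WeiBo_M 6.7
```

The largest gain falls on the WeiBo_M split, and the gains turn slightly negative on the two full datasets (PeopleDaily_F and WeiBo_F), consistent with the paper's focus on small-sample settings.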