Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (3): 12-24    DOI: 10.11925/infotech.2096-3467.2019.1031
An Integrated Platform for Food Safety Incident Entities Based on Deep Learning
Hu Haotian (1,2), Ji Jinfeng (3), Wang Dongbo (3,4), Deng Sanhong (1,2)
1. School of Information Management, Nanjing University, Nanjing 210023, China
2. Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
3. School of Information Management, Nanjing Agricultural University, Nanjing 210095, China
4. Research Center for Correlation of Domain Knowledge, Nanjing Agricultural University, Nanjing 210095, China
Abstract  

[Objective] This paper aims to support the national administration of food safety and to strengthen the prediction of, early warning about, and response to food safety emergencies. The work is intended both to facilitate research and to inform the public about food safety issues concisely and intuitively. [Methods] We collected news reports on food safety incidents from leading websites and built a corpus of food safety incident entities through data cleaning, annotation, and organization. We then compared the entity recognition performance of the Bi-LSTM, Bi-LSTM-CRF, IDCNN, IDCNN-CRF, and BERT models. [Results] In 10-fold cross-validation, the BERT model reached a highest F-score of 81.39%, and its average F-score was 5.50 and 2.58 percentage points higher than those of the IDCNN-CRF and Bi-LSTM-CRF models, respectively. We built the integrated presentation platform for food safety incident entities on the Bi-LSTM-CRF model. [Limitations] More research is needed on identifying location entities that involve complex administrative regions. [Conclusions] The platform supports policy formulation and the administration of the food industry.

Key words: Deep Learning; Food Safety Incident Entity; Bi-LSTM-CRF; BERT
Received: 11 September 2019      Published: 12 April 2021
ZTFLH:  G255  
Fund: Philosophy and Social Science Research Fund of Jiangsu Education Department and Central University Fund of Nanjing Agricultural University (2018SJA0034); National Social Science Fund of China (15ZDB168); Hubei Collaborative Innovation Center (JD20150101)
Corresponding Authors: Wang Dongbo     E-mail: db.wang@njau.edu.cn

Cite this article:

Hu Haotian, Ji Jinfeng, Wang Dongbo, Deng Sanhong. An Integrated Platform for Food Safety Incident Entities Based on Deep Learning. Data Analysis and Knowledge Discovery, 2021, 5(3): 12-24.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.1031     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I3/12

Manual Labeling of Food Safety Incident Entities
The Architecture of CRF Model
The Architecture of Bi-LSTM-CRF Model
The Architecture of BERT Model
Flow Chart of the Entity Recognition Experiment
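Tag     Meaning
B-fd    First character of a food and inducement entity
I-fd    Middle character of a food and inducement entity
E-fd    Last character of a food and inducement entity
B-ot    First character of a time and place entity
I-ot    Middle character of a time and place entity
E-ot    Last character of a time and place entity
O       Character outside any food safety incident entity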
Tags and Their Meanings
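The tag scheme above can be turned back into entity strings by scanning a tagged character sequence. The following is a minimal sketch, not the authors' implementation: the function name `decode_entities`, the example sentence, and its tags are hypothetical, and the choice to discard malformed sequences (for example an `I-fd` with no preceding `B-fd`) is an assumption.

```python
from typing import List, Tuple


def decode_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from a B/I/E/O tag sequence."""
    entities: List[Tuple[str, str]] = []
    buffer: List[str] = []
    current_type = None
    for ch, tag in zip(chars, tags):
        # "B-fd" -> ("B", "-", "fd"); "O" -> ("O", "", "")
        prefix, _, etype = tag.partition("-")
        if prefix == "B":
            # Start a new entity, dropping any unfinished one.
            buffer, current_type = [ch], etype
        elif prefix == "I" and current_type == etype:
            buffer.append(ch)
        elif prefix == "E" and current_type == etype:
            buffer.append(ch)
            entities.append(("".join(buffer), etype))
            buffer, current_type = [], None
        else:
            # "O" or an inconsistent tag: discard the partial entity (assumption).
            buffer, current_type = [], None
    return entities


# Hypothetical example sentence: "奶粉含三聚氰胺" (milk powder contains melamine).
example_chars = list("奶粉含三聚氰胺")
example_tags = ["B-fd", "E-fd", "O", "B-fd", "I-fd", "I-fd", "E-fd"]
example_entities = decode_entities(example_chars, example_tags)
print(example_entities)
```

Because the scheme has no single-character tag, this sketch only emits entities that span at least a `B-` and an `E-` character.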
Fold     Precision  Recall  F-score
1        73.65%     77.58%  75.57%
2        74.03%     77.82%  75.88%
3        73.25%     77.50%  75.31%
4        73.75%     77.34%  75.50%
5        71.78%     77.42%  74.49%
6        71.86%     77.77%  74.70%
7        72.50%     77.44%  74.89%
8        72.88%     79.02%  75.83%
9        72.98%     78.96%  75.85%
10       71.15%     79.49%  75.09%
Average  72.78%     78.03%  75.31%
10-fold Cross-validation of Food Safety Incident Entity Recognition Based on IDCNN-CRF
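The F-score column is consistent with the standard harmonic mean of precision and recall, and the average row with a plain mean over the ten folds. The sketch below assumes that definition (the page does not state the formula) and uses hypothetical names `f_score` and `idcnn_crf_f_scores`; the numbers are copied from the IDCNN-CRF table.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Fold-level F-scores (percent) from the IDCNN-CRF cross-validation table.
idcnn_crf_f_scores = [75.57, 75.88, 75.31, 75.50, 74.49,
                      74.70, 74.89, 75.83, 75.85, 75.09]

mean_f = sum(idcnn_crf_f_scores) / len(idcnn_crf_f_scores)
fold1_f = f_score(73.65, 77.58)

print(f"mean F-score over folds: {mean_f:.2f}%")
print(f"fold 1 F-score recomputed from P and R: {fold1_f:.2f}%")
```

Recomputed values can differ from the table in the last digit, since the published precision and recall are already rounded.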
Tag  Precision  Recall  F-score
fd   75.56%     79.30%  77.38%
ot   70.56%     74.45%  72.45%
All  74.03%     77.82%  75.88%
Entity Recognition Result of the Best Performing IDCNN-CRF
Fold     Precision  Recall  F-score
1        73.92%     80.34%  77.00%
2        76.60%     81.08%  78.78%
3        74.60%     82.09%  78.17%
4        76.15%     81.24%  78.61%
5        75.98%     79.70%  77.79%
6        76.24%     79.94%  78.05%
7        74.38%     81.85%  77.94%
8        76.98%     81.57%  79.21%
9        74.82%     81.55%  78.04%
10       75.43%     82.31%  78.72%
Average  75.51%     81.17%  78.23%
10-fold Cross-validation of Food Safety Incident Entity Recognition Based on Bi-LSTM-CRF
Tag  Precision  Recall  F-score
fd   77.02%     82.87%  79.84%
ot   76.92%     79.62%  78.25%
All  76.98%     81.57%  79.21%
Entity Recognition Result of the Best Performing Bi-LSTM-CRF
Fold     Precision  Recall  F-score
1        77.46%     82.85%  80.06%
2        78.89%     82.96%  80.87%
3        77.79%     83.91%  80.74%
4        78.21%     84.30%  81.14%
5        77.80%     83.00%  80.32%
6        78.52%     83.75%  81.05%
7        77.42%     83.11%  80.16%
8        79.28%     83.61%  81.39%
9        78.18%     84.04%  81.00%
10       78.64%     84.25%  81.35%
Average  78.22%     83.58%  80.81%
10-fold Cross-validation of Food Safety Incident Entity Recognition Based on BERT
Tag  Precision  Recall  F-score
fd   81.71%     85.23%  83.44%
ot   75.65%     81.13%  78.29%
All  79.28%     83.61%  81.39%
Entity Recognition Result of the Best Performing BERT
Model        Precision  Recall  F-score
IDCNN        59.20%     74.55%  65.99%
IDCNN-CRF    72.78%     78.03%  75.31%
Bi-LSTM      54.08%     78.83%  64.15%
Bi-LSTM-CRF  75.51%     81.17%  78.23%
BERT         78.22%     83.58%  80.81%
Comparison of Recognition Effects
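The gaps quoted in the abstract (5.50 and 2.58 points) can be checked directly against the average F-scores in the comparison table. This is a small arithmetic sketch; the dictionary `results` and the helper `f_gap` are hypothetical names, and the values are copied from the table.

```python
# Average precision, recall, F-score (percent) from the comparison table.
results = {
    "IDCNN": (59.20, 74.55, 65.99),
    "IDCNN-CRF": (72.78, 78.03, 75.31),
    "Bi-LSTM": (54.08, 78.83, 64.15),
    "Bi-LSTM-CRF": (75.51, 81.17, 78.23),
    "BERT": (78.22, 83.58, 80.81),
}


def f_gap(model_a: str, model_b: str) -> float:
    """F-score of model_a minus F-score of model_b, in percentage points."""
    return round(results[model_a][2] - results[model_b][2], 2)


bert_vs_idcnn_crf = f_gap("BERT", "IDCNN-CRF")
bert_vs_bilstm_crf = f_gap("BERT", "Bi-LSTM-CRF")
crf_gain_idcnn = f_gap("IDCNN-CRF", "IDCNN")
crf_gain_bilstm = f_gap("Bi-LSTM-CRF", "Bi-LSTM")

print("BERT vs IDCNN-CRF:", bert_vs_idcnn_crf)
print("BERT vs Bi-LSTM-CRF:", bert_vs_bilstm_crf)
print("CRF layer gain on IDCNN:", crf_gain_idcnn)
print("CRF layer gain on Bi-LSTM:", crf_gain_bilstm)
```

The same table also shows how much the CRF layer adds to each base network, which the last two lines compute.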
Distribution of China's Food Safety Incidents in Different Provinces from 2007 to 2017
Distribution of China’s Food Safety Incidents from 2007 to 2017
Monthly Distribution of Food Safety Incidents in China from 2007 to 2017
Food and Inducement Entities
Time and Place Entities
Screenshot of API Call Interface
Screenshot of the Database Interface