Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (9): 64-77    DOI: 10.11925/infotech.2096-3467.2022.0825
Few-Shot Language Understanding Model for Task-Oriented Dialogues
Xiang Zhuoyuan1,Chen Hao1,Wang Qian1,Li Na2()
1School of Information and Safety Engineering,Zhongnan University of Economics and Law, Wuhan 430073, China
2Hubei Tobacco Company, Huangshi City Company, Huangshi 435000, China
Abstract  

[Objective] This paper aims to support dialogue language understanding in dialogue systems whose domains are updated frequently and that lack sufficient annotated data for model training. [Methods] We propose an Information Augmentation Model for Few-shot Spoken Language Understanding (IAM-FSLU). It uses few-shot learning to address data scarcity and model adaptability in new and cross-domain scenarios with varying intent types and quantities, and it builds an explicit relationship between the two tasks of few-shot intent recognition and few-shot slot extraction. [Results] Compared with non-joint modeling approaches, the F1 score of slot extraction improved by nearly 30% and sentence accuracy by nearly 10% in the 1-shot setting; in the 3-shot setting, the F1 score of slot extraction improved by nearly 35% and sentence accuracy by 12%~16%. [Limitations] The intent recognition performance of IAM-FSLU needs further improvement, and the gain in sentence accuracy contributed by the slot extraction task remains limited. [Conclusions] The overall performance of IAM-FSLU is better than that of other mainstream models.
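The abstract reports two evaluation measures, slot-extraction F1 and sentence accuracy. A minimal sketch of how such measures are typically computed for joint language understanding, where a sentence counts as correct only when both the intent and the full slot set are right; the function names and the triple-based micro-F1 are illustrative assumptions, not the paper's code:

```python
def slot_f1(gold_slots, pred_slots):
    """Micro-averaged F1 over (utterance, slot, value) triples.
    gold_slots/pred_slots: one list of (slot, value) pairs per utterance."""
    gold = {(i, s, v) for i, slots in enumerate(gold_slots) for s, v in slots}
    pred = {(i, s, v) for i, slots in enumerate(pred_slots) for s, v in slots}
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def sentence_accuracy(gold_intents, pred_intents, gold_slots, pred_slots):
    """Fraction of utterances whose intent AND full slot set are both correct."""
    correct = sum(
        gi == pi and set(gs) == set(ps)
        for gi, pi, gs, ps in zip(gold_intents, pred_intents,
                                  gold_slots, pred_slots)
    )
    return correct / len(gold_intents)
```

Because sentence accuracy requires every prediction in an utterance to be right, it is always the strictest of the three metrics, which matches the tables below where it is far lower than intent accuracy or slot F1.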

Key words: Deep Learning; Dialogue System; Intent Detection; Dialogue Language Understanding; Joint Modeling
Received: 06 August 2022      Published: 24 October 2023
CLC Number: TP391
Fund:The National Natural Science Foundation of China(61702553);The Hubei Tobacco Company Science and Technology Project(027Y2022-031);The Discipline of Innovation Base for Introducing Talents in Colleges and Universities(B21038)
Corresponding Author: Li Na, ORCID: 0000-0003-2749-5958, E-mail: 549655554@qq.com.

Cite this article:

Xiang Zhuoyuan, Chen Hao, Wang Qian, Li Na. Few-Shot Language Understanding Model for Task-Oriented Dialogues. Data Analysis and Knowledge Discovery, 2023, 7(9): 64-77.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0825     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I9/64

Figure: Improved Prototypical Networks Model for Few-shot Intent Detection
Figure: Information Augmentation Model for Few-shot Spoken Language Understanding
Figure: Few-shot Slot Tagging Model Based on CRF
Figure: An Example of a Dynamic Attention Vector
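The first figure above concerns prototypical networks for few-shot intent detection. A minimal NumPy sketch of the underlying idea, assuming utterances are already encoded as fixed-size vectors (e.g. by a BERT encoder); all names are illustrative, not the paper's implementation:

```python
import numpy as np

def classify_by_prototype(support_emb, support_labels, query_emb):
    """support_emb: [N, d] support-set embeddings; support_labels: [N] ints;
    query_emb: [Q, d]. Each class prototype is the mean of its support
    embeddings; each query is assigned to the nearest prototype by
    Euclidean distance."""
    classes = np.unique(support_labels)
    prototypes = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in classes]
    )                                                    # [C, d]
    # Distance from every query to every prototype via broadcasting
    dists = np.linalg.norm(
        query_emb[:, None, :] - prototypes[None, :, :], axis=-1
    )                                                    # [Q, C]
    return classes[dists.argmin(axis=1)]                 # label per query
```

In training, the negative distances are typically fed through a softmax and optimized with cross-entropy, so the encoder learns an embedding space where same-intent utterances cluster.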
Item                        Value
Total utterances            6,694
Average utterance length    9.9
Total domains               52
Training-set domains        38
Validation-set domains      5
Test-set domains            9
Total intents               141
Average intents per domain  2.38
Total slots                 416
Average slots per domain    8
Statistics of the Original Dataset
Setting  Split           Support Size  Query Size  Avg. Intents  Avg. Slots
1-shot   Training set    7,464         7,600       2.2           8.1
         Validation set  22            556         3.2           7.0
         Test set        55            1,068       4.5           9.1
3-shot   Training set    22,338        7,600       2.2           8.1
         Validation set  66            511         3.2           7.0
         Test set        147           1,061       4.5           9.1
Data Statistics under Different Settings after Restructuring
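The support sets above are drawn so that each domain's labels are covered K times (1-shot or 3-shot). A simplified, hypothetical sketch of such K-shot support construction: greedily add shuffled utterances until every intent appears at least K times. The actual dataset follows the FewJoint benchmark's procedure, which also accounts for slot labels; this version is illustrative only:

```python
import random

def build_k_shot_support(utterances, k, seed=0):
    """utterances: list of (text, intent) pairs from one domain.
    Returns a support subset in which each intent occurs at least k times
    (or as many times as the data allows)."""
    rng = random.Random(seed)
    pool = utterances[:]
    rng.shuffle(pool)
    counts, support = {}, []
    for text, intent in pool:
        if counts.get(intent, 0) < k:      # still need examples of this intent
            support.append((text, intent))
            counts[intent] = counts.get(intent, 0) + 1
    return support
```

This explains why the 3-shot support sets above are roughly three times the size of the 1-shot ones, while the query sets stay the same.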
Item                     Specification
Operating system         Windows 10
Disk storage             1.5 TB
Memory                   16 GB
CPU                      AMD Ryzen 5 2600 6-Core Processor
GPU                      NVIDIA GeForce RTX 2070 Super
GPU memory               11 GB
Programming language     Python 3.6
Deep learning framework  PyTorch 1.7.1
Experimental Environment
Model      Intent Accuracy/%  Slot F1/%  Sentence Accuracy/%
Proto-IS   75.77              26.30      19.98
FSLU       67.95              61.38      32.10
SAMGM-SLU  68.12              62.13      33.21
SGM-SLU    69.35              62.42      34.65
IAM-FSLU   70.31              62.99      35.30
Comparison Experiments under the 3-shot Setting
Model      Intent Accuracy/%  Slot F1/%  Sentence Accuracy/%
Proto-IS   67.88              20.86      16.10
FSLU       64.79              48.93      24.34
SAMGM-SLU  65.02              49.14      25.58
SGM-SLU    64.58              49.55      25.63
IAM-FSLU   64.70              50.45      26.12
Comparison Experiments under the 1-shot Setting
Model      Accuracy/%  F1/%   Training Time
IAM-FSLU   70.31       -      3h07m
-LSTM-CRF  62.17       71.40  1h20m
-Bi-GRU    58.38       67.25  54m34s
-CNN       61.02       70.05  1h19m
Ablation Experiment Results
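One ablated component above is the CRF layer used for slot tagging (cf. the figure caption "Few-shot Slot Tagging Model Based on CRF"). A minimal NumPy sketch of the decoding step a linear-chain CRF adds at inference time: Viterbi search over per-token label scores plus a label-transition matrix. Shapes and names are illustrative assumptions:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """emissions: [T, L] per-token label scores; transitions: [L, L] where
    transitions[i, j] scores moving from label i to label j.
    Returns the highest-scoring label sequence as a list of ints."""
    T, L = emissions.shape
    score = emissions[0].copy()          # best score ending in each label
    back = np.zeros((T, L), dtype=int)   # backpointers
    for t in range(1, T):
        # total[i, j] = best path ending in i, then transition i->j, emit j
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

The transition matrix lets the tagger penalize label sequences that per-token classification alone would allow (e.g. an I- tag without a preceding B- tag), which is one reason removing the CRF hurts slot quality in the ablation.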