Few-Shot Language Understanding Model for Task-Oriented Dialogues
Xiang Zhuoyuan1, Chen Hao1, Wang Qian1, Li Na2
1 School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
2 Hubei Tobacco Company, Huangshi City Company, Huangshi 435000, China
[Objective] This paper aims to support dialogue language understanding in dialogue systems whose domains are updated frequently and therefore lack sufficient annotated data for model training. [Methods] We propose an Information Augmentation Model for Few-shot Spoken Language Understanding (IAM-FSLU). It uses few-shot learning to address data scarcity and model adaptability in new-domain and cross-domain scenarios with varying intent types and quantities. Additionally, it builds an explicit relationship between the two sub-tasks of few-shot intent recognition and few-shot slot extraction. [Results] Compared with non-joint modeling approaches, the F1 score of slot extraction improved by nearly 30% and sentence accuracy by nearly 10% in the 1-shot setting; in the 3-shot setting, the F1 score of slot extraction improved by nearly 35% and sentence accuracy by 12%-16%. [Limitations] The intent recognition performance of IAM-FSLU needs further improvement, and the gain in sentence accuracy on the slot extraction task remains limited. [Conclusions] The overall performance of the IAM-FSLU model is better than that of other mainstream models.
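The paper's IAM-FSLU architecture is not reproduced here, but the few-shot joint setup the abstract describes can be illustrated with a minimal metric-learning sketch: intents are predicted by comparing an utterance embedding against per-intent prototypes built from the support set, and slot tags are assigned by nearest-neighbor matching over support tokens, with the slot step conditioned on the predicted intent as a crude stand-in for the explicit intent-slot relationship. All function names, the toy encoder, and the example data below are hypothetical; a real implementation would use a pre-trained encoder such as BERT rather than pseudo-random token vectors.

import hashlib
import numpy as np

DIM = 16  # toy embedding size; a pre-trained encoder would replace this

def token_vec(tok):
    """Stand-in encoder: a deterministic pseudo-embedding per token."""
    seed = int.from_bytes(hashlib.md5(tok.encode()).digest()[:4], "little")
    return np.random.default_rng(seed).standard_normal(DIM)

def sentence_vec(tokens):
    """Mean-pooled utterance embedding."""
    return np.mean([token_vec(t) for t in tokens], axis=0)

def intent_prototypes(support):
    """Average each intent's support embeddings into a single prototype."""
    buckets = {}
    for tokens, intent, _ in support:
        buckets.setdefault(intent, []).append(sentence_vec(tokens))
    return {intent: np.mean(vecs, axis=0) for intent, vecs in buckets.items()}

def predict_intent(tokens, protos):
    """Assign the query utterance to its nearest intent prototype."""
    q = sentence_vec(tokens)
    return min(protos, key=lambda name: np.linalg.norm(q - protos[name]))

def predict_slots(tokens, support, intent):
    """Tag each query token with the slot label of its nearest support token.
    Restricting the pool to the predicted intent is a crude stand-in for the
    explicit intent-slot relationship described in the abstract."""
    pool = [(token_vec(tok), tag)
            for toks, it, tags in support if it == intent
            for tok, tag in zip(toks, tags)]
    return [min(pool, key=lambda p: np.linalg.norm(token_vec(t) - p[0]))[1]
            for t in tokens]

if __name__ == "__main__":
    # A 1-shot episode: one labelled utterance per intent (hypothetical data).
    support = [
        (["play", "jazz", "music"], "PlayMusic", ["O", "B-genre", "O"]),
        (["book", "a", "table", "in", "wuhan"], "BookRestaurant",
         ["O", "O", "O", "O", "B-city"]),
    ]
    protos = intent_prototypes(support)
    query = ["play", "some", "jazz"]
    intent = predict_intent(query, protos)
    print(intent, predict_slots(query, support, intent))

The script prints the predicted intent and slot sequence for the query. Because the episode is 1-shot, each prototype is a single utterance embedding; with more shots the prototypes average several examples per intent, which is what the 3-shot setting in the abstract's results refers to.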
Xiang Zhuoyuan, Chen Hao, Wang Qian, Li Na. Few-Shot Language Understanding Model for Task-Oriented Dialogues. Data Analysis and Knowledge Discovery, 2023, 7(9): 64-77.