Data Analysis and Knowledge Discovery, 2024, Vol. 8, Issue (6): 16-29. https://doi.org/10.11925/infotech.2096-3467.2023.0839
Review
Review of Research Progress on Question-Answering Techniques Based on Large Language Models
Wen Sen1,2,Qian Li1,2,3(),Hu Maodi1,2,Chang Zhijun1,2
1National Science Library, Chinese Academy of Sciences, Beijing 100190, China
2Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
3Key Laboratory of New Publishing and Knowledge Services for Scholarly Journals, Beijing 100190, China

Abstract

[Objective] This paper comprehensively reviews and summarizes the development status, underlying mechanisms, and application trends of question-answering techniques based on large language models. [Coverage] We selected 73 papers related to question-answering techniques based on large language models. [Methods] The study systematically reviews the development status of large language models and parameter-efficient fine-tuning strategies. It analyzes the principles, mechanisms, application value, and open issues of each technique from two perspectives: retrieval-augmented generation for answering simple questions and prompt-engineering-based reasoning for complex questions. Through qualitative analysis, the research progress of question-answering techniques based on large language models is comprehensively summarized, and future research directions are proposed. [Results] Open-source pre-trained large language models continue to emerge, and parameter-efficient fine-tuning strategies can significantly improve their adaptability to vertical domains. Aided by text embeddings and approximate nearest-neighbor retrieval, retrieval-augmented generation effectively enhances the interpretability and credibility of question answering. With carefully crafted prompt engineering, the reasoning capabilities of large language models on complex questions can be substantially expanded. [Limitations] Research on large language models is developing rapidly, so the survey may not be fully comprehensive. [Conclusions] Question-answering techniques based on large language models have made remarkable progress in semantic representation, complex reasoning, and other aspects. Retrieval-augmented generation, which integrates external knowledge, and prompt engineering are the main research hotspots in the field. Future work may explore controllable and credible content generation in greater depth.

Keywords: Large Language Models; Q&A Technology; Vector Retrieval; Prompt Engineering
Received: 2023-08-29      Published online: 2024-04-18
Chinese Library Classification: TP391; G350
Funding: National Key R&D Program of China (2022YFF0711902); Major Program of the National Social Science Fund of China (21&ZD329)
Corresponding author: Qian Li, ORCID: 0000-0002-0931-2882, E-mail: qianl@mail.las.ac.cn
Cite this article:
Wen Sen, Qian Li, Hu Maodi, Chang Zhijun. Review of Research Progress on Question-Answering Techniques Based on Large Language Models[J]. Data Analysis and Knowledge Discovery, 2024, 8(6): 16-29.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2023.0839  or  https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2024/V8/I6/16
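The abstract above summarizes retrieval-augmented generation as the combination of text embeddings, approximate nearest-neighbor retrieval, and a generative model. The following is a minimal sketch of that pipeline shape only; the `embed` function, the toy document store, and the prompt template are illustrative assumptions, not components described in this paper.

```python
import numpy as np

# Toy document store; in practice these would be passage embeddings
# produced by one of the text embedding models surveyed in Table 3.
documents = [
    "LoRA freezes pretrained weights and trains low-rank update matrices.",
    "Approximate nearest neighbor search trades exactness for speed.",
    "Chain-of-thought prompting elicits step-by-step reasoning.",
]

def embed(text: str) -> np.ndarray:
    """Placeholder embedding (hashing trick); a real system would call an embedding model."""
    vec = np.zeros(64)
    for token in text.lower().split():
        vec[hash(token) % 64] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by cosine similarity and return the top-k."""
    doc_vecs = np.stack([embed(d) for d in documents])
    scores = doc_vecs @ embed(query)
    return [documents[i] for i in np.argsort(-scores)[:k]]

def build_prompt(query: str) -> str:
    """Assemble a grounded prompt: retrieved context followed by the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}\nAnswer:"

print(build_prompt("How does LoRA reduce fine-tuning cost?"))
```

In a production system the placeholder `embed` would be replaced by one of the embedding models surveyed in Table 3, and the brute-force ranking by one of the approximate nearest-neighbor indexes compared in Table 4.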
| Name | Released | Developer | Parameters (B) | Architecture | Training data |
| --- | --- | --- | --- | --- | --- |
| PaLM[21] | 2022-04 | Google | 8/62/540 | Decoder-only architecture optimized with the SwiGLU activation, parallel layers, multi-query attention, and rotary position embeddings | Publicly available text, including English CommonCrawl, C4, Wikipedia, Books1, Books2, etc., plus source code in multiple programming languages, totaling 780 billion tokens |
| Flan T5[22] | 2022-10 | Google | 0.06/0.22/0.77/3/11 | T5-based autoregressive language model optimized with instruction fine-tuning and chain-of-thought techniques | Fine-tuned on more than 1,800 NLP tasks, including question answering, text generation, text classification, and summarization |
| LLaMA[23] | 2023-02 | Meta | 7/13/33/65 | Decoder-only architecture optimized with pre-normalization, the SwiGLU activation, and rotary position embeddings | About 4.7TB of data from Common Crawl, GitHub, Wikipedia, Books, arXiv, StackExchange, etc. |
| Falcon[24] | 2023-05 | Technology Innovation Institute | 1/7/40 | Decoder-only architecture using rotary position encoding, multi-query attention, and FlashAttention | Publicly available text in English and European languages, including RefinedWeb, Books, Conversations, Code, Technical, etc., about 2.8TB in total |
| ChatGLM2 | 2023-06 | Tsinghua University | 6/12/32/66/130 | GLM architecture optimized with the FlashAttention algorithm, multi-query attention, a hybrid training objective, and human-preference alignment | The 6B model is pre-trained on 1.4TB of Chinese and English tokens at a roughly 1:1 ratio |
| BaiChuan | 2023-07 | Baichuan Intelligence | 7/13 | Decoder-only architecture following the LLaMA design, optimized with FlashAttention, multi-query attention, RMSNorm, and a hybrid training objective | Publicly available Chinese and English data, self-crawled Chinese web data, and selected high-quality knowledge data, about 1.4 trillion tokens in total |
| LLaMA2[25] | 2023-07 | Meta | 7/13/70 | Decoder-only architecture that builds on the LLaMA framework and adds Ghost Attention to improve multi-turn dialogue consistency | Trained on 2 trillion tokens |

Table 1. Overview of open-source large language models
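Most of the checkpoints in Table 1 are distributed in formats loadable through the Hugging Face transformers library. The sketch below shows that generic loading-and-generation pattern; it is not a recipe from this paper, the model identifier is a hypothetical placeholder, and the decoding settings are illustrative.

```python
# A minimal sketch, assuming a decoder-only checkpoint hosted on the Hugging Face Hub;
# the model identifier below is a hypothetical placeholder, not a real repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "some-org/some-7b-chat-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: What is retrieval-augmented generation?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding keeps the example deterministic; temperature/top_p sampling
# is the usual alternative for more varied answers.
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```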
| Name | Proposed | Technique summary |
| --- | --- | --- |
| Adapter Tuning[32] | 2019 | Inserts new network layers or modules between the layers of the pre-trained model to adapt it to downstream tasks |
| Prefix-Tuning[33] | 2021-01 | Prepends a continuous, task-specific sequence of vectors to the model input; all parameters of the pre-trained language model are frozen and only the task-specific prefix is optimized |
| P-Tuning[34] | 2021-03 | Freezes the model parameters and encodes prompt tokens with an MLP and an LSTM; the encoded prompts are concatenated with the other input vectors and fed into the model as usual |
| Prompt Tuning[35] | 2021-04 | Freezes the entire pre-trained model and only prepends k additional trainable tokens per downstream task to the input text, without any extra encoder layers or task-specific output heads |
| LoRA (Low-Rank Adaptation)[36] | 2021-06 | Freezes the pre-trained weights and injects trainable rank-decomposition matrices into each Transformer block; the parameter update of the weight matrix W is approximated by a low-rank matrix with few parameters, and only the low-rank matrices are optimized during training |
| P-Tuning v2[37] | 2021-10 | Adds prompt tokens as inputs at multiple layers, allowing prompt tuning to match full fine-tuning across model scales and downstream tasks |
| AdaLoRA[38] | 2023-03 | An improvement on LoRA that dynamically allocates the parameter budget across weight matrices according to importance scores: critical incremental matrices receive higher ranks to capture finer, task-specific information, while less important matrices have their ranks reduced |
| QLoRA[39] | 2023-05 | Quantizes the pre-trained model to 4 bits with a novel high-precision technique, then adds a small set of learnable low-rank adapter weights that are fine-tuned by backpropagating gradients through the quantized weights |

Table 2. Summary of parameter-efficient fine-tuning techniques
Fig. 1. Illustration of the three categories of parameter-efficient fine-tuning strategies
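To make the LoRA row of Table 2 concrete, the sketch below wraps a frozen linear layer with a trainable rank-r update so that the effective weight becomes W + (α/r)·BA. It is a simplified illustration of the idea rather than the reference implementation of [36]; the layer size, rank, and scaling factor are arbitrary choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to applying the weight matrix W + scale * B @ A.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / total: {total}")  # only A and B receive gradients
```

Because only A and B receive gradients, the number of trainable parameters stays small regardless of the size of the frozen base layer, which is the core of the efficiency argument behind LoRA and its variants.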
| Name | Proposed | Model characteristics | Pre-training corpus |
| --- | --- | --- | --- |
| Word2Vec[47] | 2013 | Two basic models (Skip-Gram and CBOW) that learn word embeddings by maximizing the co-occurrence probability between a word and its context words | Google News dataset, etc. |
| GloVe[48] | 2014 | Built on global word co-occurrence statistics; makes full use of global information and combines local context-window methods with global matrix factorization | Wikipedia, news articles, books, etc. |
| CoVe[49] | 2017 | Pre-trains word embeddings with the encoder of a sequence-to-sequence model, capturing rich contextual information | English Wikipedia, book corpora, WMT machine-translation datasets |
| ELMo[50] | 2018 | Takes word context into account; each word embedding is a weighted sum over all layers, producing richer representations | Wikipedia and other data |
| BERT[51] | 2018 | Uses the Transformer architecture to capture bidirectional sentence context | BooksCorpus, Wikipedia, etc. |
| GPT | 2018 | Transformer-based; learns representations by predicting the next word | BooksCorpus, Wikipedia, etc. |
| XLNet[52] | 2019 | Uses an autoregressive approach that avoids BERT's masking problem and captures more comprehensive context | BooksCorpus, Wikipedia, Giga5, ClueWeb2012-B, CommonCrawl, etc. |
| ERNIE 3.0[53] | 2021 | Integrates various knowledge graphs and supports multimodal input, enhancing the model's understanding ability | Chinese encyclopedias, Baidu Baike, Baidu News, Baidu Tieba, etc. |
| Instructor[54] | 2022 | An instruction-finetuned text embedding model that can generate embeddings tailored to any task (classification, retrieval, clustering, text evaluation, etc.) and domain (science, finance, etc.) simply by providing a task instruction, without further fine-tuning | MEDI, a collection of 330 datasets from Super-Natural Instructions plus 30 datasets for sentence-embedding training |
| E5[55] | 2022 | Trained in two stages: weakly supervised pre-training followed by contrastive learning | CCPairs, a dataset of 270 million text pairs |

Table 3. Survey of text embedding models
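The contextual models in the lower half of Table 3 are typically turned into sentence-level embeddings by pooling token representations. The sketch below assumes a BERT-style encoder from the transformers library and uses mean pooling over the last hidden layer followed by cosine similarity; specific embedding models such as E5 or Instructor prescribe their own pooling and prompting conventions, so treat this only as the general pattern.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"   # any BERT-style encoder follows the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

def sentence_embedding(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states, ignoring padding tokens."""
    batch = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state       # (1, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)          # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)           # (1, dim)

query = sentence_embedding("How are low-rank adapters trained?")
doc = sentence_embedding("LoRA optimizes small low-rank matrices while freezing the base model.")
print(torch.cosine_similarity(query, doc).item())
```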
| Category | Representative algorithms | Principle | Advantages | Disadvantages | Retrieval latency | Retrieval accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| Locality-sensitive hashing[59] | Sign random projection, query-aware LSH, etc. | Hash functions map high-dimensional data into buckets so that similar items are likely to land in the same bucket | Works well in high-dimensional spaces; suitable for images, audio, and similar data | Performs poorly on low-dimensional data | A few milliseconds to several hundred milliseconds | 60%~80%, depending on the hash functions, data scale, and other factors |
| Space partitioning[60] | Randomized k-d trees, k-means trees, inverted indexes, etc. | Recursively partitions the data space with a tree structure so that search is restricted to a limited region | Works well for low-dimensional and uniformly distributed data | Suffers from the "curse of dimensionality" in high-dimensional spaces | A few milliseconds to several hundred milliseconds | Above 95% on low-dimensional data, but may drop to 30%~50% on high-dimensional data |
| Vector compression[61] | Product quantization, scalar quantization, etc. | Reduces dimensionality through vector or scalar quantization, shrinking the index and saving memory | Reduces storage and computation costs to some extent | Compression loses some information, which can hurt accuracy | Several hundred milliseconds to seconds | 50%~85%, decreasing as dimensionality grows |
| Graph-based[62] | Navigable small world graphs, nearest-neighbor graph search, etc. | Builds a graph over the data and searches along paths between similar nodes | Best recall among the four; captures complex relationships between data points; suitable for structured data | Graph construction is difficult and search can be slow | Several hundred milliseconds to seconds or more | 70%~90%, though both speed and accuracy may degrade as data volume grows |

Table 4. Comparison of four classes of approximate nearest-neighbor search methods
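As an illustration of the locality-sensitive hashing row in Table 4, the toy sketch below implements sign-random-projection hashing: vectors whose sign patterns against a set of random hyperplanes agree fall into the same bucket, so only that bucket's candidates need exact scoring. The dimensions, the number of hyperplanes, and the data are arbitrary, and real systems use tuned libraries rather than this didactic version.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
dim, n_planes = 64, 12                       # 12 hyperplanes -> up to 2**12 buckets
planes = rng.normal(size=(n_planes, dim))    # random hyperplanes shared by all vectors

def signature(v: np.ndarray) -> int:
    """Encode the sign pattern of v against the hyperplanes as an integer bucket id."""
    bits = (planes @ v) > 0
    return int(sum((1 << i) for i, b in enumerate(bits) if b))

# Index: map bucket id -> list of vector ids
vectors = rng.normal(size=(10_000, dim))
index = defaultdict(list)
for i, v in enumerate(vectors):
    index[signature(v)].append(i)

def query(q: np.ndarray, k: int = 5) -> list[int]:
    """Score only the candidates sharing the query's bucket (may miss true neighbors)."""
    candidates = index[signature(q)]
    scores = vectors[candidates] @ q
    return [candidates[i] for i in np.argsort(-scores)[:k]]

print(query(vectors[42]))   # the indexed vector itself should appear among the results
```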
[1] Hey T, Tansley S, Tolle K. The Fourth Paradigm: Data-Intensive Scientific Discovery[M]. Translated by Pan Jiaofeng, Zhang Xiaolin, et al. Beijing: Science Press, 2012.
[2] Bawden D, Robinson L. The Dark Side of Information: Overload, Anxiety and Other Paradoxes and Pathologies[J]. Journal of Information Science, 2009, 35(2): 180-191.
[3] Green B F, Wolf A K, Chomsky C, et al. Baseball: An Automatic Question-Answerer[C]// Proceedings of Western Joint IRE-AIEE-ACM Computer Conference. 1961: 219-224.
[4] Woods W A. Lunar Rocks in Natural English: Explorations in Natural Language Question Answering[M]// Linguistic Structures Processing. Amsterdam: North Holland. 1977: 521-569.
[5] Li X, Roth D. Learning Question Classifiers[C]// Proceedings of the 19th International Conference on Computational Linguistics-Volume 1. 2002: 1-7.
[6] Mihalcea R, Tarau P. TextRank: Bringing Order into Text[C]// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. 2004: 404-411.
[7] Tellex S, Katz B, Lin J, et al. Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering[C]// Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003: 41-47.
[8] Wallace R S. The Anatomy of A.L.I.C.E.[M]// Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Berlin: Springer Netherlands, 2009.
[9] Turian J, Ratinov L, Bengio Y. Word Representations: A Simple and General Method for Semi-Supervised Learning[C]// Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 2010: 384-394.
[10] Jeon J, Croft W B, Lee J H. Finding Similar Questions in Large Question and Answer Archives[C]// Proceedings of the 14th ACM International Conference on Information and Knowledge Management. 2005: 84-90.
[11] Chen B, Sun L, Han X P. Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing[OL]. arXiv Preprint, arXiv: 1809.00773.
[12] Shen W, Wang J Y, Han J W. Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions[J]. IEEE Transactions on Knowledge and Data Engineering, 2015, 27(2): 443-460.
[13] Zeng D J, Liu K, Lai S W, et al. Relation Classification via Convolutional Deep Neural Network[C]// Proceedings of COLING 2014 - 25th International Conference on Computational Linguistics. 2014: 2335-2344.
[14] Berant J, Chou A, Frostig R, et al. Semantic Parsing on Freebase from Question-Answer Pairs[C]// Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 2013: 1533-1544.
[15] Singhal A. Introducing the Knowledge Graph: Things, not Strings[EB/OL].[2023-09-01]. Official Google Blog, https://blog.google/products/search/introducing-knowledge-graph-things-not/.
[16] Liu Y H, Han T L, Ma S Y, et al. Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models[OL]. arXiv Preprint, arXiv: 2304.01852.
[17] Carlini N, Tramer F, Wallace E, et al. Extracting Training Data from Large Language Models[C]// Proceedings of the 30th USENIX Security Symposium. 2021: 2633-2650.
[18] Zhao W X, Zhou K, Li J Y, et al. A Survey of Large Language Models[OL]. arXiv Preprint, arXiv: 2303.18223.
[19] Hoffmann J, Borgeaud S, Mensch A, et al. Training Compute-Optimal Large Language Models[OL]. arXiv Preprint, arXiv: 2203.15556.
[20] Qian Li, Liu Yi, Zhang Zhixiong, et al. An Analysis on the Basic Technologies of ChatGPT[J]. Data Analysis and Knowledge Discovery, 2023, 7(3): 6-15.
[21] Chowdhery A, Narang S R, Devlin J, et al. PaLM: Scaling Language Modeling with Pathways[OL]. arXiv Preprint, arXiv: 2204.02311.
[22] Chung H W, Hou L, Longpre S, et al. Scaling Instruction-Finetuned Language Models[OL]. arXiv Preprint, arXiv: 2210.11416.
[23] Touvron H, Lavril T, Izacard G, et al. LLaMA: Open and Efficient Foundation Language Models[OL]. arXiv Preprint, arXiv: 2302.13971.
[24] Penedo G, Malartic Q, Hesslow D, et al. The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only[OL]. arXiv Preprint, arXiv: 2306.01116.
[25] Touvron H, Martin L, Stone K, et al. LLaMA2: Open Foundation and Fine-Tuned Chat Models[OL]. arXiv Preprint, arXiv: 2307.09288.
[26] Radford A, Narasimhan K, Salimans T, et al. Improving Language Understanding by Generative Pre-training[OL]. [2023-09-01]. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.
[27] Radford A, Wu J, Child R, et al. Language Models are Unsupervised Multitask Learners[J]. OpenAI Blog, 2019, 1(8): 9.
[28] Brown T B, Mann B, Ryder N, et al. Language Models are Few-Shot Learners[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020: 1877-1901.
[29] Wei J, Tay Y, Bommasani R, et al. Emergent Abilities of Large Language Models[OL]. arXiv Preprint, arXiv: 2206.07682.
[30] Fedus W, Zoph B, Shazeer N. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity[J]. The Journal of Machine Learning Research, 2022, 23(1): 5232-5270.
[31] Chen S Y, Hou Y T, Cui Y M, et al. Recall and Learn: Fine-Tuning Deep Pretrained Language Models with Less Forgetting[OL]. arXiv Preprint, arXiv: 2004.12651.
[32] Houlsby N, Giurgiu A, Jastrzebski S, et al. Parameter-Efficient Transfer Learning for NLP[C]// Proceedings of the 36th International Conference on Machine Learning. 2019: 2790-2799.
[33] Li X L, Liang P. Prefix-Tuning: Optimizing Continuous Prompts for Generation[OL]. arXiv Preprint, arXiv: 2101.00190.
[34] Liu X, Zheng Y N, Du Z X, et al. GPT Understands, Too[OL]. arXiv Preprint, arXiv: 2103.10385.
[35] Lester B, Al-Rfou R, Constant N. The Power of Scale for Parameter-Efficient Prompt Tuning[OL]. arXiv Preprint, arXiv: 2104.08691.
[36] Hu E J, Shen Y L, Wallis P, et al. LoRA: Low-Rank Adaptation of Large Language Models[OL]. arXiv Preprint, arXiv: 2106.09685.
[37] Liu X, Ji K X, Fu Y C, et al. P-Tuning v2: Prompt Tuning can be Comparable to Fine-Tuning Universally across Scales and Tasks[OL]. arXiv Preprint, arXiv: 2110.07602.
[38] Zhang Q R, Chen M S, Bukharin A, et al. AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning[OL]. arXiv Preprint, arXiv: 2303.10512.
[39] Dettmers T, Pagnoni A, Holtzman A, et al. QLoRA: Efficient Finetuning of Quantized LLMS[OL]. arXiv Preprint, arXiv: 2305.14314.
[40] Pfeiffer J, Vulić I, Gurevych I, et al. MAD-X: An Adapter-Based Framework for Multi-task Cross-Lingual Transfer[OL]. arXiv Preprint, arXiv: 2005.00052.
[41] He J X, Zhou C T, Ma X Z, et al. Towards a Unified View of Parameter-Efficient Transfer Learning[OL]. arXiv Preprint, arXiv: 2110.04366.
[42] Lewis P, Perez E, Piktus A, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020: 9459-9474.
[43] Cambria E, White B. Jumping NLP Curves: A Review of Natural Language Processing Research[J]. IEEE Computational Intelligence Magazine, 2014, 9(2): 48-57.
[44] Zhao Yueyang, Cui Lei. Progress in Research and Application of Text Embedding Technology[J]. Frontiers of Data & Computing, 2023, 5(3): 92-110.
[45] Harris Z S. Distributional Structure[J]. WORD, 1954, 10(2-3): 146-162.
[46] Bengio Y, Ducharme R, Vincent P, et al. A Neural Probabilistic Language Model[J]. The Journal of Machine Learning Research, 2003, 3: 1137-1155.
[47] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[OL]. arXiv Preprint, arXiv: 1301.3781.
[48] Pennington J, Socher R, Manning C D. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[49] McCann B, Bradbury J, Xiong C M, et al. Learned in Translation: Contextualized Word Vectors[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6297-6308.
[50] Peters M E, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations[OL]. arXiv Preprint, arXiv: 1802.05365.
[51] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[OL]. arXiv Preprint, arXiv: 1810.04805.
[52] Yang Z L, Dai Z H, Yang Y M, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. 2019: 5753-5763.
[53] Sun Y, Wang S H, Feng S K, et al. ERNIE 3.0: Large-Scale Knowledge Enhanced Pre-training for Language Understanding and Generation[OL]. arXiv Preprint, arXiv: 2107.02137.
[54] Su H J, Shi W J, Kasai J, et al. One Embedder, Any Task: Instruction-Finetuned Text Embeddings[OL]. arXiv Preprint, arXiv: 2212.09741.
[55] Wang L, Yang N, Huang X L, et al. Text Embeddings by Weakly-Supervised Contrastive Pre-training[OL]. arXiv Preprint, arXiv: 2212.03533.
[56] Muennighoff N, Tazi N, Magne L, et al. MTEB: Massive Text Embedding Benchmark[OL]. arXiv Preprint, arXiv: 2210.07316.
[57] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. 2013: 3111-3119.
[58] Ram P, Gray A G. Maximum Inner-Product Search Using Cone Trees[C]// Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2012: 931-939.
[59] Liu Yingfan. Research of Approximate Nearest Neighbor Search Based on Locality Sensitive Hashing[D]. Xi'an: Xidian University, 2014.
[60] Chen Jian. Search Algorithm Based on Space Division[D]. Jinan: Shandong University, 2005.
[61] Jégou H, Douze M, Schmid C. Product Quantization for Nearest Neighbor Search[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(1): 117-128.
[62] Li W, Zhang Y, Sun Y F, et al. Approximate Nearest Neighbor Search on High Dimensional Data—Experiments, Analyses, and Improvement[J]. IEEE Transactions on Knowledge and Data Engineering, 2020, 32(8): 1475-1488.
[63] Wei J, Wang X Z, Schuurmans D, et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models[OL]. arXiv Preprint, arXiv: 2201.11903.
[64] Wang X Z, Wei J, Schuurmans D, et al. Self-Consistency Improves Chain of Thought Reasoning in Language Models[OL]. arXiv Preprint, arXiv: 2203.11171.
[65] Liu B, Jiang Y Q, Zhang X H, et al. LLM+ P: Empowering Large Language Models with Optimal Planning Proficiency[OL]. arXiv Preprint, arXiv: 2304.11477.
[66] Yao S Y, Yu D, Zhao J, et al. Tree of Thoughts: Deliberate Problem Solving with Large Language Models[OL]. arXiv Preprint, arXiv: 2305.10601.
[67] Ning X F, Lin Z N, Zhou Z X, et al. Skeleton-of-Thought: Large Language Models can do Parallel Decoding[OL]. arXiv Preprint, arXiv: 2307.15337.
[68] Besta M, Blach N, Kubicek A, et al. Graph of Thoughts: Solving Elaborate Problems with Large Language Models[OL]. arXiv Preprint, arXiv: 2308.09687.
[69] Yao S Y, Zhao J, Yu D, et al. ReAct: Synergizing Reasoning and Acting in Language Models[OL]. arXiv Preprint, arXiv: 2210.03629.
[70] Press O, Zhang M R, Min S, et al. Measuring and Narrowing the Compositionality Gap in Language Models[OL]. arXiv Preprint, arXiv: 2210.03350.
[71] Shinn N, Labash B, Gopinath A. Reflexion: An Autonomous Agent with Dynamic Memory and Self-Reflection[OL]. arXiv Preprint, arXiv: 2303.11366.
[72] Wang L, Xu W Y, Lan Y H, et al. Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models[OL]. arXiv Preprint, arXiv: 2305.04091.
[73] Li B, Wang R, Guo J L, et al. Deliberate then Generate: Enhanced Prompting Framework for Text Generation[OL]. arXiv Preprint, arXiv: 2305.19835.