WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Peng Wang  Zexi Li  Ningyu Zhang  Ziwen Xu  Yunzhi Yao  Yong Jiang  Huajun Chen
1 Zhejiang University 2 Alibaba Group
{peng2001,zexi.li,zhangningyu}@zju.edu.cn
Abstract
Large language models (LLMs) need knowledge updates to keep up with the ever-growing world facts and to correct hallucinated responses, motivating methods for lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge of neural network activations/representations by retrieval) results in an impossible triangle: reliability, generalization, and locality cannot be realized together in the lifelong editing setting. For long-term memory, directly editing the parameters causes conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize (poor generalization). Therefore, we propose WISE to bridge the gap between memories. In WISE, we design a dual parametric memory scheme, which consists of a main memory for the pretrained knowledge and a side memory for the edited knowledge. We only edit the knowledge in the side memory and train a router to decide which memory to go through when given a query. For continual editing, we devise a knowledge-sharding mechanism in which different sets of edits reside in distinct subspaces of parameters and are subsequently merged into a shared memory without conflicts. Extensive experiments show that WISE can outperform previous model editing methods and overcome the impossible triangle under lifelong model editing of question answering, hallucination, and out-of-distribution settings across trending LLM architectures, e.g., GPT, LLaMA, and Mistral.
1 Introduction
Large language models (LLMs) show emergent intelligence when scaling the number of parameters and the amount of data [1-4], which reveals the sparks of artificial general intelligence [5]. However, when deployed, LLMs still make mistakes [6], generating responses with hallucinations [7], bias [8], and factual decays [9]. On the other hand, the world's knowledge is ever-growing, so up-to-date knowledge usually differs from what was seen during LLMs' pretraining [10]. Many such errors and emerging facts arise sequentially in deployment, and some of them have to be addressed in a timely and efficient manner without waiting for retraining or finetuning [11, 12]. Also, retraining or finetuning is often too computationally expensive, which is not sustainable for lifelong growing knowledge. Therefore, lifelong model editing [10] was proposed to remedy the continual knowledge updates and injections for LLMs in a cheap and timely manner.
Equal contribution.
† Corresponding Author.
Code is available at github.com/zjunlp/Easy….
An effective lifelong model editing approach should satisfy the following properties [ , 17]: i) reliability, the model can remember both current and previous edits after sequential editing; ii) locality, model editing does not influence inherent pretrained knowledge that is irrelevant to the edited knowledge; iii) generalization, the model is not merely memorizing the query-target pairs; instead, it should understand and generalize when given other forms of queries carrying the same knowledge. We compare existing model editing and continual learning methods on these three metrics in Figure 1 and find that there seems to be an impossible triangle: reliability, generalization, and locality cannot be realized at the same time in the continual editing setting. We find that where the updated knowledge resides in memories affects editing performance, and previous methods can be broadly divided into editing either long-term memory, e.g., ROME [18], MEMIT [19], and FT-EWC (finetuning with Elastic Weight Consolidation [20], a continual learning method), or working memory, e.g., GRACE [10]. Note that the categorization into long-term and working memories is derived from human cognition [21, 22] and neuroscience [23] and has recently been adopted in the study of LLMs [24-27]. Model editing of long-term memory refers to directly editing the model parameters, which contain generalizable parametric knowledge [28, 24]. However, editing long-term memory causes conflicts with previously pretrained knowledge, resulting in poor locality (e.g., ROME and FT-EWC in Figure 1). Working memory refers to the non-parametric knowledge of neural network activations/representations obtained by retrieval, and it does not change the network parameters [24]; instead, it replaces the representations by retrieval at working (inference) time, as in GRACE. GRACE's working memory shows promising results in reliability and locality, but in our experiments it shows poor generalization, since retrieval-based representations can hardly make the model understand the edits and generalize to different queries. This reveals that both long-term memory and working memory have drawbacks for lifelong model editing. Although there are some special memory designs for LLM architectures, such as MemoryLLM [28], SPALM [27], and Memoria [25], they change the architecture and cannot be directly applied to different LLMs. Intuitively, there is a gap between editing working and long-term memories; thus, in this paper, we study:
Figure 1: Metric triangle among reliability, generalization, and locality (ZsRE dataset, sequential continual edits, LLaMA-2-7B). Editing methods based on long-term memory (ROME and FT-EWC) and working memory (DEFER and GRACE) exhibit the impossible triangle among these metrics, while our WISE leads on all three.
What is the better memory mechanism for lifelong model editing to break the impossible triangle?
The human brain contains left and right hemispheres, which have different divisions of labor as studied in cognitive science; e.g., the left brain is typically associated with logical tasks while the right brain is more involved in intuitive processes. This inspires us to design WISE, which makes the model editor WISER in memories. WISE contains a dual parametric memory mechanism for LLM editing: a main memory for the pretrained knowledge and a side memory for the edited knowledge, realizing both long-term memory's generalization and retrieval-based working memory's reliability and locality. The side memory is a form of mid-term memory. We only edit the knowledge in the side memory and train a router to decide which memory to go through when given a query. For continual editing, we design a knowledge-sharding mechanism in which different sets of edits reside in distinct and orthogonal subspaces of parameters; these are then merged into a common side memory without conflicts. Our contributions are as follows:
- We identify the pitfalls of current model editing methods in lifelong settings, namely the impossible triangle among reliability, generalization, and locality. Behind the impossible triangle, we find a gap between editing long-term memory and working memory.
- We propose WISE, with a side parametric memory serving as mid-term memory, realizing the advantages of both parametric long-term memory and retrieval-based working memory. We design memory routing, sharding, and merging modules in WISE, making WISE lead in continual knowledge editing and achieve better performance on all three metrics simultaneously.
- Extensive experiments on GPT, LLaMA, and Mistral across QA, hallucination, and out-of-distribution datasets validate the effectiveness of WISE for lifelong model editing.
2 Methodology
2.1 Preliminaries: Lifelong Model Editing
We focus on the lifelong model editing problem, which requires supporting hundreds or even thousands of sequential edits on an LLM so that the outputs of target queries align with human expectations while the LLM's previous knowledge and capability are maintained. Let $f_{\Theta}$, parameterized by $\Theta$, denote a model function mapping an input $x$ to the prediction $f_{\Theta}(x)$. The initial model before editing is $f_{\Theta_0}$, which is trained on a large corpus $\mathcal{D}_{\text{train}}$. When the LLM makes mistakes or requires injections of new knowledge, it needs model editing with a time-evolving editing dataset $\mathcal{D}_{\text{edit}} = \{(x_1, y_1), (x_2, y_2), \dots\}$. At time step $T$, a model editor (ME) takes the $T$-th edit $(x_T, y_T)$ and the LLM of the previous time step $f_{\Theta_{T-1}}$ as inputs and produces the revised LLM $f_{\Theta_T}$ as described below.
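A minimal reconstruction of Equation 1 under the notation assumed above (the exact symbols in the original may differ):

$$
f_{\Theta_T} = \mathrm{ME}\big(f_{\Theta_{T-1}},\, x_T,\, y_T\big)
\quad \text{s.t.} \quad
f_{\Theta_T}(x) =
\begin{cases}
y_t, & \text{if } x = x_t \ \text{for some } (x_t, y_t) \in \mathcal{D}_{\text{edit}},\ t \le T,\\
f_{\Theta_0}(x), & \text{otherwise, e.g., } x \in \mathcal{D}_{\text{train}},
\end{cases}
\tag{1}
$$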
Equation 1 describes that, after model editing, the LLM should make the correct prediction on the current edit, i.e., $f_{\Theta_T}(x_T) = y_T$, while also preserving the knowledge from past editing instances $\{(x_t, y_t)\}_{t < T}$ and maintaining the capability of $f_{\Theta_0}$ on irrelevant data $x \notin \mathcal{D}_{\text{edit}}$, especially on the general training corpus $\mathcal{D}_{\text{train}}$.
2.2 Rethinking the Memory Design of Lifelong Model Editing
Table 1: Comparison of current model editing methods. "√" refers to "yes" and "well-supported", "X" refers to "no" or "badly-supported", and " " refers to "less-supported". The three metrics of Reliability, Generalization, and Locality denote the performance under lifelong (continual) editing.
In Table 1, we compare current model editing methods in terms of memory type and lifelong editing ability. FT-EWC [20], ROME [18], MEMIT [19], and MEND [31] edit the long-term memory stored in the LLM's model parameters, but they either do not support continual editing or have negative effects on irrelevant knowledge (poor locality). GRACE [10] is designed for lifelong editing via retrieval-based working memory. Its retrieval codebook can avoid conflicts with irrelevant knowledge, but GRACE fails to generalize because the codebook is a non-parametric knowledge representation that merely memorizes queries without comprehension. It is worth noting that SERAC [32]/DEFER [10] uses working memory stored in additional small models, a scope classifier and a counterfactual model, whose knowledge is parametric. However, the small counterfactual model cannot match the expressiveness and generalization capabilities of the LLM itself, making it challenging for the edited knowledge to generalize effectively.
To enable effective lifelong model editing, the method should take advantage of both LLM parameters' long-term memory and retrieval-based working memory. Therefore, we propose WISE as follows.
2.3 WISE: Side Memory with Knowledge Sharding, Merging, and Routing
As illustrated in Figure 2, WISE comprises two key components. 1) Side memory design: i) side memory: a memory container initialized as a copy of a certain FFN layer of the LLM, storing the stream of edits; ii) memory routing mechanism: similar to retrieval, a routing activation component identifies the scope of edits and routes queries to the main (original) or side memory during inference. 2) Knowledge sharding and merging: i) knowledge in random memory subspaces: to keep the edits at an appropriate knowledge density and avoid forgetting, we shard the side memory into several random subspaces for editing; ii) knowledge merging: we leverage model-merging techniques to merge the different memory shards into one side memory without loss of knowledge.
Figure 2: Overview of WISE. Side memory (in blue) and main memory (in green) store edited and pretrained knowledge, respectively. Note: during inference, if WISE-Retrieve is used, the activation routing retrieves and selects the side memory with the maximal activation score.
2.3.1 Side Memory Design
Side memory in the FFN's value matrix. Each Transformer layer contains a multi-head self-attention (MHA) mechanism and a feed-forward network (FFN), where the FFN constitutes two-thirds of the model parameters [33]. The question of how Transformers retrieve and utilize stored knowledge remains unresolved, yet past works have demonstrated that editing the weights of the FFN is consistently more effective for LLMs. The FFN typically consists of key-value linear matrices $W_k$ and $W_v$, i.e., two multi-layer perceptron (MLP) layers. For the output of the attention feature $\mathbf{x}$, the computation of the feed-forward network, omitting the bias terms, can be represented as follows.
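Under the notation above, with $W_k$ and $W_v$ as assumed symbol names for the key and value matrices, a reconstruction of Equation 2 reads:

$$
\mathrm{FFN}(\mathbf{x}) = \mathbf{a} \cdot W_v, \qquad \mathbf{a} = \sigma(\mathbf{x} \cdot W_k), \tag{2}
$$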
where $\sigma$ is a nonlinear activation function (e.g., SwiGLU, GeLU), and $\mathbf{a}$ represents the activation values of the first MLP layer. Following previous works [18, 33], we edit the value matrix $W_v$ of the chosen FFN layer.
However, directly editing the value matrix may cause forgetting and side effects in a lifelong setting. Thus, we copy the value matrix as side memory and edit the side memory instead of the original matrix (main memory). Specifically, the side memory is initialized as a copy of the main memory, $W_v' \leftarrow W_v$. Given the side memory, the new output is expressed as $\mathrm{FFN}_{\text{side}}(\mathbf{x}) = \mathbf{a} \cdot W_v'$. We will introduce how to update the side memory in Section 2.3.2.
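A minimal PyTorch-style sketch of this side-memory design, assuming a simplified two-matrix FFN module exposing `up_proj`, `down_proj`, and `act_fn` (the module and attribute names are illustrative assumptions, not the authors' implementation):

```python
import copy
import torch
import torch.nn as nn


class SideMemoryFFN(nn.Module):
    """Wraps one FFN layer: the pretrained value matrix (main memory) stays frozen,
    and edits are written only into a copied value matrix (side memory)."""

    def __init__(self, ffn: nn.Module):
        super().__init__()
        self.ffn = ffn                              # original FFN, i.e., main memory
        self.side_v = copy.deepcopy(ffn.down_proj)  # W_v': initialized as a copy of W_v
        for p in self.ffn.parameters():             # the main memory is never updated
            p.requires_grad_(False)

    def activations(self, x: torch.Tensor) -> torch.Tensor:
        # a = sigma(x · W_k): activations of the first MLP layer (cf. Equation 2)
        return self.ffn.act_fn(self.ffn.up_proj(x))

    def forward(self, x: torch.Tensor, use_side: bool) -> torch.Tensor:
        a = self.activations(x)
        w_v = self.side_v if use_side else self.ffn.down_proj
        return w_v(a)                               # a · W_v' (side) or a · W_v (main)
```

During editing, only `side_v` would receive gradient updates, so the pretrained knowledge in the main memory stays untouched.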
Locating the side memory's FFN layer. Transformer LLMs have been widely shown to encode "lower-level" information (e.g., parts of speech) in earlier layers while processing more advanced linguistic phenomena such as anaphora and coreference in later layers [35-37]. Representations in later hidden layers propagate through residual connections without drastic changes [38, 18], enabling effective early exit in LLMs [39, 40]. Therefore, to minimize the side effects of editing and to adjust advanced linguistic phenomena, we target mid-to-late layers (e.g., layer 27) for the side memory. Further analysis of layer selection is provided in Section 3.3.
Routing between side memory and main memory. Similar to retrieval-based methods [10, 32], during inference we need to decide whether the main memory or the side memory is used. If a given query is within the scope of previous edits, the side memory is used; otherwise, the main memory is used. Inspired by [11], we introduce a routing activation indicator that, given an input $x$, is formulated as follows.
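A plausible reconstruction of this indicator, assuming it measures how strongly the side memory would change the layer's output relative to the main memory (the symbol $\Delta(x)$ and the exact form are assumptions rather than the paper's verbatim definition):

$$
\Delta(x) = \big\lVert \mathbf{a}(x) \cdot \left( W_v' - W_v \right) \big\rVert_2 ,
$$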
where $\mathbf{a}(x)$ is the activation of the side memory's corresponding FFN layer in Equation 2. We want the activation indicators of editing queries to be larger than those of irrelevant queries by a large margin.
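One plausible hinge-style margin objective matching this description, minimized over the side memory ($\varepsilon$, $\alpha$, and $\beta$ are assumed threshold and margin hyperparameters; the exact loss in the original may differ):

$$
\mathcal{L}_{\text{act}} =
\mathbb{E}_{x_e \sim \mathcal{D}_{\text{edit}},\; x_i \sim \mathcal{D}_{\text{irr}}}
\Big[
\max\big(0,\, \Delta(x_i) - \varepsilon\big)
+ \max\big(0,\, \alpha - \Delta(x_e)\big)
+ \max\big(0,\, \beta - \Delta(x_e) + \Delta(x_i)\big)
\Big],
$$

so that edited queries receive indicators above $\alpha$, irrelevant queries stay below $\varepsilon$, and the two are separated by at least $\beta$.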
where $\mathcal{D}_{\text{irr}}$ is the irrelevant dataset, which includes, e.g., samples from the general training corpus $\mathcal{D}_{\text{train}}$.
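As a sketch of how such an indicator could drive routing at inference time, building on the `SideMemoryFFN` wrapper above (the threshold value is a hypothetical hyperparameter):

```python
def route_and_forward(layer: SideMemoryFFN, x: torch.Tensor, threshold: float = 1.0) -> torch.Tensor:
    """Route a query through the side memory if its activation indicator is large enough,
    otherwise fall back to the main (pretrained) memory."""
    a = layer.activations(x)  # a(x) from Equation 2
    # Delta(x): how much the edited value matrix would change this layer's output
    delta = (a @ (layer.side_v.weight - layer.ffn.down_proj.weight).T).norm(p=2)
    use_side = bool(delta > threshold)  # is the query within the scope of previous edits?
    return layer(x, use_side=use_side)
```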