Computer Science > Computation and Language

Title: SWEA: Changing Factual Knowledge in Large Language Models via Subject Word Embedding Altering

Abstract: Model editing has recently gained widespread attention. Current model editing methods primarily involve modifying model parameters or adding extra modules to the existing model. However, the former causes irreversible damage to Large Language Models (LLMs), while the latter incurs additional inference overhead, and its fuzzy vector matching is not always reliable. To address these issues, we propose the expandable Subject Word Embedding Altering (SWEA) framework, which finds fused embeddings through character-level key-value matching and adds them to the subject word embeddings in the Transformer input. To obtain these fused embeddings, we propose an optimizing-then-suppressing (OS) fusion method, which first optimizes learnable embedding vectors for the editing target and then suppresses the Knowledge Embedding Dimensions (KEDs) to obtain the final fused embeddings. Combining the two, we propose the SWEA$\oplus$OS method for editing factual knowledge in LLMs. SWEA$\oplus$OS achieves overall state-of-the-art (SOTA) performance on the COUNTERFACT and zsRE datasets. To further validate its reasoning ability when editing knowledge, we evaluate it on the more complex RippleEdits benchmark; the results show that SWEA$\oplus$OS also attains SOTA reasoning ability.
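
The abstract describes two pieces: an inference-time lookup that adds a precomputed fused embedding to the subject's token embeddings, and an optimizing-then-suppressing (OS) procedure that produces those fused embeddings. Since the authors' code is not yet released, the Python sketch below is only a rough illustration of one way these steps could fit together; every name in it (FusedEmbeddingTable, alter_subject_embeddings, optimize_then_suppress), the exact-string matching, and the KED zeroing are assumptions, not the paper's implementation.

import torch

class FusedEmbeddingTable:
    """Hypothetical store mapping a subject string to a fused embedding."""
    def __init__(self):
        self._table = {}  # subject string -> fused embedding, shape (hidden_dim,)

    def add_edit(self, subject: str, fused_embedding: torch.Tensor):
        self._table[subject] = fused_embedding

    def lookup(self, subject: str):
        # Simplified exact character-level match; the paper's key-value
        # matching scheme may be more elaborate.
        return self._table.get(subject)

def alter_subject_embeddings(input_embeds, subject_span, subject, table):
    """Add the fused embedding to the subject's token embeddings.

    input_embeds: (seq_len, hidden_dim) Transformer input embeddings.
    subject_span: (start, end) token indices covering the subject words.
    """
    fused = table.lookup(subject)
    if fused is not None:  # only edited subjects are altered
        start, end = subject_span
        input_embeds[start:end] = input_embeds[start:end] + fused
    return input_embeds

def optimize_then_suppress(forward_fn, init_embed, loss_fn, ked_indices,
                           steps=100, lr=1e-2):
    """One reading of the OS fusion: optimize a learnable vector toward the
    editing target, then suppress (zero out) the Knowledge Embedding
    Dimensions. How KEDs are identified is specified in the paper."""
    delta = init_embed.clone().requires_grad_(True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(forward_fn(delta))  # loss w.r.t. the edit target
        loss.backward()
        opt.step()
    fused = delta.detach().clone()
    fused[ked_indices] = 0.0  # suppress KEDs to get the final fused embedding
    return fused

Because the alteration happens purely at the input embeddings, the base model's weights are untouched (so the edit is reversible by dropping the table entry), and no fuzzy vector matching is needed at inference: an exact subject match either fires or the input passes through unchanged.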
Comments: Under review; our code will be released.
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2401.17809 [cs.CL]
  (or arXiv:2401.17809v2 [cs.CL] for this version)

Submission history

From: Xiaopeng Li [view email]
[v1] Wed, 31 Jan 2024 13:08:45 GMT (206kb,D)
[v2] Thu, 15 Feb 2024 15:43:55 GMT (380kb,D)
[v3] Tue, 23 Apr 2024 01:08:44 GMT (391kb,D)
