
Title: Optimal inference of a generalised Potts model by single-layer transformers with factored attention

Abstract: Transformers are the type of neural network that has revolutionised natural language processing and protein science. Their key building block is a mechanism called self-attention, which is trained to predict missing words in sentences. Despite the practical success of transformers in applications, it remains unclear what self-attention learns from data, and how. Here, we give a precise analytical and numerical characterisation of transformers trained on data drawn from a generalised Potts model with interactions between sites and Potts colours. While an off-the-shelf transformer requires several layers to learn this distribution, we show analytically that a single layer of self-attention with a small modification can learn the Potts model exactly in the limit of infinite sampling. We show that this modified self-attention, which we call "factored", has the same functional form as the conditional probability of a Potts spin given the other spins, compute its generalisation error using the replica method from statistical physics, and derive an exact mapping to pseudo-likelihood methods for solving the inverse Ising and Potts problem.
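To make the "small modification" concrete, below is a minimal sketch of what a factored single-layer self-attention could look like, assuming (as the term is commonly used) that the input-dependent softmax attention is replaced by a learned position-position matrix that does not depend on the tokens, acting on one-hot encoded Potts spins. The class name, shapes, and use of PyTorch are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class FactoredSelfAttention(nn.Module):
    """Hypothetical sketch of a single factored self-attention layer.

    The usual softmax(Q K^T) attention is replaced by a learned matrix
    over the L sites that is independent of the input, so each site's
    output is a fixed (learned) mixture of value-projected spins at the
    other sites, mirroring the conditional structure of a Potts model.
    """

    def __init__(self, num_sites: int, num_colours: int):
        super().__init__()
        # learned, input-independent attention over the L sites
        self.attn = nn.Parameter(torch.zeros(num_sites, num_sites))
        # value map acting on the q Potts colours
        self.value = nn.Linear(num_colours, num_colours, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_sites, num_colours), one-hot encoded Potts spins
        A = torch.softmax(self.attn, dim=-1)  # (L, L) factored attention weights
        # output logits per site and colour: mix sites with A, colours with the value map
        return torch.einsum("ij,bjc->bic", A, self.value(x))

Trained with a masked-token (missing-spin) objective, such a layer produces, per site, logits over the q colours that are linear in the other spins, which is the sense in which its functional form can match the Potts conditional probability discussed in the abstract.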
Comments: 4 pages, 3 figures
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as: arXiv:2304.07235 [cond-mat.dis-nn]
  (or arXiv:2304.07235v1 [cond-mat.dis-nn] for this version)

Submission history

From: Riccardo Rende [view email]
[v1] Fri, 14 Apr 2023 16:32:56 GMT (563kb,D)
[v2] Thu, 14 Dec 2023 12:08:44 GMT (569kb,D)
[v3] Wed, 7 Feb 2024 09:48:07 GMT (568kb,D)
[v4] Thu, 4 Apr 2024 13:24:36 GMT (569kb,D)
