Title: I Have an Attention Bridge to Sell You: Generalization Capabilities of Modular Translation Architectures

Abstract: Modularity is a paradigm of machine translation with the potential to yield models that are large at training time and small during inference. Within this field of study, modular approaches, and in particular attention bridges, have been argued to improve the generalization capabilities of models by fostering language-independent representations. In the present paper, we study whether modularity affects translation quality, as well as how well modular architectures generalize across different evaluation scenarios. For a given computational budget, we find non-modular architectures to be always comparable or preferable to all modular designs we study.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2404.17918 [cs.CL]
  (or arXiv:2404.17918v2 [cs.CL] for this version)

Submission history

From: Timothee Mickus
[v1] Sat, 27 Apr 2024 14:10:51 GMT (7083kb,D)
[v2] Tue, 30 Apr 2024 05:53:10 GMT (7083kb,D)
