Computer Science > Computation and Language

Title: ChuXin: 1.6B Technical Report

Abstract: In this report, we present ChuXin, an entirely open-source language model with 1.6 billion parameters. Unlike the majority of works, which open-source only the model weights and architecture, we have made available everything needed to train the model, including the training data, the training process, and the evaluation code. Our goal is to empower and strengthen the open research community, fostering transparency and enabling a new wave of innovation in the field of language modeling. Furthermore, we extend the context length to 1M tokens through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. The weights for both models are available on Hugging Face for download and use.
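The abstract notes that the weights for both the base and long-context models can be downloaded from Hugging Face. Below is a minimal sketch of loading and sampling from such a checkpoint with the Transformers library; the repository ID used here is an assumption for illustration, not the authors' published name.

```python
# Minimal sketch: load a released checkpoint with Hugging Face Transformers.
# The repo ID "chuxin-llm/Chuxin-1.6B-Base" is hypothetical; check the authors'
# Hugging Face page for the actual model names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chuxin-llm/Chuxin-1.6B-Base"  # hypothetical repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate a short continuation from a simple prompt.
prompt = "Open-source language models matter because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```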
Comments: Technical Report
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2405.04828 [cs.CL]
  (or arXiv:2405.04828v1 [cs.CL] for this version)

Submission history

From: Yufan Jiang
[v1] Wed, 8 May 2024 05:54:44 GMT (110kb,D)
