Tele-FLM Technical Report

Li, Xiang; Yao, Yiqun; Jiang, Xin; Fang, Xuezhi; Wang, Chao; Liu, Xinzhang; Wang, Zihan; Zhao, Yu; Wang, Xin; Huang, Yuyao; Song, Shuangyong; Li, Yongxiang; Zhang, Zheng; Zhao, Bo; Sun, Aixin; Wang, Yequan; He, Zhongjiang; Wang, Zhongyuan; Li, Xuelong; Huang, Tiejun

(Submitted on 25 Apr 2024)

Abstract: Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications. However, there is a notable paucity of detailed, open-sourced methodologies for efficiently scaling LLMs beyond 50 billion parameters with minimal trial-and-error cost and computational resources. In this report, we introduce Tele-FLM (aka FLM-2), a 52B open-sourced multilingual large language model that features a stable, efficient pre-training paradigm and enhanced factual judgment capabilities. Tele-FLM demonstrates superior multilingual language modeling abilities, as measured by bits per byte (BPB) on textual corpora. Moreover, in both English and Chinese foundation model evaluations, it is comparable to strong open-sourced models that involve larger pre-training FLOPs, such as Llama2-70B and DeepSeek-67B. In addition to the model weights, we share the core designs, engineering practices, and training details, which we expect to benefit both the academic and industrial communities.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.16645 [cs.CL]
	(or arXiv:2404.16645v1 [cs.CL] for this version)

Submission history

From: Yequan Wang
[v1] Thu, 25 Apr 2024 14:34:47 GMT (1357kb,D)
