We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: Octopus v4: Graph of language models

Abstract: Language models have been effective in a wide range of applications, yet the most sophisticated models are often proprietary. For example, GPT-4 by OpenAI and various models by Anthropic are expensive and consume substantial energy. In contrast, the open-source community has produced competitive models, like Llama3. Furthermore, niche-specific smaller language models, such as those tailored for legal, medical or financial tasks, have outperformed their proprietary counterparts. This paper introduces a novel approach that employs \textit{functional tokens} to integrate \textbf{multiple open-source models}, each optimized for particular tasks. Our newly developed Octopus v4 model leverages \textit{functional tokens} to intelligently direct user queries to the most appropriate vertical model and reformat the query to achieve the best performance. Octopus v4, an evolution of the Octopus v1, v2, and v3 models, excels in selection and parameter understanding and reformatting. Additionally, we explore the use of graph as a versatile data structure that effectively coordinates multiple open-source models by harnessing the capabilities of the Octopus model and \textit{functional tokens}. Use our open-sourced GitHub (\url{this https URL}) to try Octopus v4 models (\url{this https URL}), and contrite to a larger graph of language models. By activating models less than 10B parameters, we achieved SOTA MMLU score of 74.8 among the same level models.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2404.19296 [cs.CL]
  (or arXiv:2404.19296v1 [cs.CL] for this version)

Submission history

From: Wei Chen [view email]
[v1] Tue, 30 Apr 2024 06:55:45 GMT (3363kb,D)

Link back to: arXiv, form interface, contact.