
Title: Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance

Abstract: When solving NLP tasks with limited labelled data, researchers can either use a general large language model without further updates, or use a small number of labelled examples to tune a specialised smaller model. In this work, we address the research gap of how many labelled samples the specialised small models require to outperform the general large models, while taking performance variance into consideration. By observing the behaviour of fine-tuning, instruction-tuning, prompting and in-context learning on 7 language models, we identify such performance break-even points across 8 representative text classification tasks of varying characteristics. We show that the specialised models often need only a few samples (on average $10 - 1000$) to be on par with or better than the general ones. At the same time, the number of required labels strongly depends on the dataset and task characteristics, being significantly lower on multi-class datasets (up to $100$) than on binary datasets (up to $5000$). When performance variance is taken into consideration, the number of required labels increases on average by $100 - 200\%$, and even by up to $1500\%$ in specific cases.
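
To make the break-even notion concrete, below is a minimal sketch in Python of how such a point can be read off a performance curve. The break_even_point helper and all scores in it are hypothetical illustrations, not code or data from the paper; the variance-aware variant is one possible reading of the abstract's idea, requiring the specialised model's mean score minus one standard deviation across seeds to clear the general model's score.

```python
import numpy as np

def break_even_point(sample_counts, specialised_runs, general_score, use_variance=False):
    """Return the smallest labelled-data budget at which the specialised
    model matches or beats the general model.

    sample_counts    -- 1-D array of labelled-data budgets, ascending
    specialised_runs -- 2-D array (budgets x random seeds) of test scores
                        for the specialised model
    general_score    -- scalar test score of the general LLM
    use_variance     -- if True, require mean - std to clear the bar,
                        a variance-aware reading of the break-even point
    """
    means = specialised_runs.mean(axis=1)
    stds = specialised_runs.std(axis=1)
    bar = means - stds if use_variance else means
    for n, score in zip(sample_counts, bar):
        if score >= general_score:
            return n
    return None  # specialised model never breaks even on this grid

# Hypothetical scores for illustration only, not taken from the paper.
counts = np.array([10, 50, 100, 500, 1000, 5000])
runs = np.array([
    [0.62, 0.58, 0.65],
    [0.71, 0.69, 0.73],
    [0.78, 0.74, 0.80],
    [0.83, 0.82, 0.84],
    [0.85, 0.84, 0.86],
    [0.86, 0.86, 0.87],
])
print(break_even_point(counts, runs, general_score=0.77))                     # 100
print(break_even_point(counts, runs, general_score=0.77, use_variance=True))  # 500
```

Note how accounting for run-to-run variance pushes the break-even point from 100 to 500 samples in this toy example, echoing the abstract's observation that the required label count grows substantially once variance is considered.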
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2402.12819 [cs.CL]
  (or arXiv:2402.12819v2 [cs.CL] for this version)

Submission history

From: Branislav Pecher
[v1] Tue, 20 Feb 2024 08:38:24 GMT (9705kb,D)
[v2] Fri, 26 Apr 2024 08:20:40 GMT (2745kb,D)
