Title: TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

Abstract: We introduce TableLLM, a robust large language model (LLM) with 13 billion parameters, purpose-built for tabular data manipulation tasks, whether the tables are embedded in documents or spreadsheets, and aimed at real-world office scenarios. We propose a distant supervision method for training that comprises two strategies: a reasoning process extension strategy, which helps LLMs learn reasoning patterns more effectively, and a cross-way validation strategy, which ensures the quality of the automatically generated data. To evaluate TableLLM, we crafted a benchmark covering both document and spreadsheet formats and constructed a well-organized evaluation pipeline capable of handling both scenarios. Thorough evaluations underscore the advantages of TableLLM over various existing general-purpose and tabular data-focused LLMs. We have publicly released the model checkpoint, source code, benchmarks, and a web application for user interaction. Our code and data are publicly available at this https URL
Comments: this https URL
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2403.19318 [cs.CL]
  (or arXiv:2403.19318v2 [cs.CL] for this version)

Submission history

From: Bohan Zhang
[v1] Thu, 28 Mar 2024 11:21:12 GMT (3244kb,D)
[v2] Mon, 1 Apr 2024 05:10:56 GMT (3244kb,D)