Title: STaR-GATE: Teaching Language Models to Ask Clarifying Questions

Abstract: When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained language model (the Questioner) and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching a language model to ask better questions leads to better personalized responses.
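The selection step described in the abstract (keeping questions that raise the probability of the Oracle's response) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Conversation` fields, the `select_for_finetuning` helper, the top-k-per-task rule, and the example scores are all assumptions made for illustration, and `oracle_logprob` stands in for whatever score the model assigns to the Oracle response given the conversation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    task_id: str           # persona-task prompt this conversation belongs to
    questions: tuple       # questions asked by the Questioner (hypothetical field)
    oracle_logprob: float  # assumed score: log p(Oracle response | task, conversation)


def select_for_finetuning(conversations, k=1):
    """Keep the k highest-scoring conversations per task.

    Top-k per task is an assumption of this sketch, not a detail
    stated in the abstract.
    """
    by_task = defaultdict(list)
    for conv in conversations:
        by_task[conv.task_id].append(conv)

    selected = []
    for task_id in sorted(by_task):
        # Higher log-probability means the questions made the Oracle
        # response more likely, so rank in descending order.
        ranked = sorted(by_task[task_id], key=lambda c: c.oracle_logprob, reverse=True)
        selected.extend(ranked[:k])
    return selected


# Made-up sampled conversations with made-up scores.
sampled = [
    Conversation("task-a", ("What tone do you prefer?",), -12.5),
    Conversation("task-a", ("Any dietary restrictions?",), -7.0),
    Conversation("task-b", ("Who is the audience?",), -3.2),
    Conversation("task-b", ("How long should it be?",), -9.9),
]
training_set = select_for_finetuning(sampled, k=1)
```

In an iterative setup, the selected conversations would serve as the finetuning data for the next round, after which new conversations are sampled from the updated Questioner and filtered again.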
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2403.19154 [cs.CL]
  (or arXiv:2403.19154v1 [cs.CL] for this version)

Submission history

From: Jan-Philipp Fränken
[v1] Thu, 28 Mar 2024 05:35:22 GMT (717kb,D)
[v2] Fri, 29 Mar 2024 05:15:12 GMT (717kb,D)
