We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: Predicting postoperative risks using large language models

Abstract: Predicting postoperative risk can inform effective care management & planning. We explored large language models (LLMs) in predicting postoperative risk through clinical texts using various tuning strategies. Records spanning 84,875 patients from Barnes Jewish Hospital (BJH) between 2018 & 2021, with a mean duration of follow-up based on the length of postoperative ICU stay less than 7 days, were utilized. Methods were replicated on the MIMIC-III dataset. Outcomes included 30-day mortality, pulmonary embolism (PE) & pneumonia. Three domain adaptation & finetuning strategies were implemented for three LLMs (BioGPT, ClinicalBERT & BioClinicalBERT): self-supervised objectives; incorporating labels with semi-supervised fine-tuning; & foundational modelling through multi-task learning. Model performance was compared using the AUROC & AUPRC for classification tasks & MSE & R2 for regression tasks. Cohort had a mean age of 56.9 (sd: 16.8) years; 50.3% male; 74% White. Pre-trained LLMs outperformed traditional word embeddings, with absolute maximal gains of 38.3% for AUROC & 14% for AUPRC. Adapting models through self-supervised finetuning further improved performance by 3.2% for AUROC & 1.5% for AUPRC Incorporating labels into the finetuning procedure further boosted performances, with semi-supervised finetuning improving by 1.8% for AUROC & 2% for AUPRC & foundational modelling improving by 3.6% for AUROC & 2.6% for AUPRC compared to self-supervised finetuning. Pre-trained clinical LLMs offer opportunities for postoperative risk predictions with unseen data, & further improvements from finetuning suggests benefits in adapting pre-trained models to note-specific perioperative use cases. Incorporating labels can further boost performance. The superior performance of foundational models suggests the potential of task-agnostic learning towards the generalizable LLMs in perioperative care.
Comments: Supplemental file available at: this https URL models publicly available at: this https URL AND this https URL
Subjects: Computation and Language (cs.CL)
ACM classes: J.3; I.2.7
Cite as: arXiv:2402.17493 [cs.CL]
  (or arXiv:2402.17493v4 [cs.CL] for this version)

Submission history

From: Charles Alba [view email]
[v1] Tue, 27 Feb 2024 13:18:00 GMT (1422kb)
[v2] Wed, 28 Feb 2024 05:51:15 GMT (1422kb)
[v3] Thu, 25 Apr 2024 05:04:00 GMT (600kb)
[v4] Sun, 5 May 2024 04:07:44 GMT (787kb)

Link back to: arXiv, form interface, contact.