Predicting postoperative risks using large language models

Xue, Bing; Alba, Charles; Abraham, Joanna; Kannampallil, Thomas; Lu, Chenyang

(Submitted on 27 Feb 2024 (v1), last revised 5 May 2024 (this version, v4))

Abstract: Predicting postoperative risk can inform effective care management and planning. We explored large language models (LLMs) for predicting postoperative risk from clinical texts using various tuning strategies. We used records spanning 84,875 patients from Barnes Jewish Hospital (BJH) between 2018 and 2021, with a mean follow-up duration, based on the length of postoperative ICU stay, of less than 7 days. Methods were replicated on the MIMIC-III dataset. Outcomes included 30-day mortality, pulmonary embolism (PE) and pneumonia. Three domain adaptation and finetuning strategies were implemented for three LLMs (BioGPT, ClinicalBERT and BioClinicalBERT): self-supervised objectives; incorporating labels with semi-supervised finetuning; and foundational modelling through multi-task learning. Model performance was compared using AUROC and AUPRC for classification tasks and MSE and R² for regression tasks. The cohort had a mean age of 56.9 (sd: 16.8) years; 50.3% were male and 74% were White. Pre-trained LLMs outperformed traditional word embeddings, with absolute maximal gains of 38.3% for AUROC and 14% for AUPRC. Adapting models through self-supervised finetuning further improved performance by 3.2% for AUROC and 1.5% for AUPRC. Incorporating labels into the finetuning procedure boosted performance further: compared to self-supervised finetuning, semi-supervised finetuning improved AUROC by 1.8% and AUPRC by 2%, and foundational modelling improved AUROC by 3.6% and AUPRC by 2.6%. Pre-trained clinical LLMs offer opportunities for postoperative risk prediction with unseen data, and the further improvements from finetuning suggest benefits in adapting pre-trained models to note-specific perioperative use cases. Incorporating labels can further boost performance. The superior performance of foundational models suggests the potential of task-agnostic learning towards generalizable LLMs in perioperative care.
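
The four evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' evaluation code: the function names and toy inputs are assumptions made for illustration, AUPRC is approximated here by step-wise average precision (one common estimate), and tied scores are not treated specially in that function.

```python
def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (an AUPRC estimate)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    # Rank cases from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each positive case
    return ap / total_pos


def mse(y_true, y_pred):
    """Mean squared error for a regression task."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical toy data: 1 = complication occurred, scores = predicted risk.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print("AUROC:", auroc(labels, scores))
    print("AUPRC (average precision):", average_precision(labels, scores))
    print("MSE:", mse([1, 2, 3], [1, 2, 4]))
    print("R2:", r2([1, 2, 3], [1, 2, 4]))
```

Read this way, an "absolute gain" in AUROC or AUPRC is the difference between two such scores expressed in percentage points, not a relative change.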

Comments:	Supplemental file available at: this https URL; models publicly available at: this https URL and this https URL
Subjects:	Computation and Language (cs.CL)
ACM classes:	J.3; I.2.7
Cite as:	arXiv:2402.17493 [cs.CL]
	(or arXiv:2402.17493v4 [cs.CL] for this version)

Submission history

From: Charles Alba [view email]
[v1] Tue, 27 Feb 2024 13:18:00 GMT (1422kb)
[v2] Wed, 28 Feb 2024 05:51:15 GMT (1422kb)
[v3] Thu, 25 Apr 2024 05:04:00 GMT (600kb)
[v4] Sun, 5 May 2024 04:07:44 GMT (787kb)
