Large Language Models for Medical Forecasting -- Foresight 2

Kraljevic, Zeljko; Yeung, Joshua Au; Bean, Daniel; Teo, James; Dobson, Richard J.

arXiv:2412.10848 (cs)

[Submitted on 14 Dec 2024]

Abstract: Foresight 2 (FS2) is a large language model fine-tuned on hospital data for modelling patient timelines (GitHub 'removed for anon'). It can understand patients' clinical notes and predict SNOMED codes for a wide range of biomedical use cases, including diagnosis suggestions, risk forecasting, and procedure and medication recommendations. FS2 is trained on the free-text portion of the MIMIC-III dataset: biomedical concepts are first extracted from the notes, contextualised patient timelines are then built from them, and the model is fine-tuned on those timelines. The results show a significant improvement over the previous state of the art for next new biomedical concept prediction (precision/recall of 0.73/0.66 vs 0.52/0.32) and a similar improvement for next new disorder prediction specifically (precision/recall of 0.69/0.62 vs 0.46/0.25). Finally, on the task of risk forecasting, we compare our model to GPT-4-turbo (and a range of open-source biomedical LLMs) and show that FS2 performs significantly better (P@5 of 0.90 vs 0.65). This highlights the need to incorporate hospital data into LLMs and shows that small models can outperform much larger ones when fine-tuned on high-quality, specialised data.
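
The P@5 figure quoted above is a precision-at-k score. The abstract does not spell out the exact evaluation protocol, so the following is only a minimal sketch of the standard definition, assuming the model returns a ranked list of predicted concepts and that a set of concepts judged relevant is available for each patient. The function name `precision_at_k` and the placeholder concept labels are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k ranked predictions that are in the relevant set.

    Assumption: this is the textbook precision@k; the paper's own
    evaluation may differ in how relevance is judged.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    # Count predictions in the top k that were judged relevant.
    hits = sum(1 for concept in top_k if concept in relevant)
    # Divide by k (not by len(top_k)), so short lists are penalised.
    return hits / k


# Hypothetical example: placeholder labels, not real SNOMED codes.
predicted = ["concept_a", "concept_b", "concept_c", "concept_d", "concept_e"]
judged_relevant = {"concept_a", "concept_c", "concept_x"}
score = precision_at_k(predicted, judged_relevant, k=5)
print(f"P@5 = {score:.2f}")
```

Averaging this score over all evaluated patients would give a single dataset-level figure comparable in form to the numbers reported above.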

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2412.10848 [cs.CL]
	(or arXiv:2412.10848v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.10848

Submission history

From: Zeljko Kraljevic
[v1] Sat, 14 Dec 2024 14:45:28 UTC (678 KB)
