Delirium Identification from Nursing Reports Using Large Language Models

Lataukset42

Verkkojulkaisu

Tiivistelmä

This study investigates large language models for delirium detection from nursing reports, comparing keyword matching, prompting, and finetuning. Using a manually labelled dataset from the University Hospital Freiburg, Germany, we tested Llama3 and Phi3 models. Both prompting and finetuning were effective, with finetuning Phi3 (3.8B) achieving the highest accuracy (90.24%) and AUROC (96.07%), significantly outperforming other methods.

item.page.okmtext