Wei Waverly, Shao Junzhe, Lyu Rita Qiuran, Hemono Rebecca, Ma Xinwei, Giorgio Joseph, Zheng Zeyu, Ji Feng, Zhang Xiaoya, Katabaro Emmanuel, Mlowe Matilda, Sabasaba Amon, Lister Caroline, Shabani Siraji, Njau Prosper, McCoy Sandra I, Wang Jingshen
University of Southern California.
University of California, Berkeley.
Res Sq. 2025 May 8:rs.3.rs-6608559. doi: 10.21203/rs.3.rs-6608559/v1.
SUMMARY: Sustained engagement in HIV care and adherence to antiretroviral therapy (ART) are essential for achieving the UNAIDS "95-95-95" targets. Despite increased ART access, disengagement from care remains a significant issue, particularly in sub-Saharan Africa. Traditional machine learning (ML) models have shown moderate success in predicting care disengagement, a capability that would enable early intervention. We develop an enhanced large language model (LLM) fine-tuned with electronic medical records (EMRs) to predict people at risk of disengaging from HIV care in Tanzania and to provide interpretive insights into modifiable risk factors.

METHODS: We developed a novel AI model by enhancing a pre-trained LLM (LLaMA 3.1, an open-source pre-trained LLM released by Meta) using routinely collected EMRs from Tanzania's National HIV Care and Treatment Program from January 1, 2018, to June 30, 2023 (4,809,765 records for 261,192 people) to identify people at risk of disengaging from HIV care or developing adverse outcomes. Outcomes included risk of ART non-adherence, non-suppressed viral load, and loss to follow-up. Models were evaluated internally (Kagera region) and externally (Geita region), with performance compared against state-of-the-art ML models and zero-shot LLMs. Additionally, a team of HIV physicians in Tanzania assessed the LLM's predictions, along with the LLM-provided justifications, for a subset of patient records to evaluate their clinical relevance and reasoning.

FINDINGS: The enhanced LLMs consistently outperformed the supervised ML model and zero-shot LLMs across all outcomes in both internal and external validation datasets. When focusing on the 25% of people living with HIV (PLHIV) predicted as most likely to be lost to follow-up (LTFU), the model correctly identified 78% (2,515 of 3,224) of PLHIV genuinely at risk in internal validation and 73% (7,105 of 9,733) in external validation. Attention score analysis indicated that the enhanced LLM focused on keywords such as gaps in follow-up care and ART adherence. The human expert evaluation showed alignment between clinician assessments and the LLM's predictions in 65% of cases, with experts finding the model's justifications reasonable and clinically relevant in 92.3% of aligned cases.

INTERPRETATION: If implemented in HIV clinics, this LLM-based AI model could help allocate resources efficiently and deliver targeted interventions, improving retention in care and advancing the UNAIDS "95-95-95" targets. By functioning like a clinician (analyzing patient summaries, predicting risks, and offering reasoning), the enhanced LLM could be integrated into clinical workflows to complement human expertise, facilitating timely interventions and informed decision-making. If implemented widely, this human-AI collaboration has the potential to improve health outcomes for people living with HIV and reduce onward transmission.

FUNDING: The study was supported by a grant from the US National Institutes of Health (NIH): NIH NIMH 1R01MH125746.
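The top-25% figures reported in FINDINGS can be read as a capture rate: rank everyone by predicted LTFU risk, flag the highest-scoring quarter, and ask what share of the people who truly were lost to follow-up fall inside that flagged group. The abstract does not give the exact evaluation procedure, so the following Python sketch is only one plausible reading. The function name `recall_at_top_fraction`, its parameters, and the toy scores and labels are all illustrative assumptions, not the authors' code or data; only the counts 2,515 of 3,224 and 7,105 of 9,733 come from the abstract.

```python
from typing import Sequence


def recall_at_top_fraction(
    scores: Sequence[float],
    labels: Sequence[int],
    fraction: float = 0.25,
) -> float:
    """Share of true positives found among the top `fraction` of risk scores.

    Hypothetical helper for illustration: `scores` are predicted risks and
    `labels` are 1 for people who were actually lost to follow-up, else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("labels contain no positive cases")

    # Number of people flagged for follow-up (at least one).
    n_flagged = max(1, int(len(scores) * fraction))
    # Indices ordered from highest to lowest predicted risk.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    captured = sum(labels[i] for i in ranked[:n_flagged])
    return captured / total_positives


# Toy data (made up): 8 people, 3 of whom were truly lost to follow-up.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_recall = recall_at_top_fraction(toy_scores, toy_labels, fraction=0.25)

# Arithmetic check of the percentages quoted in the abstract.
internal_rate = 2515 / 3224
external_rate = 7105 / 9733

if __name__ == "__main__":
    print(f"toy recall in top 25%: {toy_recall:.3f}")
    print(f"internal validation: {internal_rate:.1%}")
    print(f"external validation: {external_rate:.1%}")
```

In the toy data, the top quarter contains two people, both truly lost to follow-up, out of three such people overall, so the capture rate is two thirds. A metric of this kind suits settings where a clinic can only follow up with a fixed share of its patients, because it measures how many at-risk people that limited outreach would reach.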