Using Clinical Notes and Natural Language Processing for Automated HIV Risk Assessment.

Affiliations

Department of Biomedical Informatics, Columbia University, New York, NY.

Division of Infectious Diseases, Department of Medicine, Columbia University, New York, NY.

Publication information

J Acquir Immune Defic Syndr. 2018 Feb 1;77(2):160-166. doi: 10.1097/QAI.0000000000001580.

DOI:10.1097/QAI.0000000000001580
PMID:29084046
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5762388/
Abstract

OBJECTIVE

Universal HIV screening programs are costly, labor intensive, and often fail to identify high-risk individuals. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs. Although social and behavioral determinants of health are typically captured in narrative documentation, previous analyses have considered only structured EHR fields. We examined whether natural language processing (NLP) would improve predictive models of HIV diagnosis.

METHODS

The study cohort comprised 181 HIV-positive individuals who received care at New York Presbyterian Hospital before a confirmatory HIV diagnosis and 543 HIV-negative controls selected using propensity score matching. EHR data recorded before HIV diagnosis, including demographics, laboratory tests, diagnosis codes, and unstructured notes, were extracted for modeling. Three predictive models were developed using machine-learning algorithms: (1) a baseline model with only structured EHR data, (2) baseline plus NLP topics, and (3) baseline plus NLP clinical keywords.
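The "baseline plus NLP clinical keywords" design can be pictured as appending keyword indicators to a patient's structured features. The abstract does not say how keywords were extracted or encoded, so the following is only a minimal sketch under assumptions: whole-word, case-insensitive matching, one binary indicator per keyword, pooled over all of a patient's notes. The function names and the structured feature values are hypothetical.

```python
import re

# Keywords named in the abstract's Results section.
# The matching and encoding scheme below is an assumption, not the study's method.
KEYWORDS = ["msm", "unprotected", "hiv", "methamphetamine"]


def keyword_features(note_text, keywords=KEYWORDS):
    """Return one 0/1 indicator per keyword (whole-word, case-insensitive match)."""
    tokens = set(re.findall(r"[a-z0-9]+", note_text.lower()))
    return [1 if kw in tokens else 0 for kw in keywords]


def build_feature_vector(structured, notes):
    """Append keyword indicators, pooled over all notes, to structured features."""
    pooled = [0] * len(KEYWORDS)
    for note in notes:
        for i, hit in enumerate(keyword_features(note)):
            pooled[i] = max(pooled[i], hit)  # 1 if any note mentions the keyword
    return list(structured) + pooled


# Hypothetical patient: two structured features followed by two notes.
vector = build_feature_vector([34, 1], ["Denies methamphetamine use.", "MSM, unprotected."])
```

This sketch ignores negation ("denies methamphetamine use" still sets the indicator), which is one reason real clinical NLP pipelines typically go beyond plain keyword presence.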

RESULTS

Predictive models demonstrated a range of performance, with F measures of 0.59 for the baseline model, 0.63 for the baseline + NLP topic model, and 0.74 for the baseline + NLP keyword model. The baseline + NLP keyword model yielded the highest precision by incorporating keywords such as "msm," "unprotected," "hiv," and "methamphetamine," together with structured EHR data indicative of additional HIV risk factors.
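For readers unfamiliar with the metric, the F measure combines precision and recall into a single score. The abstract does not state which variant was used or report the underlying counts, so this sketch assumes the common balanced form (F1, `beta=1.0`); the function name and the example counts are hypothetical.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives.

    beta=1.0 gives the balanced F1 score (harmonic mean of precision and recall).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the study.
score = f_measure(tp=8, fp=2, fn=2)
```

Because the score is a harmonic mean, a model cannot reach a high F measure by maximizing precision or recall alone.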

CONCLUSIONS

NLP improved the predictive performance of automated HIV risk assessment by extracting terms in clinical text indicative of high-risk behavior. Future studies should explore more advanced techniques for extracting social and behavioral determinants from clinical text.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d579/5762388/4381a2201ba1/nihms914939f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d579/5762388/768bffd75762/nihms914939f1.jpg

Similar articles

1
Using Clinical Notes and Natural Language Processing for Automated HIV Risk Assessment.
J Acquir Immune Defic Syndr. 2018 Feb 1;77(2):160-166. doi: 10.1097/QAI.0000000000001580.
2
Augmented intelligence with natural language processing applied to electronic health records for identifying patients with non-alcoholic fatty liver disease at risk for disease progression.
Int J Med Inform. 2019 Sep;129:334-341. doi: 10.1016/j.ijmedinf.2019.06.028. Epub 2019 Jul 6.
3
Natural Language Processing of Clinical Notes to Identify Mental Illness and Substance Use Among People Living with HIV: Retrospective Cohort Study.
JMIR Med Inform. 2021 Mar 10;9(3):e23456. doi: 10.2196/23456.
4
Leveraging Natural Language Processing to Improve Electronic Health Record Suicide Risk Prediction for Veterans Health Administration Users.
J Clin Psychiatry. 2023 Jun 19;84(4):22m14568. doi: 10.4088/JCP.22m14568.
5
Challenges of Developing a Natural Language Processing Method With Electronic Health Records to Identify Persons With Chronic Mobility Disability.
Arch Phys Med Rehabil. 2020 Oct;101(10):1739-1746. doi: 10.1016/j.apmr.2020.04.024. Epub 2020 May 21.
6
Risk prediction using natural language processing of electronic mental health records in an inpatient forensic psychiatry setting.
J Biomed Inform. 2018 Oct;86:49-58. doi: 10.1016/j.jbi.2018.08.007. Epub 2018 Aug 14.
7
Automated feature selection of predictors in electronic medical records data.
Biometrics. 2019 Mar;75(1):268-277. doi: 10.1111/biom.12987. Epub 2019 Apr 2.
8
Identification of pancreatic cancer risk factors from clinical notes using natural language processing.
Pancreatology. 2024 Jun;24(4):572-578. doi: 10.1016/j.pan.2024.03.016. Epub 2024 Mar 26.
9
Natural language processing to identify lupus nephritis phenotype in electronic health records.
BMC Med Inform Decis Mak. 2024 Mar 3;22(Suppl 2):348. doi: 10.1186/s12911-024-02420-7.
10
The Growing Impact of Natural Language Processing in Healthcare and Public Health.
Inquiry. 2024 Jan-Dec;61:469580241290095. doi: 10.1177/00469580241290095.

Cited by

1
Frequent Missed Opportunities for Earlier HIV Diagnosis in a Routine Opt-out Testing Environment in Atlanta.
Open Forum Infect Dis. 2025 Aug 26;12(8):ofaf423. doi: 10.1093/ofid/ofaf423. eCollection 2025 Aug.
2
The status of machine learning in HIV testing in South Africa: a qualitative inquiry with stakeholders in Gauteng province.
Front Digit Health. 2025 Aug 1;7:1618781. doi: 10.3389/fdgth.2025.1618781. eCollection 2025.
3
Extracting circumstances of Covid-19 transmission from free text with large language models.
Nat Commun. 2025 Jul 1;16(1):5836. doi: 10.1038/s41467-025-60762-w.
4
Building models, building capacity: A review of participatory machine learning for HIV prevention.
PLOS Glob Public Health. 2025 Jun 4;5(6):e0003862. doi: 10.1371/journal.pgph.0003862. eCollection 2025.
5
Role of Artificial Intelligence and Personalized Medicine in Enhancing HIV Management and Treatment Outcomes.
Life (Basel). 2025 May 6;15(5):745. doi: 10.3390/life15050745.
6
Development of machine learning-based mpox surveillance models in a learning health system.
Sex Transm Infect. 2025 May 2. doi: 10.1136/sextrans-2024-056382.
7
HIV Risk Score and Prediction Model in the United States: A Scoping Review.
AIDS Behav. 2025 Apr 5. doi: 10.1007/s10461-025-04702-1.
8
Perspectives on Using Artificial Intelligence to Derive Social Determinants of Health Data From Medical Records in Canada: Large Multijurisdictional Qualitative Study.
J Med Internet Res. 2025 Mar 6;27:e52244. doi: 10.2196/52244.
9
Artificial intelligence and natural language processing for improved telemedicine: Before, during and after remote consultation.
Aten Primaria. 2025 Feb 15;57(8):103228. doi: 10.1016/j.aprim.2025.103228.
10
Using Machine Learning Techniques to Predict Viral Suppression Among People With HIV.
J Acquir Immune Defic Syndr. 2025 Mar 1;98(3):209-216. doi: 10.1097/QAI.0000000000003561.

References

1
Expanded HIV Testing Strategy Leveraging the Electronic Medical Record Uncovers Undiagnosed Infection Among Hospitalized Patients.
J Acquir Immune Defic Syndr. 2017 May 1;75(1):27-34. doi: 10.1097/QAI.0000000000001299.
2
Missed Opportunities for Repeat HIV Testing in Pregnancy: Implications for Elimination of Mother-to-Child Transmission in the United States.
AIDS Patient Care STDS. 2017 Jan;31(1):20-26. doi: 10.1089/apc.2016.0204. Epub 2016 Dec 12.
3
Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing.
Yearb Med Inform. 2016 Nov 10;(1):224-233. doi: 10.15265/IY-2016-017.
4
Identifying Areas for Improvement in the HIV Screening Process of a High-Prevalence Emergency Department.
AIDS Patient Care STDS. 2016 Jun;30(6):247-53. doi: 10.1089/apc.2016.0068.
5
Evaluating topic model interpretability from a primary care physician perspective.
Comput Methods Programs Biomed. 2016 Feb;124:67-75. doi: 10.1016/j.cmpb.2015.10.014. Epub 2015 Oct 30.
6
Evaluation of hidden HIV infections in an urban ED with a rapid HIV screening program.
Am J Emerg Med. 2016 Feb;34(2):180-4. doi: 10.1016/j.ajem.2015.10.002. Epub 2015 Oct 9.
7
Learning probabilistic phenotypes from heterogeneous EHR data.
J Biomed Inform. 2015 Dec;58:156-165. doi: 10.1016/j.jbi.2015.10.001. Epub 2015 Oct 14.
8
Identifying risk factors for heart disease over time: Overview of 2014 i2b2/UTHealth shared task Track 2.
J Biomed Inform. 2015 Dec;58 Suppl(Suppl):S67-S77. doi: 10.1016/j.jbi.2015.07.001. Epub 2015 Jul 22.
9
A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data.
Adv Bioinformatics. 2015;2015:198363. doi: 10.1155/2015/198363. Epub 2015 Jun 11.
10
Risk prediction for chronic kidney disease progression using heterogeneous electronic health record data and time series analysis.
J Am Med Inform Assoc. 2015 Jul;22(4):872-80. doi: 10.1093/jamia/ocv024. Epub 2015 Apr 20.