Northwestern University, Evanston, USA.
National Human Genome Research Institute, Bethesda, USA.
Sci Rep. 2023 Feb 3;13(1):1971. doi: 10.1038/s41598-023-27481-y.
The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable, rule-based phenotype algorithms in which natural language processing (NLP) components were added to improve the performance of existing algorithms that use electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned in three categories: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP improved or maintained precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can strengthen communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms are essential to support local customizations.
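As an illustration of the comparison described in the abstract, the following minimal sketch computes precision and recall for a rule-based phenotype with and without an added NLP component, scored against chart-review labels. Everything in it is an assumption made for illustration: the `Patient` fields, the asthma keyword rule, the crude negation check, and the four-patient cohort are hypothetical and are not taken from eMERGE's algorithms or data.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All fields are hypothetical; none come from the eMERGE algorithms.
    has_dx_code: bool        # structured EHR evidence, e.g. a diagnosis code
    note_text: str           # free-text clinical note
    chart_review_case: bool  # gold-standard label from manual chart review


def structured_only(p: Patient) -> bool:
    """Baseline rule: rely on structured EHR data alone."""
    return p.has_dx_code


def with_nlp(p: Patient) -> bool:
    """Baseline rule plus a toy NLP component (keyword match with a crude
    negation check). Real NLP pipelines are far more involved."""
    text = p.note_text.lower()
    mentioned = "asthma" in text and "no asthma" not in text
    return p.has_dx_code or mentioned


def precision_recall(patients, classify):
    """Return (precision, recall) of `classify` against chart review."""
    tp = sum(1 for p in patients if classify(p) and p.chart_review_case)
    fp = sum(1 for p in patients if classify(p) and not p.chart_review_case)
    fn = sum(1 for p in patients if not classify(p) and p.chart_review_case)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented toy cohort, for illustration only.
cohort = [
    Patient(True, "History of asthma, on albuterol.", True),
    Patient(False, "Asthma exacerbation last winter.", True),
    Patient(False, "No asthma or COPD.", False),
    Patient(True, "Code entered to rule out; no asthma.", False),
]

print("structured only:", precision_recall(cohort, structured_only))
print("with NLP:       ", precision_recall(cohort, with_nlp))
```

In this toy cohort the NLP component recovers a case documented only in free text, which raises recall; it cannot correct the false positive caused by the structured code, because the rule combines the two sources with a logical OR.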