Pillai Malvika, Posada Jose, Gardner Rebecca M, Hernandez-Boussard Tina, Bannett Yair
Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, United States.
Computer Science Department, University of the North, Barranquilla 080020, Colombia.
J Am Med Inform Assoc. 2024 Apr 3;31(4):949-957. doi: 10.1093/jamia/ocae001.
To measure pediatrician adherence to evidence-based guidelines in the treatment of young children with attention-deficit/hyperactivity disorder (ADHD) in a diverse healthcare system using natural language processing (NLP) techniques.
We extracted structured and free-text data from the electronic health records (EHRs) of all office visits (2015-2019) by children aged 4-6 years who had at least one visit with an ICD-10 diagnosis of ADHD in a community-based primary healthcare network in California. Two pediatricians annotated the clinical notes of the first ADHD visit for 423 patients. Inter-annotator agreement (IAA) on whether a note recommended first-line behavioral treatment was high (F-measure = 0.89). Four pre-trained language models, including BioClinical Bidirectional Encoder Representations from Transformers (BioClinicalBERT), were used to identify behavioral treatment recommendations, with a 70/30 train/test split. For temporal validation, we deployed BioClinicalBERT on 1,020 unannotated notes from other ADHD visits and well-care visits; all positively classified notes (n = 53) and 5% of negatively classified notes (n = 50) were manually reviewed.
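As a rough illustration of the 70/30 train/test split described above, the following is a minimal sketch only. The abstract does not state whether the split was random, stratified, or seeded, so the shuffling strategy, the seed, the integer patient IDs, and the helper name `train_test_split_ids` are all assumptions made for this example.

```python
import random


def train_test_split_ids(ids, test_fraction=0.3, seed=0):
    """Randomly split identifiers into (train, test) lists.

    Hypothetical helper: a plain seeded shuffle, not the authors' procedure.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 423 annotated patients, represented here by placeholder integer IDs.
patient_ids = list(range(423))
train_ids, test_ids = train_test_split_ids(patient_ids)
print(len(train_ids), len(test_ids))  # 296 127
```

Under these assumptions, a 70/30 split of 423 patients would leave roughly 296 notes for training and 127 for testing.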
Of 423 patients, 313 (74%) were male; 298 (70%) were privately insured; 138 (33%) were White; 61 (14%) were Hispanic. The BioClinicalBERT model trained on the first ADHD visits achieved F1 = 0.76, precision = 0.81, recall = 0.72, and AUC = 0.81 [0.72-0.89]. Temporal validation achieved F1 = 0.77, precision = 0.68, and recall = 0.88. Fairness analysis revealed low model performance in publicly insured patients (F1 = 0.53).
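The reported F1 scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the figures given above; the function name `f1_score` is chosen for this example and does not refer to any particular library.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# First-visit test set: precision = 0.81, recall = 0.72
print(round(f1_score(0.81, 0.72), 2))  # 0.76

# Temporal validation: precision = 0.68, recall = 0.88
print(round(f1_score(0.68, 0.88), 2))  # 0.77
```

Both values match the F1 scores reported for the test set and for temporal validation.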
Deploying pre-trained language models on a variable set of clinical notes accurately captured pediatrician adherence to guidelines in the treatment of children with ADHD. Validating this approach in other patient populations is needed to achieve equitable measurement of quality of care at scale and improve clinical care for mental health conditions.