Patel Jay S, Kumar Krishna, Zai Ahad, Shin Daniel, Willis Lisa, Thyvalikakath Thankam P
Dental Informatics, Department of Cariology, Operative Dentistry and Dental Public Health, Indiana University School of Dentistry, Indianapolis, IN 46202, USA.
Health Informatics, Department of Health Services Administration and Policy, Temple University College of Public Health, Philadelphia, PA 19122, USA.
Diagnostics (Basel). 2023 Mar 8;13(6):1028. doi: 10.3390/diagnostics13061028.
To develop two automated computer algorithms to extract information from clinical notes, and to generate three cohorts of patients (disease improvement, disease progression, and no disease change) to track periodontal disease (PD) change over time using longitudinal electronic dental records (EDR).
We conducted a retrospective study of 28,908 patients who received a comprehensive oral evaluation between 1 January 2009 and 31 December 2014 at Indiana University School of Dentistry (IUSD) clinics. We used Python libraries, including Pandas, TensorFlow, and PyTorch, together with a natural language toolkit, to develop and test the computer algorithms. We tested performance against a manual review by generating a confusion matrix, from which we calculated precision, recall (sensitivity), specificity, and accuracy. Finally, we evaluated the density of longitudinal EDR data for the following follow-up times: (1) none; (2) up to 5 years; (3) >5 and ≤10 years; and (4) >10 and ≤15 years.
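The evaluation metrics named above all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the study's own code; the function name `confusion_metrics` and the counts passed to it are hypothetical and chosen only for illustration.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives found during manual review.
    """
    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # same quantity as sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts, not results from the study.
metrics = confusion_metrics(tp=90, fp=10, fn=5, tn=95)
```

Recall and sensitivity are two names for the same ratio, TP / (TP + FN), so the sketch computes it once and reports it under both keys.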
Thirty-four percent (n = 9954) of the study cohort had up to five years of follow-up visits, with an average of 2.78 visits with periodontal charting information. For clinician-documented diagnoses from clinical notes, 42% of patients (n = 5562) had at least two PD diagnoses, the minimum needed to determine disease change. In this cohort with clinician-documented diagnoses, 72% of patients (n = 3919) had no change in disease status between their first and last visits, 669 (13%) patients' disease progressed, and 589 (11%) patients' disease improved.
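The three-cohort assignment can be illustrated by comparing each patient's first and last documented diagnosis. The sketch below is only an assumption about how such a comparison might be coded: the ordinal `SEVERITY` scale, the diagnosis labels, the record layout, and the function name `classify_cohorts` are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ordinal severity scale; the study's actual categories may differ.
SEVERITY = {
    "healthy": 0,
    "gingivitis": 1,
    "mild periodontitis": 2,
    "moderate periodontitis": 3,
    "severe periodontitis": 4,
}


def classify_cohorts(records):
    """Assign each patient to 'improvement', 'progression', or 'no change'.

    records: iterable of (patient_id, visit_date, diagnosis) tuples.
    Patients with fewer than two diagnoses are skipped, since a change
    cannot be determined from a single visit.
    """
    by_patient = defaultdict(list)
    for patient_id, visit_date, diagnosis in records:
        by_patient[patient_id].append((visit_date, SEVERITY[diagnosis]))

    cohorts = {}
    for patient_id, visits in by_patient.items():
        if len(visits) < 2:
            continue
        visits.sort()  # chronological order by visit date
        first, last = visits[0][1], visits[-1][1]
        if last > first:
            cohorts[patient_id] = "progression"
        elif last < first:
            cohorts[patient_id] = "improvement"
        else:
            cohorts[patient_id] = "no change"
    return cohorts


# Hypothetical records for illustration only.
example = [
    ("A", date(2009, 3, 1), "gingivitis"),
    ("A", date(2012, 6, 1), "moderate periodontitis"),
    ("B", date(2010, 1, 15), "severe periodontitis"),
    ("B", date(2013, 9, 30), "mild periodontitis"),
    ("C", date(2011, 5, 5), "gingivitis"),
    ("C", date(2014, 2, 2), "gingivitis"),
    ("D", date(2012, 8, 8), "healthy"),
]
cohorts = classify_cohorts(example)
```

Only the first and last visits are compared here, matching the description above; intermediate visits are ignored in this sketch.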
This study demonstrated the feasibility of using longitudinal EDR data to track disease changes over a 15-year observation period. We provided detailed steps and computer algorithms to clean and preprocess the EDR data and generated three cohorts of patients. These cohorts can now be used to study clinical courses with artificial intelligence and machine learning methods.