Evans William, Akyea Ralph K, Simms Alex, Kai Joe, Qureshi Nadeem
Primary Care Stratified Medicine (PRISM), Centre for Academic Primary Care, School of Medicine, University of Nottingham, Applied Health Research Building [42], University Park, Nottingham, NG7 2RD, UK.
Department of Cardiology, Leeds Teaching Hospital NHS Trust, Leeds, UK.
J Community Genet. 2024 Dec;15(6):687-698. doi: 10.1007/s12687-024-00742-7. Epub 2024 Oct 15.
Patients with rare genetic diseases frequently experience significant diagnostic delays. Routinely collected data in the electronic health record (EHR) may help identify patients at risk of undiagnosed conditions. Long QT syndrome (LQTS) is a rare inherited cardiac condition associated with significant morbidity and premature mortality. In this study, we use LQTS as an exemplar disease to assess whether clinical features recorded in the primary care EHR can be used to develop and validate a predictive model to aid earlier detection.
We identified 1495 patients with an LQTS diagnostic code and 7475 propensity-score-matched controls from the electronic primary care records of 10.5 million patients in the UK's Clinical Practice Research Datalink (CPRD). Clinical features recorded before diagnosis that were associated with LQTS (p < 0.05) were entered into a multivariable logistic regression model. The final model was selected by backward stepwise regression and validated by bootstrapping to estimate model optimism.
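The bootstrap validation step can be illustrated with a minimal sketch. This is not the authors' code: it assumes a Harrell-style optimism correction, in which the model is refitted on each bootstrap resample and the drop in AUC between the resample and the original data is averaged. The function names, the gradient-descent fitter, and the synthetic binary features are all hypothetical, and the sketch does not repeat backward selection inside each resample.

```python
import math
import random

def auc(scores, labels):
    """Mann-Whitney AUC: chance a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(X, y, steps=100, lr=0.5):
    """Plain gradient-descent logistic regression; returns [intercept, *coefs]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad[0] += err
            for j, xi in enumerate(row):
                grad[j + 1] += err * xi
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def predict(w, X):
    """Linear predictor; its ranking is all that AUC needs."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)) for row in X]

def optimism_corrected_auc(X, y, n_boot=30, seed=0):
    """Return (apparent AUC, mean optimism, optimism-corrected AUC)."""
    rng = random.Random(seed)
    apparent = auc(predict(fit_logistic(X, y), X), y)
    n = len(y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip resamples with only one class
            continue
        wb = fit_logistic(Xb, yb)
        # Optimism: AUC on the resample minus AUC of the same model on the original data.
        optimism.append(auc(predict(wb, Xb), yb) - auc(predict(wb, X), y))
    mean_optimism = sum(optimism) / n_boot
    return apparent, mean_optimism, apparent - mean_optimism

def make_demo_data(n=150, seed=1):
    """Synthetic data (assumption): three binary features, only the first is informative."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [float(rng.random() < 0.3) for _ in range(3)]
        risk = 0.7 if row[0] else 0.2
        X.append(row)
        y.append(1 if rng.random() < risk else 0)
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    apparent, optimism, corrected = optimism_corrected_auc(X, y)
    print(f"apparent={apparent:.3f} optimism={optimism:.3f} corrected={corrected:.3f}")
```

In a fuller analysis, every modelling step, including variable selection, would normally be repeated within each resample so that the optimism estimate reflects the whole procedure.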
The mean age at LQTS diagnosis was 58.4 years (SD 19.41). The final model included 18 features. Discriminative accuracy, assessed by area under the curve (AUC), was 0.74 (95% CI 0.73 to 0.75; optimism 6%). Features recorded significantly more often before diagnosis included epilepsy, palpitations, syncope, collapse, mitral valve disease and irritable bowel syndrome.
This study demonstrates the potential to develop prediction models for rare conditions such as LQTS from routine primary care records. It also highlights key considerations: the suitability of the disease, the availability of an appropriate linked dataset, the need for accurate case ascertainment, and the use of a modelling approach suited to rare events.