Didden Eva-Maria, Lu Di, Hsi Andrew, Brand Monika, Hedlin Haley, Zamanian Roham T
Global Epidemiology, Rare Disease Epicenter, Actelion Pharmaceuticals Ltd, Janssen Pharmaceutical Company of Johnson & Johnson, Allschwil, Switzerland.
Quantitative Sciences Unit, Stanford University, Stanford, California, USA.
Pulm Circ. 2024 Feb 8;14(1):e12333. doi: 10.1002/pul2.12333. eCollection 2024 Jan.
Pulmonary arterial hypertension (PAH) is a rare subgroup of pulmonary hypertension (PH). Claims and administrative databases can be particularly important for research in rare diseases; however, there is a lack of validated algorithms to identify PAH patients using administrative codes. We aimed to measure the accuracy of code-based PAH algorithms against the true clinical diagnosis by right heart catheterization (RHC). This study evaluated algorithms in patients who were recorded in two linkable data assets: the Stanford Healthcare administrative electronic health record database and the Stanford Vera Moulton Wall Center clinical PH database (which records each patient's RHC diagnosis). We assessed the sensitivity and specificity achieved by 16 algorithms (six published). In total, 720 PH patients with linked data available were included and 558 (78%) of these were PAH patients. Algorithms consisting solely of a P(A)H-specific diagnostic code classed all or almost all PH patients as PAH (sensitivity >97%, specificity <12%) while multicomponent algorithms with well-defined temporal sequences of procedure, diagnosis and treatment codes achieved a better balance of sensitivity and specificity. Specificity increased and sensitivity decreased with increasing algorithm complexity. The best-performing algorithms, in terms of fewest misclassified patients, included multiple components (e.g., PH diagnosis, PAH treatment, continuous enrollment for ≥6 months before and ≥12 months following index date) and achieved sensitivities and specificities of around 95% and 38%, respectively. Our findings help researchers tailor their choice and design of code-based PAH algorithms to their research question and demonstrate the importance of including well-defined temporal components in the algorithms.
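As a rough illustration of the kind of evaluation described above, the following Python sketch applies one multicomponent rule (PH diagnosis, PAH treatment, continuous enrollment for ≥6 months before and ≥12 months after the index date) to a toy cohort and scores it against an RHC reference standard. It is not the study's implementation. The `Patient` fields, the choice of the first PH diagnosis as index date, the requirement that treatment falls on or after that date, the day-based approximation of months, and the four example patients are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Patient:
    # Hypothetical fields; real claims data would be derived from code lists.
    first_ph_diagnosis: Optional[date]   # first PH diagnostic code (assumed index date)
    first_pah_treatment: Optional[date]  # first PAH-specific treatment code
    enrolled_from: date                  # start of continuous enrollment
    enrolled_to: date                    # end of continuous enrollment
    rhc_confirmed_pah: bool              # reference standard (RHC diagnosis)


def classified_as_pah(p: Patient) -> bool:
    """Assumed rule: PH diagnosis, treatment on/after it, continuous enrollment."""
    if p.first_ph_diagnosis is None or p.first_pah_treatment is None:
        return False
    index = p.first_ph_diagnosis
    if p.first_pah_treatment < index:
        return False
    # Approximate >=6 months before and >=12 months after the index date in days.
    return (p.enrolled_from <= index - timedelta(days=182)
            and p.enrolled_to >= index + timedelta(days=365))


def sensitivity_specificity(patients):
    """Return (sensitivity, specificity) of the rule against the RHC diagnosis."""
    tp = sum(classified_as_pah(p) and p.rhc_confirmed_pah for p in patients)
    fn = sum(not classified_as_pah(p) and p.rhc_confirmed_pah for p in patients)
    tn = sum(not classified_as_pah(p) and not p.rhc_confirmed_pah for p in patients)
    fp = sum(classified_as_pah(p) and not p.rhc_confirmed_pah for p in patients)
    return tp / (tp + fn), tn / (tn + fp)


# Toy cohort (invented, not study data): one patient per confusion-matrix cell.
dx = date(2020, 1, 1)
cohort = [
    Patient(dx, date(2020, 2, 1), date(2019, 1, 1), date(2021, 12, 31), True),
    Patient(dx, None, date(2019, 1, 1), date(2021, 12, 31), True),
    Patient(dx, date(2020, 3, 1), date(2019, 1, 1), date(2021, 12, 31), False),
    Patient(dx, date(2020, 3, 1), date(2019, 11, 1), date(2021, 12, 31), False),
]

sens, spec = sensitivity_specificity(cohort)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Each added component in `classified_as_pah` can only remove patients from the positive class, which is consistent with the reported pattern of specificity rising and sensitivity falling as algorithm complexity increases.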