Janssen Research and Development Epidemiology, Titusville, New Jersey, United States of America.
Observational Health Data Sciences and Informatics (OHDSI), New York, New York, United States of America.
PLoS One. 2023 Feb 16;18(2):e0281929. doi: 10.1371/journal.pone.0281929. eCollection 2023.
Systemic lupus erythematosus (SLE) is a chronic autoimmune disease of unknown origin. The objective of this research was to develop phenotype algorithms for SLE suitable for use in epidemiological studies using empirical evidence from observational databases.
We used a process for empirically developing and evaluating phenotype algorithms for health conditions analyzed in observational research. The process began with a literature search to identify algorithms previously used for SLE. We then used a set of Observational Health Data Sciences and Informatics (OHDSI) open-source tools to refine and validate the algorithms. These included tools to discover codes for SLE that may have been missed in prior studies, and tools to detect possible low specificity and index date misclassification in the algorithms so that both could be corrected.
Using this process, we developed four algorithms: two for prevalent SLE and two for incident SLE. For both incident and prevalent cases, there is a more specific version and a more sensitive version. Each algorithm corrects for possible index date misclassification. After validation, the prevalent, specific algorithm had the highest positive predictive value estimate (89%), and the prevalent, sensitive algorithm had the highest sensitivity estimate (77%).
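To make the two validation metrics concrete, the following is a minimal sketch of how positive predictive value and sensitivity are computed from validation counts. It is not the OHDSI tooling used in the study; the class name, function names, and example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Counts from comparing an algorithm against a reference standard."""
    true_positives: int   # flagged by the algorithm and truly SLE
    false_positives: int  # flagged by the algorithm but not SLE
    false_negatives: int  # truly SLE but missed by the algorithm


def positive_predictive_value(counts: ValidationCounts) -> float:
    """Share of algorithm-flagged subjects who truly have the condition."""
    flagged = counts.true_positives + counts.false_positives
    if flagged == 0:
        raise ValueError("no subjects were flagged by the algorithm")
    return counts.true_positives / flagged


def sensitivity(counts: ValidationCounts) -> float:
    """Share of true cases that the algorithm flags."""
    true_cases = counts.true_positives + counts.false_negatives
    if true_cases == 0:
        raise ValueError("no true cases in the reference standard")
    return counts.true_positives / true_cases


# Hypothetical counts, not taken from the study.
example = ValidationCounts(true_positives=80, false_positives=20, false_negatives=20)
ppv_example = positive_predictive_value(example)
sensitivity_example = sensitivity(example)
```

A more specific algorithm typically trades false positives for false negatives, which raises positive predictive value and lowers sensitivity. That trade-off is why both a specific and a sensitive version are offered.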
We developed phenotype algorithms for SLE using a data-driven approach. The four final algorithms may be used directly in observational studies. Validation of these algorithms gives researchers added confidence that the algorithms select subjects correctly and makes it possible to apply quantitative bias analysis.
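As an illustration of how validation estimates can feed a quantitative bias analysis, the sketch below applies a simple, commonly used correction for outcome misclassification: the number of true cases is approximated as the observed count multiplied by positive predictive value and divided by sensitivity. This is an assumed, simplified approach and not necessarily the method the authors have in mind; the function name and input values are hypothetical.

```python
def corrected_case_count(observed_cases: float, ppv: float, sensitivity: float) -> float:
    """Estimate the true number of cases from an observed, algorithm-based count.

    observed_cases * ppv  -> flagged subjects who are true cases
    ... / sensitivity     -> scaled up to include true cases the algorithm missed
    """
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in (0, 1]")
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    return observed_cases * ppv / sensitivity


# Hypothetical inputs for illustration only; not results from the study.
estimated_true_cases = corrected_case_count(observed_cases=1000, ppv=0.9, sensitivity=0.75)
```

Both estimates must come from the same algorithm and a comparable population for the correction to be meaningful.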