Medical & Imaging Informatics, University of California Los Angeles, 924 Westwood Blvd Suite 420, Los Angeles, CA 90024, USA.
Department of Medicine, Division of Rheumatology, David Geffen School of Medicine, 10833 Le Conte Ave, Los Angeles, CA 90095, USA.
J Biomed Inform. 2022 Nov;135:104214. doi: 10.1016/j.jbi.2022.104214. Epub 2022 Oct 8.
To better understand the general challenges of implementing and adapting computational phenotyping approaches, we evaluated a Phenotype KnowledgeBase (PheKB) algorithm for rheumatoid arthritis (RA) on a University of California, Los Angeles (UCLA) patient population, with a focus on its performance on ambiguous cases. The algorithm was evaluated on a cohort of 4,766 patients, and rheumatologists reviewed the charts of 300 patients against accepted diagnostic guidelines. The evaluation revealed low sensitivity for specific subtypes of positive RA cases, suggesting that the features used for phenotyping need revision. Close examination of selected cases also showed that a substantial portion of patients had missing data, which highlights the need to treat data integrity as an integral part of phenotyping pipelines, and raised issues around the usability of various codes for distinguishing cases. We use patterns in the PheKB algorithm's errors to further illustrate important considerations when designing a phenotyping algorithm.
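As an illustration of how a phenotyping algorithm's output can be scored against chart review, the following minimal sketch computes sensitivity, specificity, and positive predictive value from paired labels. The function names and the toy labels are hypothetical and invented for this example; they are not the study's code or data, and the abstract does not state which metrics beyond sensitivity were reported.

```python
def confusion_counts(predicted, reference):
    """Count true/false positives and negatives for paired boolean labels.

    predicted: algorithm output (True = flagged as an RA case)
    reference: chart-review label (True = confirmed RA case)
    """
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            counts["tp"] += 1
        elif pred and not ref:
            counts["fp"] += 1
        elif not pred and ref:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def screening_metrics(counts):
    """Derive sensitivity, specificity, and PPV from confusion counts."""
    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    return {
        "sensitivity": safe_ratio(tp, tp + fn),  # share of true cases flagged
        "specificity": safe_ratio(tn, tn + fp),  # share of non-cases not flagged
        "ppv": safe_ratio(tp, tp + fp),          # share of flagged patients who are cases
    }


# Toy labels, invented for illustration only (NOT data from the study).
algorithm_flags = [True, True, False, False, True, False, False, True]
chart_review = [True, True, True, False, False, False, True, True]

counts = confusion_counts(algorithm_flags, chart_review)
metrics = screening_metrics(counts)
print(counts)
print(metrics)
```

Restricting both lists to the patients of one chart-reviewed subtype before calling `confusion_counts` would give a subtype-level sensitivity, which is the kind of breakdown the abstract describes.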