Janoudi Ghayath, Fell Deshayne B, Ray Joel G, Foster Angel M, Giffen Randy, Clifford Tammy J, Rodger Marc A, Smith Graeme N, Walker Mark C
Epidemiology and Public Health, University of Ottawa, Ottawa, CAN.
Clinical Epidemiology, Ottawa Hospital Research Institute, Ottawa, CAN.
Cureus. 2023 Mar 30;15(3):e36909. doi: 10.7759/cureus.36909. eCollection 2023 Mar.
Objectives: Clinical discoveries are heralded by the observation of unique and unusual clinical cases. The effort of identifying such cases rests on the shoulders of busy clinicians. We assess the feasibility and applicability of an augmented intelligence framework for accelerating the rate of clinical discovery in preeclampsia and hypertensive disorders of pregnancy, an area that has seen little change in its clinical management.
Methods: We conducted a retrospective exploratory outlier analysis of participants enrolled in the Folic Acid Clinical Trial (FACT, N=2,301) and the Ottawa and Kingston birth cohort (OaK, N=8,085). We applied two outlier analysis methods: extreme misclassification contextual outlier analysis and isolation forest point outlier analysis. The extreme misclassification approach is based on a random forest predictive model for the outcome of preeclampsia in FACT and hypertensive disorder of pregnancy in OaK. In this approach, we defined outliers as misclassified observations for which the model's confidence exceeded 90%. In the isolation forest approach, we defined outliers as observations with an average path length z score less than or equal to -3 or greater than or equal to 3. Content experts reviewed the identified outliers and determined whether each represented a potential novelty that could conceivably lead to a clinical discovery.
Results: Among the 2,301 participants in the FACT study, we identified 19 outliers using the isolation forest algorithm and 13 outliers using the random forest extreme misclassification approach; three (15.8%) and 10 (76.9%), respectively, were judged to be potential novelties. Among the 8,085 participants in the OaK study, we identified 172 outliers using the isolation forest algorithm and 98 outliers using the random forest extreme misclassification approach; four (2.3%) and 32 (32.7%), respectively, were potential novelties. Overall, the outlier analysis component of the augmented intelligence framework identified 302 outliers. Content experts, the human component of the framework, then reviewed these outliers and determined that 49 of the 302 represented potential novelties.
Conclusions: Augmented intelligence using extreme misclassification outlier analysis is a feasible and applicable approach for accelerating the rate of clinical discovery. The extreme misclassification contextual outlier approach yielded a higher proportion of potential novelties than the more traditional isolation forest point outlier approach, and this finding was consistent in both the clinical trial data and the real-world cohort data. Augmented intelligence through outlier analysis has the potential to speed up the identification of potential clinical discoveries. The approach can be replicated across clinical disciplines and could be embedded in electronic medical record systems to automatically flag outliers within clinical notes for review by clinical experts.
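The two outlier definitions in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the average path lengths (from a fitted isolation forest) and the predicted probabilities of the positive outcome (from a fitted random forest) have already been computed elsewhere. The function names, the use of the population standard deviation for the z score, and the reading of "confidence" as the predicted probability of the class opposite to the recorded outcome are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def path_length_outliers(avg_path_lengths, z_cut=3.0):
    """Return indices whose average path length z score is <= -z_cut or >= z_cut.

    Assumption: the z score uses the population standard deviation of the
    supplied path lengths; the abstract does not specify this detail.
    """
    mu = mean(avg_path_lengths)
    sigma = pstdev(avg_path_lengths)
    if sigma == 0:
        return []  # all path lengths identical, nothing can be flagged
    return [
        i for i, value in enumerate(avg_path_lengths)
        if abs((value - mu) / sigma) >= z_cut
    ]


def extreme_misclassifications(observed, prob_positive, confidence=0.90):
    """Return indices where the model contradicts the recorded outcome
    with a predicted probability above `confidence`.

    observed      -- recorded outcomes coded 0 or 1 (hypothetical coding)
    prob_positive -- model's predicted probability of outcome 1
    """
    flagged = []
    for i, (label, prob) in enumerate(zip(observed, prob_positive)):
        # Probability the model assigns to the class opposite the recorded one.
        prob_of_opposite = 1.0 - prob if label == 1 else prob
        if prob_of_opposite > confidence:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Toy values, not study data: 20 typical observations and one short path.
    lengths = [10.0] * 20 + [2.0]
    print(path_length_outliers(lengths))

    outcomes = [1, 0, 0, 1]
    probabilities = [0.05, 0.95, 0.50, 0.97]
    print(extreme_misclassifications(outcomes, probabilities))
```

In this sketch the flagged indices would be passed to content experts for review, which corresponds to the human component of the framework; the code covers only the automated flagging step.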