Department of Psychiatry, University of Oxford, Oxford OX3 7JX, UK.
Department of Psychiatry, University of Oxford, Oxford OX3 7JX, UK; 4youandme, Seattle, WA 98121-1031, USA.
Int J Med Inform. 2022 Apr;160:104704. doi: 10.1016/j.ijmedinf.2022.104704. Epub 2022 Jan 24.
UK Biobank (UKB) is widely used to investigate mental health disorders and related exposures; however, its applicability and relevance in a clinical setting, and the assumptions this requires, have not been sufficiently or systematically investigated. Here, we present the first validation study using secondary care mental health data with linkage to UKB from Oxford - Clinical Record Interactive Search (CRIS). We focus on comparing demographic information, diagnostic outcomes, medication records and cognitive test results, and we describe the missing data and implied bias in both resources. We applied a natural language processing model to extract information embedded in unstructured text from clinical notes and attachments. Using a contingency table, we compared the demographic information recorded in UKB and CRIS. We calculated the positive predictive value (PPV, the proportion of detected positive cases that are true positives) for mental health diagnoses and relevant medication. Amongst the cohort of 854 subjects, the PPV for any mental health diagnosis was 41.6%, and the PPVs for dementia, depression, bipolar disorder and schizophrenia were 59.5%, 12.5%, 50.0% and 52.6%, respectively. Self-reported medication records in UKB had an overall PPV of 47.0%, and the prevalence of medicines frequently prescribed for each typical mental health disorder differed considerably from the information provided by CRIS. UKB is highly multimodal but has limited follow-up records, whereas CRIS offers a longitudinal, high-resolution clinical picture with more than ten years of observations. Linking the two datasets will reduce self-report bias and combine diverse modalities into a unified resource to facilitate more robust research in mental health.
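The PPV described in the abstract can be illustrated with a minimal sketch. It assumes that the CRIS clinical record is treated as the reference standard and that each participant carries one boolean flag per source; the function name, the variable names and the five example flags are hypothetical and are not taken from the study.

```python
def positive_predictive_value(reported, confirmed):
    """Return TP / (TP + FP), or None when no case is reported positive.

    reported  -- flags from the source being validated (assumed here: UKB)
    confirmed -- flags from the reference source (assumed here: CRIS)
    """
    pairs = list(zip(reported, confirmed))
    true_pos = sum(1 for r, c in pairs if r and c)
    false_pos = sum(1 for r, c in pairs if r and not c)
    flagged = true_pos + false_pos
    if flagged == 0:
        return None  # PPV is undefined without any reported positives
    return true_pos / flagged


# Made-up flags for five hypothetical participants (not study data).
ukb_flags = [True, True, True, False, False]
cris_flags = [True, False, True, True, False]

ppv = positive_predictive_value(ukb_flags, cris_flags)
print(f"PPV = {ppv:.3f}")
```

In this toy input, three participants are flagged in the validated source and two of them are confirmed by the reference, so the PPV is 2/3. The fourth participant (confirmed but not reported) does not enter the PPV at all; such cases affect sensitivity instead.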