Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA.
Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA.
Diabetes Care. 2023 Apr 1;46(4):794-800. doi: 10.2337/dc22-1833.
Automated algorithms to identify individuals with type 1 diabetes using electronic health records are increasingly used in biomedical research. It is not known whether the accuracy of these algorithms differs by self-reported race. We investigated whether polygenic scores improve identification of individuals with type 1 diabetes.
We investigated two large hospital-based biobanks (Mass General Brigham [MGB] and BioMe) and identified individuals with type 1 diabetes using an established automated algorithm. We performed medical record reviews to validate the diagnosis of type 1 diabetes. We implemented two published polygenic scores for type 1 diabetes (developed in individuals of European or African ancestry). We assessed the classification algorithm before and after incorporating polygenic scores.
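The abstract does not state how the polygenic scores were combined with the automated algorithm. As a minimal sketch only, assuming a score computed as a weighted sum of risk-allele dosages and a simple cutoff applied to algorithm-positive individuals, the idea could look like the following. The function names, the weights, and the `threshold` parameter are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    The weights stand in for published per-variant effect sizes; the
    values used in any call here are illustrative placeholders.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def refine_classification(algorithm_positive: bool, score: float, threshold: float) -> bool:
    """Keep an algorithm-assigned diagnosis only if the score meets a cutoff.

    ASSUMPTION: a hard threshold is one possible way to incorporate a
    polygenic score; the study's actual rule is not given in the abstract.
    """
    return algorithm_positive and score >= threshold


# Hypothetical individual: three variants with made-up weights.
score = polygenic_score([0, 1, 2], [0.5, 0.2, 0.1])
keep = refine_classification(True, score, threshold=0.3)
```

A hard cutoff is the simplest choice for illustration; a score could equally be entered as a covariate in a classification model.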
The automated algorithm was more likely to incorrectly assign a diagnosis of type 1 diabetes in self-reported non-White individuals than in self-reported White individuals (odds ratio 3.45; 95% CI 1.54-7.69; P = 0.0026). In the MGB Biobank, incorporating polygenic scores into the type 1 diabetes algorithm increased its positive predictive value from 70% to 97% for self-reported White individuals (meaning that 97% of those predicted to have type 1 diabetes indeed had type 1 diabetes) and from 53% to 100% for self-reported non-White individuals. Similar results were found in BioMe.
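The two summary statistics reported here can be illustrated with a short sketch. Positive predictive value is the share of algorithm-positive individuals confirmed on chart review, and an odds ratio with a Wald confidence interval can be derived from a 2x2 table. The counts below are hypothetical and are not the study's data, and the abstract does not say which method produced the reported odds ratio, so the Wald interval is an assumption made for illustration.

```python
import math
from typing import Tuple


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of algorithm-positive individuals with a confirmed diagnosis."""
    return true_pos / (true_pos + false_pos)


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a = misclassified in group 1, b = correctly classified in group 1,
    c = misclassified in group 2, d = correctly classified in group 2.
    All cells must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to show the arithmetic.
ppv = positive_predictive_value(true_pos=97, false_pos=3)
odds_ratio, lower, upper = odds_ratio_wald_ci(a=10, b=20, c=5, d=40)
```

With these made-up counts, 97 confirmed cases out of 100 algorithm-positive individuals gives a positive predictive value of 0.97, matching the interpretation given in the parenthetical above.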
Automated phenotyping algorithms may exacerbate health disparities because of an increased risk of misclassification of individuals from underrepresented populations. Polygenic scores may be used to improve the performance of phenotyping algorithms and potentially reduce this disparity.