Curtis David
UCL Genetics Institute, UCL, London, United Kingdom.
Centre for Psychiatry, Queen Mary University London, London, United Kingdom.
Ann Hum Genet. 2019 Jul;83(4):274-277. doi: 10.1111/ahg.12302. Epub 2019 Mar 25.
A recent study claimed that genome-wide polygenic scores (GPSs) for five common diseases could identify individuals with risk equivalent to that conferred by monogenic mutations. Receiver operating characteristic (ROC) curve analyses were reported to have areas under the curve (AUCs) ranging from 0.63 for inflammatory bowel disease up to 0.81 for coronary artery disease (CAD), but these models also included age and sex, which are themselves strong predictors of risk. The GPS for CAD identified 8% of the population as being at threefold increased risk, which was claimed to be comparable to the excess risk from monogenic mutations.
In the present study, attempts were made to model the distribution of the GPS for CAD to match the information provided. These models were based on the reported distribution of prevalence by centile of GPS and on the distribution of GPS in controls and cases, and were fitted to the reported results using linear approximations to the distributions and using simulations of a liability-threshold model.
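As a rough illustration of what simulating a liability-threshold model can involve, the following Python sketch draws a normally distributed liability made up of a GPS component and a residual, labels individuals above a threshold as cases, and estimates the AUC of the GPS alone from ranks. It is not the code used in the study: the function name `simulate_gps_auc`, the parameter `r2` (share of liability variance explained by the GPS), and the prevalence and sample size used below are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def simulate_gps_auc(prevalence, r2, n=50_000, seed=1):
    """Simulate a liability-threshold model; return (AUC of GPS, observed prevalence).

    prevalence and r2 are hypothetical inputs, not values taken from the study.
    """
    rng = random.Random(seed)
    # Liability is standard normal; cases are those above this threshold.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    sd_gps = r2 ** 0.5
    sd_residual = (1.0 - r2) ** 0.5

    samples = []
    for _ in range(n):
        gps = rng.gauss(0.0, sd_gps)
        liability = gps + rng.gauss(0.0, sd_residual)
        samples.append((gps, liability > threshold))

    # Rank-based (Mann-Whitney) estimate of the AUC for the GPS alone.
    samples.sort(key=lambda s: s[0])
    n_cases = sum(1 for _, is_case in samples if is_case)
    n_controls = n - n_cases
    rank_sum = sum(
        rank for rank, (_, is_case) in enumerate(samples, start=1) if is_case
    )
    auc = (rank_sum - n_cases * (n_cases + 1) / 2) / (n_cases * n_controls)
    return auc, n_cases / n


if __name__ == "__main__":
    for r2 in (0.0, 0.05, 0.20):
        auc, prev = simulate_gps_auc(prevalence=0.05, r2=r2)
        print(f"r2={r2:.2f}  prevalence={prev:.3f}  AUC={auc:.3f}")
```

Varying `r2` until the simulated prevalence by GPS centile resembles the reported figures is one way such a model could be fitted, although the fitting procedure used in the study may well have differed.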
It was impossible to produce a compatible model in which the GPS produced an AUC as high as 0.81, and the most plausible estimate was that the true AUC was only 0.65. The reported distributions of the GPS in cases and controls overlap so much that they are not compatible with an AUC of 0.7 or higher.
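The link between overlapping distributions and AUC can be made concrete under a simplifying assumption that is not stated in the abstract: if the GPS is normally distributed with equal variance in cases and controls, then AUC = Φ(d/√2), where d is the difference in means in standard deviation units. The Python sketch below uses that standard binormal relationship; the function names are illustrative only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def separation_for_auc(auc):
    """Mean difference d (in SD units) implied by an AUC, assuming equal-variance normals."""
    return 2 ** 0.5 * _STD_NORMAL.inv_cdf(auc)


def auc_for_separation(d):
    """AUC implied by a standardized mean difference d under the same assumption."""
    return _STD_NORMAL.cdf(d / 2 ** 0.5)


def overlap_coefficient(d):
    """Shared area of two unit-variance normal densities whose means differ by d."""
    return 2.0 * _STD_NORMAL.cdf(-d / 2.0)


if __name__ == "__main__":
    for auc in (0.65, 0.70, 0.81):
        d = separation_for_auc(auc)
        print(f"AUC {auc:.2f}: d = {d:.2f} SD, overlap = {overlap_coefficient(d):.2f}")
```

Under this assumption, an AUC of 0.65 corresponds to case and control means roughly 0.5 standard deviations apart, whereas an AUC of 0.81 would require a separation of roughly 1.2 standard deviations, so heavily overlapping distributions point toward the lower figure.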
The AUC of the GPS for these diseases is modest. Furthermore, the literature robustly demonstrates that the true CAD risk associated with monogenic mutations is much higher than the threefold increase that is predicted by the GPS. Together, these findings cast doubt on the clinical utility of the GPS.