Li Shuangning, Ren Zhimei, Sabatti Chiara, Sesia Matteo
Department of Statistics, Harvard University, Stanford, CA 94305, USA.
Department of Statistics, University of Chicago, Chicago, IL 60637, USA.
Sankhya B (2008). 2022 Nov 15. doi: 10.1007/s13571-022-00297-y.
This paper presents and compares alternative transfer learning methods that can increase the power of conditional testing via knockoffs by leveraging prior information in external data sets collected from different populations or measuring related outcomes. The relevance of this methodology is explored in particular within the context of genome-wide association studies, where it can be helpful to address the pressing need for principled ways to suitably account for, and efficiently learn from, the genetic variation associated with diverse ancestries. Finally, we apply these methods to analyze several phenotypes in the UK Biobank data set, demonstrating that transfer learning helps knockoffs discover more associations in the data collected from minority populations, potentially opening the way to the development of more accurate polygenic risk scores.