School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA.
Department of Statistics, University of Florida, Gainesville, FL 32611, USA.
Biometrics. 2024 Oct 3;80(4). doi: 10.1093/biomtc/ujae109.
Ancestry-specific proteome-wide association studies (PWAS) based on genetically predicted protein expression can reveal complex disease etiology specific to certain ancestral groups. These studies require ancestry-specific models for protein expression as a function of SNP genotypes. To improve protein expression prediction in ancestral populations historically underrepresented in genomic studies, we propose a new penalized maximum likelihood estimator for fitting ancestry-specific joint protein quantitative trait loci models. Our estimator borrows information across ancestral groups while allowing for heterogeneous error variances and regression coefficients. We propose an alternative parameterization of our model that makes the objective function convex and the penalty scale-invariant. To improve computational efficiency, we propose an approximate version of our method and study its theoretical properties. Our method substantially improves protein expression prediction accuracy in individuals of African ancestry and, in a downstream PWAS analysis, leads to the discovery of multiple associations between protein expression and blood lipid traits in the African ancestry population.
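The abstract does not state which parameterization is used. As a minimal sketch only, the example below uses one common reparameterization for Gaussian regression with unknown error variance, theta = beta / sigma and rho = 1 / sigma, under which the negative log-likelihood is jointly convex. The function names, the simulated data, and the single-group setup are all assumptions for illustration and are not taken from the paper. The sketch numerically checks midpoint convexity and shows that rescaling the response by a constant c only shifts the objective by log(c) when rho is rescaled to rho / c, leaving theta unchanged.

```python
import math
import random


def neg_loglik(theta, rho, X, y):
    """Per-sample Gaussian negative log-likelihood, up to an additive constant,
    in the ASSUMED parameterization theta = beta / sigma, rho = 1 / sigma."""
    n = len(y)
    resid_sq = 0.0
    for row, yi in zip(X, y):
        fitted = sum(x * t for x, t in zip(row, theta))
        resid_sq += (rho * yi - fitted) ** 2
    return -math.log(rho) + resid_sq / (2 * n)


def midpoint_gap(a, b, X, y):
    """Average of the objective at a and b minus its value at their midpoint.
    This quantity is nonnegative for a convex function."""
    (ta, ra), (tb, rb) = a, b
    tm = [(u + v) / 2 for u, v in zip(ta, tb)]
    rm = (ra + rb) / 2
    avg = 0.5 * (neg_loglik(ta, ra, X, y) + neg_loglik(tb, rb, X, y))
    return avg - neg_loglik(tm, rm, X, y)


# Simulated (hypothetical) data: n samples, p SNP-like predictors.
rng = random.Random(0)
n, p = 50, 3
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]


def random_point():
    """Draw a random (theta, rho) pair with rho > 0."""
    return [rng.gauss(0, 1) for _ in range(p)], rng.uniform(0.1, 3.0)


# Convexity check at 200 random pairs of parameter values.
gaps = [midpoint_gap(random_point(), random_point(), X, y) for _ in range(200)]
min_gap = min(gaps)

# Scale check: multiplying y by c and dividing rho by c keeps theta fixed
# and changes the objective only by the constant log(c).
c = 10.0
theta0, rho0 = random_point()
y_scaled = [c * yi for yi in y]
shift = neg_loglik(theta0, rho0 / c, X, y_scaled) - neg_loglik(theta0, rho0, X, y)
```

Because theta is unaffected by the units of the response in this parameterization, a penalty placed on theta does not depend on those units, which is one way a penalty can be made scale-invariant. Whether the paper's estimator uses exactly this form is not confirmed by the abstract.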