Tavares Vânia, Monteiro Joana, Vassos Evangelos, Coleman Jonathan, Prata Diana
Instituto de Biofísica e Engenharia Biomédica, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal.
Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal.
Genes (Basel). 2021 Sep 28;12(10):1531. doi: 10.3390/genes12101531.
Predicting gene expression from genotype data is valuable for studying tissues that are difficult to access, such as the brain. Here we present eGenScore, a polygenic/poly-variation method, and compare it with PrediXcan, a method based on elastic net regularized linear regression. Although both methods aim to predict gene expression from genotype, they differ in important methodological respects. We compared how well expression quantitative trait loci (eQTL) models predict gene expression in the frontal cortex across the two frameworks (eGenScore vs. PrediXcan) and two training datasets (BrainEAC, which is brain-specific, vs. GTEx, which covers multiple tissues). In addition to internal five-fold cross-validation, we externally validated the gene expression models using the CommonMind Consortium database. Our results showed that (1) PrediXcan outperforms eGenScore regardless of the training database used; and (2) with PrediXcan, eQTL models for the frontal cortex perform better when trained on GTEx than on BrainEAC.
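To make the elastic net and five-fold cross-validation idea in the abstract concrete, the following is a minimal sketch in plain Python. It is not the PrediXcan or eGenScore implementation, and it does not use BrainEAC, GTEx, or CommonMind data. Everything in it is an assumption made for illustration: the genotypes are simulated as 0/1/2 allele counts, the three "causal" variants and their effect sizes are invented, and the function names (`fit_elastic_net`, `cross_validated_r2`, `simulate`) and the penalty settings `alpha` and `l1_ratio` are hypothetical. The model is fitted by coordinate descent on a commonly used form of the elastic net objective, and performance is summarized as the R² between observed and out-of-fold predicted expression.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=50):
    """Coordinate descent for
    (1/2n)*||y - b0 - Xb||^2 + alpha*(l1_ratio*|b|_1 + 0.5*(1 - l1_ratio)*|b|_2^2).
    Returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered genotypes
    residual = [yi - y_mean for yi in y]  # residual for beta = 0
    beta = [0.0] * p
    z = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:  # monomorphic variant carries no information
                continue
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + z[j] * beta[j]
            new_b = soft_threshold(rho, alpha * l1_ratio) / (z[j] + alpha * (1 - l1_ratio))
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                beta[j] = new_b
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


def predict(model, X):
    intercept, beta = model
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(X, y, k=5, seed=0, **fit_args):
    """k-fold cross-validation: each sample is predicted by a model
    that was trained without it; R^2 is computed on those predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    y_pred = [0.0] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_elastic_net([X[i] for i in train], [y[i] for i in train], **fit_args)
        for i, y_hat in zip(fold, predict(model, [X[i] for i in fold])):
            y_pred[i] = y_hat
    return r_squared(y, y_pred)


def simulate(n=200, p=20, seed=1):
    """Toy data: p variants coded 0/1/2, three invented causal effects plus noise."""
    rng = random.Random(seed)
    effects = {0: 0.8, 5: -0.6, 12: 0.4}  # hypothetical effect sizes
    X = [[rng.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
    y = [sum(w * row[j] for j, w in effects.items()) + rng.gauss(0.0, 0.5) for row in X]
    return X, y, effects


X, y, effects = simulate()
intercept, beta = fit_elastic_net(X, y)
cv_r2 = cross_validated_r2(X, y, k=5)
print("five-fold cross-validated R^2:", round(cv_r2, 3))
```

In a sketch like this, the cross-validated R² plays the role of the internal performance measure. External validation, as described in the abstract, would correspond to calling `predict` with the fitted model on genotypes from an independent cohort and comparing against that cohort's measured expression.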