Milton Jacqueline N, Gordeuk Victor R, Taylor James G, Gladwin Mark T, Steinberg Martin H, Sebastiani Paola
Department of Biostatistics, Boston University School of Public Health, Boston, MA.
Circ Cardiovasc Genet. 2014 Apr;7(2):110-5. doi: 10.1161/CIRCGENETICS.113.000387. Epub 2014 Mar 1.
Fetal hemoglobin (HbF) is the major modifier of the clinical course of sickle cell anemia. Its levels are highly heritable, and its interpersonal variability is modulated in part by 3 quantitative trait loci that affect HbF gene expression. Genome-wide association studies have identified single-nucleotide polymorphisms (SNPs) in these quantitative trait loci that are highly associated with HbF but explain only 10% to 12% of the variance of HbF. Combining SNPs into a genetic risk score can help to explain a larger amount of the variability of HbF level, but the challenge of this approach is to select the optimal number of SNPs to be included in the genetic risk score.
We developed a collection of 14 models, each with a genetic risk score composed of a different number of SNPs, and used the ensemble of these models to predict HbF in patients with sickle cell anemia. The models were trained in 841 patients with sickle cell anemia and tested in 3 independent cohorts. The ensemble of 14 models explained 23.4% of the variability in HbF in the discovery cohort, and the correlation between predicted and observed HbF in the 3 independent cohorts ranged from 0.28 to 0.44. The models included SNPs in BCL11A, the HBS1L-MYB intergenic region, and the site of the HBB gene cluster, all quantitative trait loci previously associated with HbF.
An ensemble of 14 genetic risk models can predict HbF levels, with correlations between predicted and observed HbF of 0.28 to 0.44 in independent cohorts, and the approach may also prove useful in other applications.
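The idea of combining genetic risk scores built from different numbers of SNPs can be illustrated with a minimal sketch. The abstract does not specify how SNPs were weighted or ranked, or how the 14 models were combined, so everything below is an assumption made for illustration: the data are simulated, SNP weights are marginal regression slopes, SNPs are ranked by absolute marginal correlation, each model is a simple linear regression on its score, and the ensemble prediction is the unweighted mean of the model predictions. The names `fit_grs_model` and `predict_ensemble` are hypothetical.

```python
import numpy as np

def fit_grs_model(genotypes, phenotype, snp_idx, weights):
    """Fit phenotype ~ intercept + slope * GRS for one set of SNPs.

    The GRS here is a weighted sum of allele counts (an assumed definition).
    """
    grs = genotypes[:, snp_idx] @ weights[snp_idx]
    slope, intercept = np.polyfit(grs, phenotype, 1)
    return slope, intercept

def predict_ensemble(genotypes, models, weights):
    """Average the predictions of all models (assumed combination rule)."""
    preds = []
    for snp_idx, (slope, intercept) in models:
        grs = genotypes[:, snp_idx] @ weights[snp_idx]
        preds.append(intercept + slope * grs)
    return np.mean(preds, axis=0)

# Simulated data only: 20 SNPs, of which the first 5 affect the trait.
rng = np.random.default_rng(0)
n_train, n_test, n_snps = 841, 300, 20
true_effects = np.concatenate([rng.normal(0.5, 0.1, 5), np.zeros(15)])

def simulate(n):
    # Allele counts (0, 1, or 2) with an assumed allele frequency of 0.3.
    g = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
    y = g @ true_effects + rng.normal(0.0, 1.0, n)
    return g, y

g_train, y_train = simulate(n_train)
g_test, y_test = simulate(n_test)

# Per-SNP weights: marginal regression slopes estimated in the training set.
weights = np.array(
    [np.polyfit(g_train[:, j], y_train, 1)[0] for j in range(n_snps)]
)

# Rank SNPs by absolute marginal correlation with the trait.
marginal_r = np.array(
    [np.corrcoef(g_train[:, j], y_train)[0, 1] for j in range(n_snps)]
)
ranking = np.argsort(-np.abs(marginal_r))

# Build 14 nested models using the top 1, 2, ..., 14 SNPs.
models = []
for k in range(1, 15):
    idx = ranking[:k]
    models.append((idx, fit_grs_model(g_train, y_train, idx, weights)))

# Evaluate the ensemble on held-out data via predicted-vs-observed correlation.
pred = predict_ensemble(g_test, models, weights)
r = np.corrcoef(pred, y_test)[0, 1]
print(f"{len(models)} models, test correlation = {r:.2f}")
```

Averaging over models of different sizes avoids committing to a single number of SNPs, which is the selection problem the abstract describes. The correlation printed here reflects only the simulated effect sizes and says nothing about the reported HbF results.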