Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.
Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA.
Hum Genet. 2022 Oct;141(10):1615-1627. doi: 10.1007/s00439-022-02450-z. Epub 2022 Mar 26.
Infertility is a major reproductive health issue that affects about 12% of women of reproductive age in the United States. Aneuploidy in eggs accounts for a significant proportion of early miscarriages and in vitro fertilization (IVF) failures. Recent studies have shown that genetic variants in several genes affect chromosome segregation fidelity and predispose women to a higher incidence of egg aneuploidy. However, the exact genetic causes of aneuploid egg production remain unclear, making it difficult to diagnose infertility based on individual genetic variants in the mother's genome. In this study, we evaluated machine learning-based classifiers for predicting embryonic aneuploidy risk in female IVF patients using whole-exome sequencing data. Using two exome datasets, we obtained areas under the receiver operating characteristic curve (AUC) of 0.77 and 0.68, respectively. High precision could be traded off for high specificity in classifying patients by selecting different prediction score cutoffs. For example, a strict prediction score cutoff of 0.7 identified 29% of patients as high-risk with 94% precision. In addition, we identified MCM5, FGGY, and DDX60L as potential aneuploidy risk genes that contribute the most to the predictive power of the model. These candidate genes and their molecular interaction partners are enriched for meiosis-related gene ontology categories and pathways, such as microtubule organizing center and DNA recombination. In summary, we demonstrate that sequencing data can be mined to predict patients' aneuploidy risk, thus improving clinical diagnosis. The candidate genes and pathways we identified are promising targets for future aneuploidy studies.
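To make the cutoff example above concrete, the following minimal Python sketch shows how a single prediction score cutoff determines both the fraction of patients flagged as high-risk and the precision among those flagged. It is an illustration only, not the study's pipeline: the function name `summarize_cutoff`, the use of `>=` for the comparison, and the toy scores and labels are all assumptions invented for this example, and the printed numbers are not the study's results.

```python
from typing import Sequence, Tuple


def summarize_cutoff(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Return (fraction flagged high-risk, precision among flagged).

    scores: classifier prediction scores in [0, 1], one per patient.
    labels: 1 if the patient is truly high-risk, 0 otherwise.
    A patient is flagged when score >= cutoff (an assumption; the
    abstract does not say whether the comparison is strict).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Keep the true labels of every patient whose score passes the cutoff.
    flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
    if not flagged:
        # No one is flagged, so precision is undefined.
        return 0.0, float("nan")
    fraction_flagged = len(flagged) / len(scores)
    precision = sum(flagged) / len(flagged)
    return fraction_flagged, precision


# Toy data, invented for illustration only.
toy_scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
toy_labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

for cutoff in (0.3, 0.5, 0.7):
    fraction, precision = summarize_cutoff(toy_scores, toy_labels, cutoff)
    print(f"cutoff={cutoff:.1f} flagged={fraction:.0%} precision={precision:.0%}")
```

On this toy data, raising the cutoff flags fewer patients while the precision among those flagged rises, which is the kind of trade-off the abstract describes for its strict cutoff of 0.7.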