Haploinsufficiency predictions without study bias.

Author information

Steinberg Julia, Honti Frantisek, Meader Stephen, Webber Caleb

Affiliations

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, UK; The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, UK.

Publication information

Nucleic Acids Res. 2015 Sep 3;43(15):e101. doi: 10.1093/nar/gkv474. Epub 2015 May 22.

Abstract

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied 'gold standard' haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e654/4551909/a8d87e94a350/gkv474fig1.jpg
