Haploinsufficiency predictions without study bias.

Authors

Steinberg Julia, Honti Frantisek, Meader Stephen, Webber Caleb

Affiliations

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, UK; The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK.

MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, UK.

Publication information

Nucleic Acids Res. 2015 Sep 3;43(15):e101. doi: 10.1093/nar/gkv474. Epub 2015 May 22.

DOI:10.1093/nar/gkv474

PMID:26001969

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4551909/

Abstract

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied 'gold standard' haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.
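
The prioritization use case mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the gene and variant identifiers, and the score values are all hypothetical placeholders. It also assumes that a higher score means a gene is more likely to be haploinsufficient, and that a variant disrupting several genes is ranked by the highest-scoring gene it affects.

```python
from typing import Dict, List, Optional, Tuple


def prioritize_variants(
    variant_genes: Dict[str, List[str]],
    gene_scores: Dict[str, float],
) -> List[Tuple[str, Optional[float]]]:
    """Rank variants by the highest score among the genes each one disrupts.

    Assumption: a higher score means the gene is more likely haploinsufficient.
    Variants whose genes all lack a score are placed last with score None.
    """
    ranked = []
    for variant_id, genes in variant_genes.items():
        scores = [gene_scores[g] for g in genes if g in gene_scores]
        ranked.append((variant_id, max(scores) if scores else None))
    # Scored variants first (descending score), unscored variants at the end.
    ranked.sort(key=lambda item: (item[1] is None, -(item[1] or 0.0)))
    return ranked


# Placeholder identifiers and scores, invented for illustration only.
example_scores = {"GENE_A": 0.91, "GENE_B": 0.12, "GENE_C": 0.55}
example_variants = {
    "cnv_1": ["GENE_B", "GENE_C"],  # a copy number variant spanning two genes
    "indel_1": ["GENE_A"],
    "snv_1": ["GENE_X"],  # no score available for this gene
}
ranking = prioritize_variants(example_variants, example_scores)
```

Taking the maximum over disrupted genes is one possible design choice for multi-gene variants such as copy number variants; other aggregations (for example, a sum or a probabilistic combination) could be substituted depending on the analysis.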


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e654/4551909/a8d87e94a350/gkv474fig1.jpg

Similar articles

Haploinsufficiency predictions without study bias.

Nucleic Acids Res. 2015 Sep 3;43(15):e101. doi: 10.1093/nar/gkv474. Epub 2015 May 22.

HIPred: an integrative approach to predicting haploinsufficient genes.

Bioinformatics. 2017 Jun 15;33(12):1751-1757. doi: 10.1093/bioinformatics/btx028.

Characterising and predicting haploinsufficiency in the human genome.

PLoS Genet. 2010 Oct 14;6(10):e1001154. doi: 10.1371/journal.pgen.1001154.

Distinct epigenomic patterns are associated with haploinsufficiency and predict risk genes of developmental disorders.

Nat Commun. 2018 May 30;9(1):2138. doi: 10.1038/s41467-018-04552-7.

Deep multiple-instance learning accurately predicts gene haploinsufficiency and deletion pathogenicity.

bioRxiv. 2023 Oct 5:2023.08.29.555384. doi: 10.1101/2023.08.29.555384.

Characterization and prediction of haploinsufficiency using systems-level gene properties in yeast.

G3 (Bethesda). 2013 Nov 6;3(11):1965-77. doi: 10.1534/g3.113.008144.

A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records.

BMC Bioinformatics. 2014 Sep 24;15(1):315. doi: 10.1186/1471-2105-15-315.

Genetic analysis of ocular tumour-associated genes using large genomic datasets: insights into selection constraints and variant representation in the population.

BMJ Open Ophthalmol. 2024 Feb 21;9(1):e001565. doi: 10.1136/bmjophth-2023-001565.

Integrative analysis of genomic variants reveals new associations of candidate haploinsufficient genes with congenital heart disease.

PLoS Genet. 2021 Jul 29;17(7):e1009679. doi: 10.1371/journal.pgen.1009679. eCollection 2021 Jul.

Haploinsufficiency and the sex chromosomes from yeasts to humans.

BMC Biol. 2011 Feb 28;9:15. doi: 10.1186/1741-7007-9-15.

Cited by

Depletion of aneuploid cells is shaped by cell-to-cell interactions.

Cell Genom. 2025 Aug 13;5(8):100894. doi: 10.1016/j.xgen.2025.100894. Epub 2025 Jun 3.

MDVarP: modifier ~ disease-causing variant pairs predictor.

BioData Min. 2024 Oct 8;17(1):39. doi: 10.1186/s13040-024-00392-y.

Proteome-scale prediction of molecular mechanisms underlying dominant genetic diseases.

PLoS One. 2024 Aug 22;19(8):e0307312. doi: 10.1371/journal.pone.0307312. eCollection 2024.

Heterozygous female mice demonstrate mosaic NEXMIF expression, autism-like behaviors, and abnormalities in dendritic arborization and synaptogenesis.

Heliyon. 2024 Jan 24;10(3):e24703. doi: 10.1016/j.heliyon.2024.e24703. eCollection 2024 Feb 15.

Cancer genomes tolerate deleterious coding mutations through somatic copy number amplifications of wild-type regions.

Nat Commun. 2023 Jun 16;14(1):3594. doi: 10.1038/s41467-023-39313-8.

Buffering of genetic dominance by allele-specific protein complex assembly.

Sci Adv. 2023 Jun 2;9(22):eadf9845. doi: 10.1126/sciadv.adf9845. Epub 2023 May 31.

dbCNV: deleteriousness-based model to predict pathogenicity of copy number variations.

BMC Genomics. 2023 Mar 20;24(1):131. doi: 10.1186/s12864-023-09225-4.

Circular RNA repertoires are associated with evolutionarily young transposable elements.

Elife. 2021 Sep 20;10:e67991. doi: 10.7554/eLife.67991.

X-CNV: genome-wide prediction of the pathogenicity of copy number variations.

Genome Med. 2021 Aug 18;13(1):132. doi: 10.1186/s13073-021-00945-4.

and human gene essentiality estimations capture contrasting functional constraints.

NAR Genom Bioinform. 2021 Jul 13;3(3):lqab063. doi: 10.1093/nargab/lqab063. eCollection 2021 Sep.

References

A proteome-scale map of the human interactome network.

Cell. 2014 Nov 20;159(5):1212-1226. doi: 10.1016/j.cell.2014.10.050.

Synaptic, transcriptional and chromatin genes disrupted in autism.

Nature. 2014 Nov 13;515(7526):209-15. doi: 10.1038/nature13772. Epub 2014 Oct 29.

Phenotype ontologies and cross-species analysis for translational research.

PLoS Genet. 2014 Apr 3;10(4):e1004268. doi: 10.1371/journal.pgen.1004268. eCollection 2014 Apr.

Bias tradeoffs in the creation and analysis of protein-protein interaction networks.

J Proteomics. 2014 Apr 4;100:44-54. doi: 10.1016/j.jprot.2014.01.020. Epub 2014 Jan 27.

The role of de novo mutations in the genetics of autism spectrum disorders.

Nat Rev Genet. 2014 Feb;15(2):133-41. doi: 10.1038/nrg3585. Epub 2014 Jan 16.

The roles of FMRP-regulated genes in autism spectrum disorder: single- and multiple-hit genetic etiologies.

Am J Hum Genet. 2013 Nov 7;93(5):825-39. doi: 10.1016/j.ajhg.2013.09.013. Epub 2013 Oct 24.

Genic intolerance to functional variation and the interpretation of personal genomes.

PLoS Genet. 2013;9(8):e1003709. doi: 10.1371/journal.pgen.1003709. Epub 2013 Aug 22.

The Genotype-Tissue Expression (GTEx) project.

Nat Genet. 2013 Jun;45(6):580-5. doi: 10.1038/ng.2653.

From mouse to human: evolutionary genomics analysis of human orthologs of essential genes.

PLoS Genet. 2013 May;9(5):e1003484. doi: 10.1371/journal.pgen.1003484. Epub 2013 May 9.

Interpretation of genomic variants using a unified biological network approach.

PLoS Comput Biol. 2013;9(3):e1002886. doi: 10.1371/journal.pcbi.1002886. Epub 2013 Mar 7.
