An integrative modular approach to systematically predict gene-phenotype associations.

Affiliation

Program in Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles CA 90089, USA.

Publication information

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S62. doi: 10.1186/1471-2105-11-S1-S62.

DOI: 10.1186/1471-2105-11-S1-S62

PMID: 20122238

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3009536/

Abstract

BACKGROUND

Complex human diseases are often caused by multiple mutations, each of which contributes only a minor effect to the disease phenotype. To study the basis for these complex phenotypes, we developed a network-based approach to identify coexpression modules specifically activated in particular phenotypes. We integrated these modules, protein-protein interaction data, Gene Ontology annotations, and our database of gene-phenotype associations derived from literature to predict novel human gene-phenotype associations. Our systematic predictions provide us with the opportunity to perform a global analysis of human gene pleiotropy and its underlying regulatory mechanisms.
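The abstract does not specify how coexpression modules are detected, so the following is only a minimal sketch of one generic way to group coexpressed genes: connect genes whose expression profiles have a Pearson correlation above a cutoff, then take connected components as modules. The gene names, the toy expression values, and the `threshold` parameter are all hypothetical, and this is not the authors' algorithm.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Return connected components (size > 1) of a thresholded correlation graph.

    `expression` maps a gene name to its expression values across samples.
    `threshold` is an assumed cutoff chosen for illustration only.
    """
    genes = sorted(expression)
    neighbors = {gene: set() for gene in genes}
    for a, b in combinations(genes, 2):
        if pearson(expression[a], expression[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first search collects every gene reachable from `gene`.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        if len(component) > 1:
            modules.append(sorted(component))
    return modules


# Hypothetical toy data: GENE_A and GENE_B rise together, GENE_C does not.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.1, 6.0, 8.2],
    "GENE_C": [4.0, 1.0, 3.5, 0.5],
}
modules = coexpression_modules(toy_expression)
print(modules)
```

In a phenotype-specific setting, one could run such a procedure separately on the datasets for each phenotype class and compare the resulting modules. That comparison step is likewise an assumption, not something the abstract describes.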

RESULTS

We applied this method to 338 microarray datasets covering 178 phenotype classes and identified 193,145 phenotype-specific coexpression modules. We trained random forest classifiers for each phenotype and predicted a total of 6,558 gene-phenotype associations. We showed that 40.9% of genes are pleiotropic, which suggests that pleiotropy is more prevalent than previously expected. We collected 77 ChIP-chip datasets studying 69 transcription factors binding over 16,000 targets under various phenotypic conditions. Using this data source, we confirmed that dynamic transcriptional regulation is an important force driving the formation of phenotype-specific gene modules.
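As a rough illustration of how a pleiotropy rate can be computed from a list of gene-phenotype associations, the sketch below counts a gene as pleiotropic when it is linked to at least two distinct phenotypes. That cutoff (`min_phenotypes=2`) is an assumption, since the abstract does not state the exact criterion, and the gene and phenotype names are hypothetical toy data rather than results from the study.

```python
from collections import defaultdict


def pleiotropic_fraction(associations, min_phenotypes=2):
    """Fraction of genes associated with at least `min_phenotypes` distinct phenotypes.

    `associations` is an iterable of (gene, phenotype) pairs; duplicates are ignored.
    The default cutoff of 2 is an assumed definition of pleiotropy.
    """
    phenotypes_by_gene = defaultdict(set)
    for gene, phenotype in associations:
        phenotypes_by_gene[gene].add(phenotype)
    if not phenotypes_by_gene:
        return 0.0
    pleiotropic = sum(
        1 for phenotypes in phenotypes_by_gene.values()
        if len(phenotypes) >= min_phenotypes
    )
    return pleiotropic / len(phenotypes_by_gene)


# Hypothetical toy associations, not data from the paper.
toy_associations = [
    ("GENE_A", "phenotype_1"),
    ("GENE_A", "phenotype_2"),
    ("GENE_B", "phenotype_1"),
    ("GENE_C", "phenotype_3"),
    ("GENE_C", "phenotype_3"),  # duplicate pair, counted once
    ("GENE_D", "phenotype_2"),
    ("GENE_D", "phenotype_3"),
]
fraction = pleiotropic_fraction(toy_associations)
print(f"{fraction:.1%} of genes are pleiotropic")
```

Here two of the four toy genes (GENE_A and GENE_D) meet the cutoff. The denominator is the set of genes with at least one association, which is also an assumption; using all genes in the genome would give a lower rate.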

CONCLUSION

We created a genome-wide gene-to-phenotype mapping with many potential applications, including suggesting new drug targets and uncovering the basis of human disease phenotypes. Our analysis of these phenotype-specific coexpression modules reveals a high prevalence of gene pleiotropy and suggests that phenotype-specific transcription factor binding may contribute to phenotypic diversity. All resources from our study are freely available in our online Phenotype Prediction Database.
