Integrating human omics data to prioritize candidate genes.

Affiliations

Department of Automation, MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, TNLIST, Tsinghua University, Beijing 100084, China.

Publication information

BMC Med Genomics. 2013 Dec 18;6:57. doi: 10.1186/1755-8794-6-57.

DOI:10.1186/1755-8794-6-57

PMID:24344781

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3878333/

Abstract

BACKGROUND

The identification of genes involved in human complex diseases remains a great challenge in computational systems biology. Although methods have been developed to use disease phenotypic similarities with a protein-protein interaction network for the prioritization of candidate genes, other valuable omics data sources have been largely overlooked in these methods.

METHODS

With this understanding, we proposed a method called BRIDGE to prioritize candidate genes by integrating disease phenotypic similarities with such omics data as protein-protein interactions, gene sequence similarities, gene expression patterns, gene ontology annotations, and gene pathway memberships. BRIDGE utilizes a multiple regression model with lasso penalty to automatically weight different data sources and is capable of discovering genes associated with diseases whose genetic bases are completely unknown.
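The abstract does not give BRIDGE's exact formulation, so the following is only a minimal sketch of the general idea of a lasso-penalized regression that weights several data sources. It assumes each candidate gene has one numeric score per data source and some target value to regress on. The feature layout, the objective scaling, the coordinate-descent solver, the function names (`lasso_coordinate_descent`, `combined_score`) and all numbers are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch: lasso regression assigning a weight to each omics data source.
# All data and names here are hypothetical; this is not the BRIDGE code.

def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (the lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return w


def combined_score(source_scores, weights):
    """Weighted sum of per-source scores for one candidate gene."""
    return sum(s * w for s, w in zip(source_scores, weights))


# Hypothetical data sources, in the order listed in the abstract.
SOURCES = ["ppi", "sequence", "expression", "gene_ontology", "pathway"]

# Made-up per-source scores for six placeholder genes (rows) and a made-up target.
X_toy = [
    [0.9, 0.1, 0.4, 0.8, 0.2],
    [0.2, 0.3, 0.5, 0.1, 0.1],
    [0.7, 0.2, 0.1, 0.6, 0.3],
    [0.1, 0.8, 0.3, 0.2, 0.4],
    [0.5, 0.4, 0.6, 0.5, 0.2],
    [0.3, 0.1, 0.2, 0.3, 0.9],
]
y_toy = [0.95, 0.15, 0.70, 0.20, 0.55, 0.30]

weights = lasso_coordinate_descent(X_toy, y_toy, lam=0.01)
for name, weight in zip(SOURCES, weights):
    print(f"{name}: {weight:.3f}")
```

The L1 penalty can drive the weights of uninformative sources to exactly zero, which is one way a model of this kind can weight data sources automatically; raising `lam` makes the selection stricter.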

RESULTS

We conducted large-scale cross-validation experiments and demonstrated that BRIDGE ranks more than 60% of known disease genes first in simulated linkage intervals, suggesting the superior performance of this method. We further performed two comprehensive case studies, applying BRIDGE to predict novel genes and transcriptional networks involved in obesity and type II diabetes.
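The "ranked top one" figure can be read as the fraction of validation trials in which a held-out known disease gene receives the best score among the candidates in its interval. The sketch below illustrates that metric only. How intervals are simulated and how genes are scored is not described in the abstract, so each trial is assumed to be a held-out gene plus a dictionary of candidate scores, and the gene identifiers and scores are placeholders.

```python
# Toy sketch of a "ranked first" evaluation; all identifiers and scores are made up.

def rank_of(target, scores):
    """1-based rank of `target` by descending score.

    Ties are counted against the target (pessimistic ranking), an assumption
    made here so that tied scores do not inflate the hit rate.
    """
    target_score = scores[target]
    better_or_tied = sum(
        1 for gene, score in scores.items()
        if gene != target and score >= target_score
    )
    return 1 + better_or_tied


def top_one_fraction(trials):
    """Fraction of trials whose held-out gene is ranked first in its interval."""
    hits = sum(1 for held_out, scores in trials if rank_of(held_out, scores) == 1)
    return hits / len(trials)


# Each trial: (held-out known gene, scores of all candidates in the simulated interval).
toy_trials = [
    ("G1", {"G1": 0.91, "G2": 0.40, "G3": 0.12}),
    ("G4", {"G4": 0.35, "G5": 0.80, "G6": 0.10}),
    ("G7", {"G7": 0.66, "G8": 0.65, "G9": 0.20}),
]

print(f"top-one fraction: {top_one_fraction(toy_trials):.2f}")
```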

CONCLUSION

The proposed method provides an effective and scalable way to integrate multi-omics data to infer disease genes. Further applications of BRIDGE should help uncover novel disease genes and the mechanisms underlying human diseases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7d1/3878333/949e5ed87b73/1755-8794-6-57-2.jpg

Similar articles

Integrating human omics data to prioritize candidate genes.

BMC Med Genomics. 2013 Dec 18;6:57. doi: 10.1186/1755-8794-6-57.

Pinpointing disease genes through phenomic and genomic data fusion.

BMC Genomics. 2015;16 Suppl 2(Suppl 2):S3. doi: 10.1186/1471-2164-16-S2-S3. Epub 2015 Jan 21.

Meta-analysis approach identifies candidate genes and associated molecular networks for type-2 diabetes mellitus.

BMC Genomics. 2008 Jun 30;9:310. doi: 10.1186/1471-2164-9-310.

DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases.

BMC Syst Biol. 2011 Apr 19;5:55. doi: 10.1186/1752-0509-5-55.

Discovering cancer genes by integrating network and functional properties.

BMC Med Genomics. 2009 Sep 19;2:61. doi: 10.1186/1755-8794-2-61.

Integrating multiple protein-protein interaction networks to prioritize disease genes: a Bayesian regression approach.

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S11. doi: 10.1186/1471-2105-12-S1-S11.

Identifying disease-causal genes using Semantic Web-based representation of integrated genomic and phenomic knowledge.

J Biomed Inform. 2008 Oct;41(5):717-29. doi: 10.1016/j.jbi.2008.07.004. Epub 2008 Aug 23.

IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis.

BMC Bioinformatics. 2012;13 Suppl 15(Suppl 15):S7. doi: 10.1186/1471-2105-13-S15-S7. Epub 2012 Sep 11.

integRATE: a desirability-based data integration framework for the prioritization of candidate genes across heterogeneous omics and its application to preterm birth.

BMC Med Genomics. 2018 Nov 19;11(1):107. doi: 10.1186/s12920-018-0426-y.

Comparative analysis of protein interactome networks prioritizes candidate genes with cancer signatures.

Oncotarget. 2016 Nov 29;7(48):78841-78849. doi: 10.18632/oncotarget.12879.

Cited by

GDC: Integration of Multi-Omic and Phenotypic Resources to Unravel the Genetic Pathogenesis of Hearing Loss.

Adv Sci (Weinh). 2025 Aug;12(29):e2408891. doi: 10.1002/advs.202408891. Epub 2025 Mar 16.

Mowat-Wilson Syndrome: Case Report and Review of Gene Variant Types, Protein Defects and Molecular Interactions.

Int J Mol Sci. 2024 Feb 29;25(5):2838. doi: 10.3390/ijms25052838.

Potential Schizophrenia Disease-Related Genes Prediction Using Metagraph Representations Based on a Protein-Protein Interaction Keyword Network: Framework Development and Validation.

JMIR Form Res. 2023 Nov 15;7:e50998. doi: 10.2196/50998.

An Expectation-Maximization Algorithm for Combining a Sample of Partially Overlapping Covariance Matrices.

Axioms. 2023 Feb;12(2). doi: 10.3390/axioms12020161. Epub 2023 Feb 4.

HetIG-PreDiG: A Heterogeneous Integrated Graph Model for Predicting Human Disease Genes based on gene expression.

PLoS One. 2023 Feb 15;18(2):e0280839. doi: 10.1371/journal.pone.0280839. eCollection 2023.

Graph Embedding Based Novel Gene Discovery Associated With Diabetes Mellitus.

Front Genet. 2021 Nov 25;12:779186. doi: 10.3389/fgene.2021.779186. eCollection 2021.

Evaluation and comparison of multi-omics data integration methods for cancer subtyping.

PLoS Comput Biol. 2021 Aug 12;17(8):e1009224. doi: 10.1371/journal.pcbi.1009224. eCollection 2021 Aug.

Multidimensional molecular measurements-environment interaction analysis for disease outcomes.

Biometrics. 2022 Dec;78(4):1542-1554. doi: 10.1111/biom.13526. Epub 2021 Aug 1.

Horizontal and vertical integrative analysis methods for mental disorders omics data.

Sci Rep. 2019 Sep 17;9(1):13430. doi: 10.1038/s41598-019-49718-5.

A Survey of Gene Prioritization Tools for Mendelian and Complex Human Diseases.

J Integr Bioinform. 2019 Sep 9;16(4):20180069. doi: 10.1515/jib-2018-0069.
