Increasing the power to detect causal associations by combining genotypic and expression data in segregating populations.

Authors

Zhu Jun, Wiener Matthew C, Zhang Chunsheng, Fridman Arthur, Minch Eric, Lum Pek Y, Sachs Jeffrey R, Schadt Eric E

Affiliation

Rosetta Inpharmatics, Seattle, Washington, United States of America.

Publication information

PLoS Comput Biol. 2007 Apr 13;3(4):e69. doi: 10.1371/journal.pcbi.0030069. Epub 2007 Feb 27.

DOI: 10.1371/journal.pcbi.0030069

PMID: 17432931

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1851982/

Abstract

To dissect common human diseases such as obesity and diabetes, a systematic approach is needed to study how genes interact with one another, and with genetic and environmental factors, to determine clinical end points or disease phenotypes. Bayesian networks provide a convenient framework for extracting relationships from noisy data and are frequently applied to large-scale data to derive causal relationships among variables of interest. Given the complexity of molecular networks underlying common human disease traits, and the fact that biological networks can change depending on environmental conditions and genetic factors, large datasets, generally involving multiple perturbations (experiments), are required to reconstruct and reliably extract information from these networks. With limited resources, the balance of coverage of multiple perturbations and multiple subjects in a single perturbation needs to be considered in the experimental design. Increasing the number of experiments, or the number of subjects in an experiment, is an expensive and time-consuming way to improve network reconstruction. Integrating multiple types of data from existing subjects might be more efficient. For example, it has recently been demonstrated that combining genotypic and gene expression data in a segregating population leads to improved network reconstruction, which in turn may lead to better predictions of the effects of experimental perturbations on any given gene. Here we simulate data based on networks reconstructed from biological data collected in a segregating mouse population and quantify the improvement in network reconstruction achieved using genotypic and gene expression data, compared with reconstruction using gene expression data alone. 
We demonstrate that networks reconstructed using the combined genotypic and gene expression data achieve a level of reconstruction accuracy that exceeds networks reconstructed from expression data alone, and that fewer subjects may be required to achieve this superior reconstruction accuracy. We conclude that this integrative genomics approach to reconstructing networks not only leads to more predictive network models, but also may save time and money by decreasing the amount of data that must be generated under any given condition of interest to construct predictive network models.
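The intuition behind the gain from genotypic data can be shown with a toy simulation. This sketch is not the authors' Bayesian network procedure. It assumes a hypothetical three-variable chain (genotype -> gene A -> gene B) with made-up effect sizes and a simple partial-correlation check. Expression data alone yield a symmetric correlation between A and B, which cannot orient the edge. The genotype can only act upstream of expression, so the two candidate orientations give different conditional-independence patterns.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    resid_x = x - np.polyval(np.polyfit(z, x, 1), z)
    resid_y = y - np.polyval(np.polyfit(z, y, 1), z)
    return float(np.corrcoef(resid_x, resid_y)[0, 1])


def simulate_chain(n_subjects=2000, seed=0):
    """Simulate genotype -> gene_a -> gene_b (all effect sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    # Genotype at one locus in an F2-like cross: 0, 1, or 2 copies of an allele.
    genotype = rng.choice([0, 1, 2], size=n_subjects, p=[0.25, 0.5, 0.25]).astype(float)
    gene_a = 1.0 * genotype + rng.normal(size=n_subjects)  # locus perturbs gene A
    gene_b = 0.8 * gene_a + rng.normal(size=n_subjects)    # gene A drives gene B
    return genotype, gene_a, gene_b


genotype, gene_a, gene_b = simulate_chain()

# Expression alone: a symmetric association that cannot tell A -> B from B -> A.
expr_corr = float(np.corrcoef(gene_a, gene_b)[0, 1])

# With genotype: if A -> B, the genotype is independent of B once A is known,
# but it stays associated with A even after conditioning on B.
l_b_given_a = partial_corr(genotype, gene_b, gene_a)
l_a_given_b = partial_corr(genotype, gene_a, gene_b)

print(f"corr(A, B)            = {expr_corr:.3f}")
print(f"pcorr(L, B | A)       = {l_b_given_a:.3f}")
print(f"pcorr(L, A | B)       = {l_a_given_b:.3f}")
```

In this toy setting the partial correlation of genotype with gene B given gene A should be close to zero, while the partial correlation of genotype with gene A given gene B should remain clearly nonzero, which favors the orientation A -> B. Lowering `n_subjects` makes the contrast noisier, which loosely mirrors the abstract's point about subject numbers.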
