Integrative analysis of genetical genomics data incorporating network structures.

Authors

Gao Bin, Liu Xu, Li Hongzhe, Cui Yuehua

Affiliations

Department of Statistics and Probability, Michigan State University, East Lansing, Michigan.

Quantitative Sciences, Janssen Research & Development, LLC, Spring House, Pennsylvania.

Publication details

Biometrics. 2019 Dec;75(4):1063-1075. doi: 10.1111/biom.13072. Epub 2019 Apr 29.

DOI: 10.1111/biom.13072
PMID: 31009063
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6810723/
Abstract

In a living organism, tens of thousands of genes are expressed and interact with each other to achieve necessary cellular functions. Gene regulatory networks contain information on regulatory mechanisms and the functions of gene expressions. Thus, incorporating network structures, discerned either through biological experiments or statistical estimations, could potentially increase the selection and estimation accuracy of genes associated with a phenotype of interest. Here, we considered a gene selection problem using gene expression data and the graphical structures found in gene networks. Because gene expression measurements are intermediate phenotypes between a trait and its associated genes, we adopted an instrumental variable regression approach. We treated genetic variants as instrumental variables to address the endogeneity issue. We proposed a two-step estimation procedure. In the first step, we applied the LASSO algorithm to estimate the effects of genetic variants on gene expression measurements. In the second step, the projected expression measurements obtained from the first step were treated as input variables. A graph-constrained regularization method was adopted to improve the efficiency of gene selection and estimation. We theoretically showed the selection consistency of the estimation method and derived the bound of the estimates. Simulation and real data analyses were conducted to demonstrate the effectiveness of our method and to compare it with its counterparts.
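
The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the exact form of the graph-constrained penalty, so the sketch assumes an L1 penalty plus a quadratic penalty on a normalized graph Laplacian, solved as a lasso on augmented data. All function names (`lasso_cd`, `normalized_laplacian`, `graph_constrained_lasso`, `two_stage_fit`) and tuning parameters (`lam_stage1`, `lam1`, `lam2`) are hypothetical, and the coordinate-descent solver is a plain textbook version.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = np.asarray(y, dtype=float).copy()   # residual y - X b, with b = 0 at start
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                     # an all-zero column carries no signal
            r += X[:, j] * b[j]              # drop coordinate j from the fit
            rho = X[:, j] @ r
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]              # add the updated coordinate back
    return b

def normalized_laplacian(A):
    """Normalized Laplacian of an undirected graph with adjacency matrix A.

    Isolated nodes get a zero row and column.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])
    return np.diag((d > 0).astype(float)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

def graph_constrained_lasso(X, y, L, lam1, lam2, n_iter=200):
    """Minimize 0.5*||y - X b||^2 + lam1*||b||_1 + 0.5*lam2 * b' L b.

    Assumed penalty form. With L = S S', the quadratic term equals
    0.5 * ||sqrt(lam2) * S' b||^2, so it can be absorbed into extra data rows.
    """
    w, U = np.linalg.eigh(L)
    S = U * np.sqrt(np.clip(w, 0.0, None))   # S @ S.T equals L up to rounding
    X_aug = np.vstack([X, np.sqrt(lam2) * S.T])
    y_aug = np.concatenate([np.asarray(y, dtype=float), np.zeros(L.shape[0])])
    return lasso_cd(X_aug, y_aug, lam1, n_iter)

def two_stage_fit(Z, X, y, A, lam_stage1, lam1, lam2):
    """Z: genetic variants (instruments), X: gene expression, y: phenotype,
    A: adjacency matrix of the gene network."""
    # Step 1: lasso of each gene's expression on the genetic variants.
    Gamma = np.column_stack(
        [lasso_cd(Z, X[:, j], lam_stage1) for j in range(X.shape[1])]
    )
    X_hat = Z @ Gamma                        # projected expression measurements
    # Step 2: graph-constrained selection using the projected expressions.
    beta = graph_constrained_lasso(X_hat, y, normalized_laplacian(A), lam1, lam2)
    return beta, Gamma
```

In this sketch, the nonzero entries of `beta` returned by `two_stage_fit` would be read as the selected genes. In practice the tuning parameters would need to be chosen by a data-driven criterion such as cross-validation, which is omitted here.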
