Statistical methods for integrating multiple types of high-throughput data.

Authors

Xie Yang, Ahn Chul

Affiliation

Division of Biostatistics, Department of Clinical Sciences, The Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA.

Publication

Methods Mol Biol. 2010;620:511-29. doi: 10.1007/978-1-60761-580-4_19.

Abstract

Large-scale sequencing, copy number, mRNA, and protein data hold great promise for biomedical research while posing major challenges for data management and analysis. Integrating different types of high-throughput data from diverse sources can increase statistical power and provide deeper biological understanding. This chapter uses two biomedical research examples to illustrate the urgent need for reliable and robust methods for integrating heterogeneous data. We then review recently developed statistical methods for integrative analysis, covering both statistical inference and classification. Finally, we present useful publicly accessible databases and program code to facilitate integrative analysis in practice.
