Zhang Yuping, Ouyang Zhengqing, Zhao Hongyu
Department of Statistics, Institute for Systems Genomics, Center for Quantitative Medicine, Institute for Collaboration on Health, Intervention, and Policy, The Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
The Jackson Laboratory for Genomic Medicine, Department of Biomedical Engineering, Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut, Farmington, Connecticut 06030, USA
Ann Appl Stat. 2017 Mar;11(1):161-184. doi: 10.1214/16-AOAS998. Epub 2017 Apr 8.
Recent advances in high-throughput biotechnologies have generated various types of genetic, genomic, epigenetic, transcriptomic and proteomic data across different biological conditions. Integrating data from diverse experiments is likely to yield a more unified and global view of biological systems and complex diseases. We present a coherent statistical framework for integrating various types of data from distinct but related biological conditions through graphical models. Specifically, the framework is designed to model multiple networks with shared regulatory mechanisms from heterogeneous high-dimensional datasets. We illustrate the performance of the approach through simulations and applications to cancer genomics.