Bing Nan, Hoeschele Ina
Virginia Bioinformatics Institute and Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, 24061-0477, USA.
Genetics. 2005 Jun;170(2):533-42. doi: 10.1534/genetics.105.041103. Epub 2005 Mar 21.
Genetic analysis of gene expression in a segregating population, which is expression profiled and genotyped at DNA markers throughout the genome, can reveal regulatory networks of polymorphic genes. We propose an analysis strategy with several steps: (1) genome-wide QTL analysis of all expression profiles to identify eQTL confidence regions, followed by fine mapping of identified eQTL; (2) identification of regulatory candidate genes in each eQTL region; (3) correlation analysis of the expression profiles of the candidates in any eQTL region with the gene affected by the eQTL to reduce the number of candidates; (4) drawing directional links from retained regulatory candidate genes to genes affected by the eQTL and joining links to form networks; and (5) statistical validation and refinement of the inferred network structure. Here, we apply an initial implementation of this strategy to a segregating yeast population. In 65, 7, and 28% of the identified eQTL regions, a single candidate regulatory gene, no gene, or more than one gene was retained in step 3, respectively. Overall, 768 putative regulatory links were retained, 331 of which are the strongest candidate links, as they were retained in the expression correlation analysis and were located within or near an eQTL subregion identified by a multimarker analysis separating multiple linked QTL. One or several biological processes were statistically significantly overrepresented in independent network structures or in highly interconnected subnetworks. Most of the transcription factors found in the inferred network had a putative regulatory link to only one other gene or exhibited cis-regulation.
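Steps 3 and 4 of the strategy can be illustrated with a minimal sketch. It assumes Pearson correlation with a fixed absolute-value cutoff as the retention criterion; the abstract does not state the actual test or threshold. The gene names, expression values, the `min_abs_r` parameter, and the function names are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def retain_candidates(expr, target, candidates, min_abs_r=0.8):
    """Step 3 (sketch): keep candidates whose expression profile is
    correlated with the profile of the gene affected by the eQTL.
    The cutoff min_abs_r is an illustrative assumption."""
    return [g for g in candidates
            if abs(pearson(expr[g], expr[target])) >= min_abs_r]


def build_network(expr, eqtl_regions, min_abs_r=0.8):
    """Step 4 (sketch): draw a directed link from every retained
    candidate to the affected gene and join the links into a network,
    returned as {regulator: set of target genes}."""
    network = defaultdict(set)
    for target, candidates in eqtl_regions:
        for g in retain_candidates(expr, target, candidates, min_abs_r):
            network[g].add(target)
    return dict(network)


# Toy expression profiles across five segregants (made-up values).
expr = {
    "REG1": [1, 2, 3, 4, 5],
    "REG2": [5, 1, 4, 2, 3],
    "TGT_A": [2, 4, 6, 8, 11],
    "TGT_B": [10, 8, 6, 4, 1],
}

# Each entry: (gene affected by an eQTL, candidate regulators in that region).
eqtl_regions = [
    ("TGT_A", ["REG1", "REG2"]),
    ("TGT_B", ["REG1"]),
]

network = build_network(expr, eqtl_regions)
```

In this toy data, `REG1` is retained for both targets (strong positive and strong negative correlation) and `REG2` is dropped. A real analysis would replace the fixed cutoff with a significance test and follow with the validation described in step 5.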