Veiga Diogo F T, Vicente Fábio F R, Nicolás Marisa F, Vasconcelos Ana Tereza R
Laboratório Nacional de Computação Científica, Laboratório de Bioinformática, Av. Getúlio Vargas, 333, Petrópolis, Rio de Janeiro, Brasil.
BMC Microbiol. 2008 Jun 19;8:101. doi: 10.1186/1471-2180-8-101.
Little is known about bacterial transcriptional regulatory networks (TRNs). Even in Escherichia coli, the organism with the largest wet-lab-validated TRN, the known interactions cover only approximately 50% of the currently known transcription factors and ~25% of its genes. Of those interactions, only a small proportion describe the regulation of clinically relevant processes, such as drug resistance mechanisms.
We designed feed-forward (FF) and bi-fan (BF) motif predictors for E. coli using multi-layer perceptron artificial neural networks (ANNs). The motif predictors were trained on a large dataset of gene expression data; the collection of motifs was extracted from the E. coli TRN. Each network motif was mapped to a vector of correlations computed from the gene expression profiles of the elements in the motif. By combining network structural information with transcriptome data in this way, the FF and BF predictors classified motifs with a precision of 83% and 96%, respectively, and a recall of 86% and 97%, respectively. These results were obtained when motifs were represented using several types of correlation together, i.e., Pearson, Spearman, Kendall, and partial correlation. We then applied the best predictors to hypothesize new regulations for 16 operons involved in multidrug resistance (MDR) efflux pumps, which are considered a major bacterial mechanism for fighting antimicrobial agents. As a result, the motif predictors assigned new transcription factors to these MDR proteins, making them high-quality candidates for experimental testing.
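The mapping from a motif to a vector of correlations can be illustrated with a minimal sketch. It assumes a three-gene FF motif (X regulates Y and Z, Y regulates Z) and concatenates pairwise Pearson, Spearman, and Kendall correlations with first-order partial correlations. The function names, the feature ordering, and the toy expression profiles are illustrative assumptions, not the layout used in the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)


def partial(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b, controlling for c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def motif_vector(x, y, z):
    """Assumed feature layout: 3 pairs x 3 correlation types + 3 partials."""
    pairs = [(x, y), (x, z), (y, z)]
    vec = []
    for corr in (pearson, spearman, kendall):
        vec += [corr(a, b) for a, b in pairs]
    r_xy, r_xz, r_yz = vec[0], vec[1], vec[2]
    vec += [
        partial(r_xy, r_xz, r_yz),  # X-Y given Z
        partial(r_xz, r_xy, r_yz),  # X-Z given Y
        partial(r_yz, r_xy, r_xz),  # Y-Z given X
    ]
    return vec


# Toy expression profiles over five conditions (made-up numbers).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
z = [1.0, 3.0, 2.0, 5.0, 4.0]
vec = motif_vector(x, y, z)
print(len(vec))
```

A vector of this kind would serve as the input to a classifier such as a multi-layer perceptron. The Kendall variant here is tau-a, which does not correct for ties; a statistics library would typically provide a tie-corrected version.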
The motif predictors presented herein can be used to identify novel regulatory interactions from microarray data. Given a candidate motif, the BF predictor classifies it as a BF or not, and the FF predictor classifies it as an FF or not. This approach is useful for finding new "pieces" of the TRN when inspecting the regulation of a small set of operons. Furthermore, it shows that correlations of expression data can be used to discriminate between elements arranged in structural motifs and those in random sets of transcripts.
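As a rough illustration of how such a yes/no predictor might be scored, the sketch below computes precision and recall for a binary classifier that separates true motifs (label 1) from random transcript sets (label 0). The labels and predictions are made-up toy values, and the function name is hypothetical; this is not the evaluation code of the study.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = motif, 0 = random set)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: three known motifs followed by three random transcript sets.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]  # hypothetical predictor output
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Precision is the fraction of predicted motifs that are true motifs, and recall is the fraction of true motifs that the predictor recovers.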