Novel implementation of conditional co-regulation by graph theory to derive co-expressed genes from microarray data.

Author information

Rawat Arun, Seifert Georg J, Deng Youping

Affiliation

University of Southern Mississippi, Hattiesburg, MS-39406, USA.

Publication information

BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S7. doi: 10.1186/1471-2105-9-S9-S7.

Abstract

BACKGROUND

Most existing transcriptional databases, such as the Comprehensive Systems-Biology Database (CSB.DB) and the Arabidopsis Microarray Database and Analysis Toolbox (GENEVESTIGATOR), help identify shared biological roles (similar pathways and biosynthetic cycles) based on correlation. They use conventional methods such as Pearson correlation and Spearman rank correlation to calculate correlation among genes. However, not all genes are expressed in all conditions, and this leads to their exclusion from these transcriptional databases, which consist of experiments performed under varied conditions. As a result, studies of co-regulation among groups of genes that might be linked to the same or related biosynthetic pathways are incomplete.
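For reference, the two conventional measures named in the background can be sketched in a few lines of Python. This is a textbook illustration only, not the code used by CSB.DB or GENEVESTIGATOR; the function names and the toy expression values are assumptions made for the example. The final lines illustrate the limitation the background describes: two hypothetical genes that track each other perfectly in the three conditions where the first gene is expressed correlate much less strongly when all six conditions are pooled.

```python
from math import sqrt
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson product-moment correlation of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


def ranks(values: List[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical genes: gene_x is expressed only in the first three conditions.
gene_x = [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
gene_y = [2.0, 4.0, 6.0, 5.0, 1.0, 3.0]

r_subset = pearson(gene_x[:3], gene_y[:3])  # conditions where gene_x is expressed
r_global = pearson(gene_x, gene_y)          # all six conditions pooled
```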

RESULTS

We have implemented an alternative method based on graph theory that takes into account both the properties of microarray data and the biological assumption that conditional co-regulation is needed to mine a large transcriptional data bank. The algorithm calculates relationships among genes by converting discretized signals from the time series microarray data (AtGenExpress) to output strings. For any gene queried against our database, a 'score' against every other gene is generated from a similarity index computed by matching the stored strings. Taking carbohydrate metabolism as a test case, we observed that genes known to be involved in similar functions and pathways generate a high 'score' with the queried gene. We were also able to recognize most of the randomly selected correlated pairs from Pearson correlation in CSB.DB and to generate a higher number of relationships that might be biologically important. One advantage of our method over previously described approaches is that it includes all genes regardless of their expression values, thereby highlighting important relationships absent from other contemporary databases.
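The string-encoding and scoring steps described above can be illustrated with a minimal Python sketch. The abstract does not specify the discretization rule, the alphabet, or the similarity index, so everything here is an assumption made for illustration: a three-letter alphabet (`U`, `D`, `N`), a fixed change threshold of 0.5, a similarity defined as the fraction of matching positions, and made-up gene names and values. The graph-theoretic part of the method is not modeled.

```python
from typing import Dict, List, Tuple


def discretize(signal: List[float], threshold: float = 0.5) -> str:
    """Encode the change between consecutive time points as a string.

    Assumed alphabet: U = up, D = down, N = no change beyond the threshold.
    """
    symbols = []
    for previous, current in zip(signal, signal[1:]):
        delta = current - previous
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)


def similarity(a: str, b: str) -> float:
    """Assumed similarity index: fraction of positions where the strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def score_query(query: str, encoded: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score the queried gene against every other stored gene, best first."""
    target = encoded[query]
    scores = [
        (gene, similarity(target, string))
        for gene, string in encoded.items()
        if gene != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy time-series profiles (hypothetical genes and values).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 3.1, 2.0],
    "geneB": [0.5, 1.6, 2.7, 2.6, 1.5],
    "geneC": [3.0, 2.0, 1.0, 1.1, 2.2],
}

# Strings are computed once and stored, then matched at query time.
encoded = {gene: discretize(values) for gene, values in profiles.items()}
ranking = score_query("geneA", encoded)
```

In this toy data, `geneA` and `geneB` rise and fall together and so receive the same string, while `geneC` moves in the opposite direction and matches `geneA` only at the one step where neither changes. Because scoring works on stored strings rather than raw intensities, a gene with low expression values can still be encoded and compared, which is the property the abstract emphasizes.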

CONCLUSION

Based on these promising results, we conclude that incorporating conditional co-regulation into the study of large expression data sets helps identify novel relationships among genes. Another advantage of our approach is that, when mining expression data from various experiments, genes that are not expressed in all conditions or that have low expression values are not excluded, which gives a better overall picture. This addresses a known limitation of clustering methods, in which genes expressed in only a subset of conditions are omitted. To allow further information to be extracted, ASIDB, a model database implementing the approach described above, has been initiated. ASIDB is available at http://www.asidb.com.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a7/2537558/08300b4f7414/1471-2105-9-S9-S7-1.jpg
