
Novel implementation of conditional co-regulation by graph theory to derive co-expressed genes from microarray data.

Authors

Rawat Arun, Seifert Georg J, Deng Youping

Affiliation

University of Southern Mississippi, Hattiesburg, MS-39406, USA.

Publication

BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S7. doi: 10.1186/1471-2105-9-S9-S7.

DOI:10.1186/1471-2105-9-S9-S7
PMID:18793471
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2537558/
Abstract

BACKGROUND

Most existing transcriptional databases, such as the Comprehensive Systems-Biology Database (CSB.DB) and the Arabidopsis Microarray Database and Analysis Toolbox (GENEVESTIGATOR), help to identify shared biological roles (similar pathways and biosynthetic cycles) based on correlation. They use conventional methods such as Pearson correlation and Spearman rank correlation to calculate correlation among genes. However, not all genes are expressed in all conditions, and such genes are therefore excluded from these transcriptional databases, which consist of experiments performed under varied conditions. This leads to incomplete studies of co-regulation among groups of genes that might be linked to the same or related biosynthetic pathways.
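As an illustration of the two conventional measures named above, the following minimal sketch computes Pearson and Spearman rank correlation for a pair of expression profiles. The profiles `gene_a` and `gene_b` are made-up values, not data from CSB.DB or GENEVESTIGATOR, and the code does not reproduce either database's pipeline. It only shows how a pair that agrees perfectly in a subset of conditions can score poorly when all conditions are pooled.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when a profile is constant
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical profiles: gene_a is expressed only in the first four conditions.
gene_a = [2.0, 4.0, 6.0, 8.0, 0.1, 0.1, 0.1, 0.1]
gene_b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

print("Pearson, first four conditions:", pearson(gene_a[:4], gene_b[:4]))
print("Pearson, all conditions:       ", pearson(gene_a, gene_b))
print("Spearman, all conditions:      ", spearman(gene_a, gene_b))
```

In this toy case the two genes track each other exactly in the conditions where both are expressed, yet the pooled coefficients are negative, which is the kind of conditional relationship a global correlation can miss.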

RESULTS

We have implemented an alternative method based on graph theory that takes into consideration both the properties of microarray data and the biological assumption that conditional co-regulation is needed to mine a large transcriptional data bank. The algorithm calculates relationships among genes by converting discretized signals from the time series microarray data (AtGenExpress) to output strings. For any gene queried against our database, a 'score' is generated by matching its stored string against those of all other genes using a similarity index. Taking carbohydrate metabolism as a test case, we observed that genes known to be involved in similar functions and pathways generate a high 'score' with the queried gene. We were also able to recognize most of the randomly selected correlated pairs from Pearson correlation in CSB.DB and to generate a higher number of relationships that might be biologically important. One advantage of our method over previously described approaches is that it includes all genes regardless of their expression values, thereby highlighting important relationships absent from other contemporary databases.
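The string-matching idea described above can be sketched as follows. The abstract does not specify the discretization rule or the similarity index, so both are assumptions here: each step between consecutive time points is encoded as `U` (up), `D` (down) or `N` (no change) using a hypothetical `threshold`, and the score is the fraction of matching positions. The gene names and expression values are invented for illustration and do not come from AtGenExpress or ASIDB.

```python
def discretize(series, threshold=0.5):
    """Encode each step between consecutive time points as U, D or N.

    Assumed rule: a change larger than `threshold` is U, smaller than
    -`threshold` is D, anything in between is N.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)

def similarity(s1, s2):
    """Assumed similarity index: fraction of positions where the strings agree."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    if not s1:
        return 0.0
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)

def query(gene, strings):
    """Score the queried gene against every other stored string, best first."""
    target = strings[gene]
    scores = {g: similarity(target, s) for g, s in strings.items() if g != gene}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical time series for three genes (five time points each).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 3.1, 2.0],
    "geneB": [0.2, 1.5, 2.9, 3.0, 1.0],
    "geneC": [3.0, 2.0, 1.0, 1.0, 2.5],
}

# Strings are computed once and stored, then matched at query time.
strings = {gene: discretize(series) for gene, series in profiles.items()}

for other, score in query("geneA", strings):
    print(other, strings[other], score)
```

Because the encoding depends only on the direction of change between time points, a gene with low absolute expression (such as the hypothetical `geneB`) is scored like any other, which mirrors the stated aim of not excluding genes by expression value.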

CONCLUSION

Based on these promising results, we conclude that incorporating conditional co-regulation into the study of large expression data sets helps to identify novel relationships among genes. Another advantage of our approach is that, when mining expression data from various experiments, genes that are not expressed in all conditions or that have low expression values are not excluded, thereby giving a better overall picture. This addresses a known limitation of clustering methods, in which genes that are expressed in only a subset of conditions are omitted. Given the further scope for extracting information, ASIDB, which implements the approach described above, has been initiated as a model database. ASIDB is available at http://www.asidb.com.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a7/2537558/08300b4f7414/1471-2105-9-S9-S7-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a7/2537558/fac163ff6787/1471-2105-9-S9-S7-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a7/2537558/c738988fff5c/1471-2105-9-S9-S7-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a7/2537558/23f290b086ef/1471-2105-9-S9-S7-4.jpg
