Motif-guided sparse decomposition of gene expression data for regulatory module identification.

Affiliation

Bradley Department of Electrical and Computer Engineering, Virginia Tech, Arlington, VA 22203, USA.

Publication information

BMC Bioinformatics. 2011 Mar 22;12:82. doi: 10.1186/1471-2105-12-82.

Abstract

BACKGROUND

Genes act in coordination as gene modules or gene networks. Various computational approaches have been proposed to find gene modules from gene expression data; for example, gene clustering is a popular method for grouping genes with similar expression patterns. However, traditional gene clustering often yields unsatisfactory results for regulatory module identification because the resulting gene clusters are co-expressed but not necessarily co-regulated.

RESULTS

We propose a novel approach, motif-guided sparse decomposition (mSD), that identifies gene regulatory modules by integrating gene expression data with DNA sequence motif information. The mSD approach is implemented as a two-step algorithm that estimates (1) transcription factor activity and (2) the strength of the predicted gene regulation event(s). Specifically, a motif-guided clustering method first estimates the transcription factor activity of a gene module; sparse component analysis is then applied to estimate the regulation strength and thereby predict the target genes of the transcription factors. We first tested the mSD approach on simulated and real yeast data, where it showed improved performance in finding regulatory modules and revealed functionally distinct gene modules enriched with biologically validated transcription factors. We then demonstrated the efficacy of the mSD approach on breast cancer cell line data and uncovered several important gene regulatory modules related to endocrine therapy of breast cancer.
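
The two-step idea can be illustrated with a small toy sketch. This is not the authors' implementation: the function names `estimate_tf_activity` and `sparse_regulation_strength`, the parameters `balance` and `alpha`, the weighting rule in step 1, and the ISTA lasso solver in step 2 are all assumptions chosen for illustration, standing in for the paper's motif-guided clustering and sparse component analysis. The synthetic data are made up as well.

```python
import numpy as np

def soft_threshold(Z, t):
    """Element-wise soft-thresholding, the proximal step of an L1 penalty."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def estimate_tf_activity(X, motif, balance=0.5, n_iter=5):
    """Step 1 (assumed form): one activity profile per transcription factor.

    X      : genes x samples expression matrix
    motif  : genes x TFs matrix, 1 where a gene carries the TF's motif
    balance: 1.0 trusts motifs only; lower values give more weight to
             co-expression with the current activity estimate
    """
    X = X - X.mean(axis=1, keepdims=True)
    row_norm = np.linalg.norm(X, axis=1) + 1e-12
    A = np.zeros((motif.shape[1], X.shape[1]))
    for k in range(motif.shape[1]):
        w = motif[:, k].astype(float)
        for _ in range(n_iter):
            a = w @ X / (w.sum() + 1e-12)            # weighted module mean
            corr = (X @ a) / (row_norm * (np.linalg.norm(a) + 1e-12))
            # Re-weight motif-bearing genes by how well they follow the module.
            w = motif[:, k] * (balance + (1.0 - balance) * np.maximum(corr, 0.0))
        A[k] = a / (np.linalg.norm(a) + 1e-12)
    return A

def sparse_regulation_strength(X, A, alpha=0.5, n_iter=200):
    """Step 2 (assumed form): lasso fit X ~ S @ A via ISTA, S is genes x TFs."""
    X = X - X.mean(axis=1, keepdims=True)
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    S = np.zeros((X.shape[0], A.shape[0]))
    for _ in range(n_iter):
        grad = (S @ A - X) @ A.T
        S = soft_threshold(S - step * grad, step * alpha)
    return S

# Synthetic example: 2 TFs, 30 genes, 40 samples.
rng = np.random.default_rng(0)
act = rng.standard_normal((2, 40))           # hidden TF activities
X = 0.1 * rng.standard_normal((30, 40))      # background noise
X[:10] += 2.0 * act[0]                       # genes 0-9 driven by TF 0
X[10:20] += 2.0 * act[1]                     # genes 10-19 driven by TF 1
motif = np.zeros((30, 2))
motif[:10, 0] = 1
motif[20:23, 0] = 1                          # motif hits on unregulated genes
motif[10:20, 1] = 1
motif[23:26, 1] = 1                          # motif hits on unregulated genes

A = estimate_tf_activity(X, motif)
S = sparse_regulation_strength(X, A)
targets_tf0 = np.flatnonzero(np.abs(S[:, 0]) > 1.0)
print("predicted TF 0 targets:", targets_tf0)
```

In this toy setting the L1 penalty is what turns the strength matrix `S` into a target-gene prediction: genes whose expression is not explained by any estimated activity profile shrink toward zero, so motif hits without matching expression are not reported as targets.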

CONCLUSION

We have developed a new integrated strategy, namely motif-guided sparse decomposition (mSD) of gene expression data, for regulatory module identification. The mSD method features a novel motif-guided clustering method for transcription factor activity estimation by finding a balance between co-regulation and co-expression. The mSD method further utilizes a sparse decomposition method for regulation strength estimation. The experimental results show that such a motif-guided strategy can provide context-specific regulatory modules in both yeast and breast cancer studies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/929d/3072956/4ff67a4c943c/1471-2105-12-82-1.jpg
