Automatic image analysis for gene expression patterns of fly embryos.

Authors

Peng Hanchuan, Long Fuhui, Zhou Jie, Leung Garmay, Eisen Michael B, Myers Eugene W

Affiliation

Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20147, USA.

Publication

BMC Cell Biol. 2007 Jul 10;8 Suppl 1(Suppl 1):S7. doi: 10.1186/1471-2121-8-S1-S7.

Abstract

BACKGROUND

Staining the mRNA of a gene via in situ hybridization (ISH) during the development of a D. melanogaster embryo delivers the detailed spatio-temporal pattern of expression of the gene. Many biological problems such as the detection of co-expressed genes, co-regulated genes, and transcription factor binding motifs rely heavily on the analyses of these image patterns. The increasing availability of ISH image data motivates the development of automated computational approaches to the analysis of gene expression patterns.

RESULTS

We have developed algorithms and associated software that extract a feature representation of a gene expression pattern from an ISH image, cluster genes sharing the same spatio-temporal pattern of expression, suggest transcription factor binding (TFB) site motifs for genes that appear to be co-regulated (based on the clustering), and automatically identify the anatomical regions that express a gene, given a training set of annotations. We developed three different feature representations, based on Gaussian Mixture Models (GMM), Principal Component Analysis (PCA), and wavelet functions, each with different merits with respect to the tasks above. For clustering image patterns, we developed a minimum spanning tree method (MSTCUT). For proposing TFB sites, we used standard motif finders on clustered/co-expressed genes, with the added requirement of conservation across the genomes of 8 related fly species. Lastly, we trained a suite of binary classifiers, one for each anatomical annotation term in a controlled vocabulary or ontology, that operate on the wavelet feature representation. We report the results of applying these methods to the Berkeley Drosophila Genome Project (BDGP) gene expression database.
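
To make the clustering step concrete, the following is a minimal sketch of generic minimum-spanning-tree clustering: build an MST over feature vectors, cut its heaviest edges, and treat the remaining connected components as clusters. It is not the authors' MSTCUT method, whose cutting criterion is not described in this abstract. The function names, the Euclidean distance, the fixed cluster count `k`, and the toy 2-D vectors (standing in for GMM, PCA, or wavelet features) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(
    points: Sequence[Sequence[float]],
) -> List[Tuple[float, int, int]]:
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each node
    best_from = [-1] * n         # tree node that provides that link
    best_dist[0] = 0.0           # start growing the tree from node 0
    edges: List[Tuple[float, int, int]] = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_dist[u], best_from[u], u))
        # Update attachment costs for the nodes still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def mst_cut_clusters(points: Sequence[Sequence[float]], k: int) -> List[int]:
    """Cut the k-1 heaviest MST edges and label the connected components."""
    n = len(points)
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[: max(0, len(edges) - (k - 1))]

    parent = list(range(n))      # union-find over the kept edges

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, i, j in kept:
        parent[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    label_of_root = {}
    return [label_of_root.setdefault(find(i), len(label_of_root))
            for i in range(n)]


# Hypothetical feature vectors for six expression patterns.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_cut_clusters(features, k=2)
```

With these toy vectors, the single long edge joining the two groups is the one removed, so the first three patterns share one label and the last three share another. In practice the number of clusters and the distance measure would need to be chosen for the feature representation in use.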

CONCLUSION

Our automatic image analysis methods recapitulate known co-regulated genes and give correct developmental-stage classifications with more than 99% accuracy, despite variations in morphology, orientation, and focal plane. This suggests that these techniques form a useful set of tools for the large-scale computational analysis of fly embryonic gene expression patterns.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f87c/1924512/14bc9eed864e/1471-2121-8-S1-S7-1.jpg
