Rasko David A, Myers Garry S A, Ravel Jacques
The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA.
BMC Bioinformatics. 2005 Jan 5;6:2. doi: 10.1186/1471-2105-6-2.
The first microbial genome sequence, that of Haemophilus influenzae, was published in 1995. Since then, more than 400 microbial genome sequences have been completed or commenced. This massive influx of data provides the opportunity to obtain biological insights through comparative genomics. However, few tools are available for comparative analysis at this scale.
The BLAST Score Ratio (BSR) approach, implemented in a Perl script, classifies all putative peptides within three genomes using a measure of similarity based on the ratio of BLAST scores. The output of the BSR analysis enables global visualization of the degree of proteome similarity between all three genomes. Additional output enables the genomic synteny (conserved gene order) between each genome pair to be assessed. Furthermore, we extend this synteny analysis by overlaying BSR data as a color dimension, enabling visualization of the degree of similarity of the peptides being compared.
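The idea of a score ratio can be illustrated with a minimal Python sketch. This is not the authors' Perl script. It assumes that the ratio for each reference peptide is its best BLAST score against a query genome divided by its BLAST score against itself, a definition the abstract does not spell out. The function names, the dictionary inputs and the 0.4 cutoff are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def blast_score_ratios(
    self_scores: Dict[str, float],
    query1_scores: Dict[str, float],
    query2_scores: Dict[str, float],
) -> Dict[str, Tuple[float, float]]:
    """Return (ratio vs. query 1, ratio vs. query 2) for each reference peptide.

    Assumption: each ratio is the best-hit score in a query genome divided
    by the peptide's self-hit score, so it falls between 0 and 1.
    """
    ratios = {}
    for peptide, self_score in self_scores.items():
        if self_score <= 0:
            continue  # skip peptides without a usable self-hit score
        # A peptide with no hit in a query genome gets a score of 0.
        r1 = query1_scores.get(peptide, 0.0) / self_score
        r2 = query2_scores.get(peptide, 0.0) / self_score
        ratios[peptide] = (r1, r2)
    return ratios


def classify(ratio_pair: Tuple[float, float], threshold: float = 0.4) -> str:
    """Assign a peptide to a class; the threshold is an illustrative value."""
    in_q1 = ratio_pair[0] >= threshold
    in_q2 = ratio_pair[1] >= threshold
    if in_q1 and in_q2:
        return "conserved in all three genomes"
    if in_q1:
        return "shared with query 1 only"
    if in_q2:
        return "shared with query 2 only"
    return "unique to reference"


# Made-up scores for three reference peptides.
self_scores = {"pepA": 200.0, "pepB": 100.0, "pepC": 50.0}
query1_scores = {"pepA": 180.0, "pepB": 10.0}
query2_scores = {"pepA": 190.0, "pepB": 90.0}

ratios = blast_score_ratios(self_scores, query1_scores, query2_scores)
classes = {peptide: classify(pair) for peptide, pair in ratios.items()}
```

Each peptide ends up with one pair of ratios, which can be plotted as a point on a two-axis scatter plot to give the kind of global view of proteome similarity described above.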
Combining the degree of similarity, synteny and annotation will allow rapid identification of conserved genomic regions as well as a number of common genomic rearrangements such as insertions, deletions and inversions. The script and example visualizations are available at: http://www.microbialgenomics.org/BSR/.