A new measurement of sequence conservation.

Affiliation

Center for Research in Biological Systems, University of California, San Diego, 9500 Gilman Dr. MC0446, La Jolla, CA 92093, USA.

Publication information

BMC Genomics. 2009 Dec 22;10:623. doi: 10.1186/1471-2164-10-623.

Abstract

BACKGROUND

Understanding sequence conservation is important for the study of sequence evolution and for the identification of functional regions of the genome. Current studies often measure sequence conservation based on every position in a contiguous region. As a result, a large number of functional regions that contain conserved segments separated by relatively long divergent segments are overlooked. Our goal in this paper is to define a new measurement of sequence conservation that can detect both contiguously and discontiguously conserved regions. Here and in the following, conserved regions are regions that share similarity above a pre-specified threshold with their homologous regions in other species. Conserved regions are therefore good candidates for functional regions, but they are not necessarily functional. Moreover, conserved regions may contain long divergent segments.

RESULTS

To identify both discontiguously and contiguously conserved regions, we propose a new measurement of sequence conservation that measures sequence similarity based only on the conserved segments within a region. Defining conserved segments with the local alignment tool CHAOS, we used the new measurement to analyze the conservation of 1642 experimentally verified human functional non-coding regions in the mouse genome. We found that the conservation of at least 11% of these functional regions could be missed by current conservation analysis methods. We also found that 72% of the mouse homologous regions identified with the new measurement are more similar to the human functional sequences than the aligned mouse sequences from the UCSC Genome Browser are. We further compared our method with BLAST and discontiguous MegaBLAST and found that it picks up many more conserved segments in these regions than either tool.
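As a rough illustration of why scoring only the conserved segments matters, the Python sketch below contrasts a score computed over every position of a region with a score computed only over its locally aligned segments. It is a sketch under stated assumptions, not the paper's implementation: the `Segment` class, the function names, the example coordinates and match counts, and the 0.7 threshold are all hypothetical, and the abstract gives neither the exact scoring formula nor the CHAOS settings used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Segment:
    """Hypothetical locally aligned (conserved) segment inside a region."""
    start: int    # 0-based start within the region, inclusive
    end: int      # end within the region, exclusive
    matches: int  # identical positions in this segment's alignment

    @property
    def length(self) -> int:
        return self.end - self.start


def whole_region_identity(region_length: int, segments: Sequence[Segment]) -> float:
    """Identity over every position; unaligned positions count as mismatches."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return sum(s.matches for s in segments) / region_length


def segment_only_identity(segments: Sequence[Segment]) -> float:
    """Identity over the conserved segments only; divergent gaps are ignored."""
    aligned = sum(s.length for s in segments)
    if aligned == 0:
        return 0.0
    return sum(s.matches for s in segments) / aligned


def is_conserved(score: float, threshold: float = 0.7) -> bool:
    """Compare a score with a pre-specified threshold (0.7 is an assumed value)."""
    return score >= threshold


# A made-up 1000 bp region: two conserved 100 bp segments
# separated by an 800 bp divergent stretch.
segments = [Segment(0, 100, 90), Segment(900, 1000, 85)]
contiguous_score = whole_region_identity(1000, segments)
segment_score = segment_only_identity(segments)

print(f"whole-region identity: {contiguous_score:.3f}")
print(f"segment-only identity: {segment_score:.3f}")
print("conserved (whole region):", is_conserved(contiguous_score))
print("conserved (segments only):", is_conserved(segment_score))
```

In this made-up case the region falls below the threshold when every position is scored but passes when only the conserved segments are scored, which is the kind of discontiguously conserved region the abstract says position-by-position methods can miss.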

CONCLUSIONS

It is critical to have a measurement of sequence conservation that is based only on the conserved segments within a region. Such a measurement can aid the identification of better local "orthologous" regions. It will also shed light on the identification of new types of conserved functional regions in vertebrate genomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76bf/2807881/11a6aaeefaaa/1471-2164-10-623-1.jpg
