Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

Authors

Algama Manjula, Tasker Edward, Williams Caitlin, Parslow Adam C, Bryson-Richardson Robert J, Keith Jonathan M

Affiliations

School of Mathematical Sciences, Monash University, Melbourne, VIC, 3800, Australia.

School of Biological Sciences, Monash University, Melbourne, VIC, 3800, Australia.

Publication

BMC Genomics. 2017 Mar 27;18(1):259. doi: 10.1186/s12864-017-3645-2.

Abstract

BACKGROUND

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences.

RESULTS

We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focussed elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors.

CONCLUSIONS

This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a77/5369223/ddbfaebc4d20/12864_2017_3645_Fig1_HTML.jpg
