Structured RNAs and synteny regions in the pig genome.

Authors

Anthon Christian, Tafer Hakim, Havgaard Jakob H, Thomsen Bo, Hedegaard Jakob, Seemann Stefan E, Pundhir Sachin, Kehr Stephanie, Bartschat Sebastian, Nielsen Mathilde, Nielsen Rasmus O, Fredholm Merete, Stadler Peter F, Gorodkin Jan

Affiliation

Center for non-coding RNA in Technology and Health, University of Copenhagen, DK-1870 Frederiksberg, Denmark.

Publication

BMC Genomics. 2014 Jun 10;15(1):459. doi: 10.1186/1471-2164-15-459.

Abstract

BACKGROUND

Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial, since far from all ncRNAs are known and the computational models are resource-demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However, a more direct strategy is desired for the increasing number of sequenced mammalian genomes, some of which, such as the pig, are relevant as disease models and production animals.

RESULTS

We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure similarity search as well as class-specific methods, we obtained a conservative set with a total of 3,391 structured RNA loci, of which 1,011 and 2,314, respectively, hold strong sequence and structure similarity to structured RNAs in existing databases. The RNA loci cover 139 cis-regulatory element loci, 58 lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a locally shuffled version of the genome, we obtained no matches at the highest confidence level. Additional analysis of RNA-seq data from a pooled library from 10 different pig tissues added another 165 miRNA loci, yielding an overall annotation of 3,556 structured RNA loci. This annotation represents our best effort at making an automated annotation. To further enhance the reliability, 571 of the 3,556 structured RNAs were manually curated by methods depending on the RNA class, while 1,581 were declared as pseudogenes. We further created a multiple alignment of pig against 20 representative vertebrates, from which RNAz predicted 83,859 de novo RNA loci with conserved RNA structures. Of the RNAz predictions, 528 overlapped with the homology-based annotation or novel miRNAs. We further present a substantial synteny analysis which includes 1,004 lineage-specific de novo RNA loci and 4 ncRNA loci in the known annotation specific for Laurasiatheria (pig, cow, dolphin, horse, cat, dog, hedgehog).
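The counts quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch: the dictionary keys and the function name `summarize` are illustrative labels chosen here, not identifiers from the authors' pipeline, and the numbers are copied directly from the paragraph above.

```python
# Counts as quoted in the RESULTS paragraph; all names are illustrative.
locus_categories = {
    "cis_regulatory_element": 139,
    "lncRNA": 58,
    "annotation_conflict": 11,
    "ncRNA_gene": 3183,
}

ncrna_gene_classes = {
    "miRNA": 359,
    "ribozyme": 8,
    "rRNA": 185,
    "snoRNA": 638,
    "snRNA": 1030,
    "tRNA": 810,
    "other": 153,
}

# miRNA loci added by the RNA-seq analysis of the pooled tissue library.
novel_mirna_loci = 165


def summarize():
    """Return (conservative set size, ncRNA gene total, overall annotation size)."""
    conservative_set = sum(locus_categories.values())
    ncrna_genes = sum(ncrna_gene_classes.values())
    overall = conservative_set + novel_mirna_loci
    return conservative_set, ncrna_genes, overall


if __name__ == "__main__":
    conservative_set, ncrna_genes, overall = summarize()
    print(conservative_set, ncrna_genes, overall)
```

The three totals agree with the figures stated in the text: 3,391 loci in the conservative set, 3,183 ncRNA genes, and 3,556 loci overall.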

CONCLUSIONS

We have obtained one of the most comprehensive annotations for structured ncRNAs of a mammalian genome, which is likely to play central roles in both health modelling and production. The core annotation is available in Ensembl 70 and the complete annotation is available at http://rth.dk/resources/rnannotator/susscr102/version1.02.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24df/4124155/ec8d227bee9c/12864_2013_6316_Fig1_HTML.jpg
