Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs.

Author information

Nelson Andrew D L, Devisetty Upendra K, Palos Kyle, Haug-Baltzell Asher K, Lyons Eric, Beilstein Mark A

Affiliations

Beilstein Lab, School of Plant Sciences, University of Arizona, Tucson, AZ, USA.

CyVerse, Bio5, University of Arizona, Tucson, AZ, USA.

Publication information

Front Genet. 2017 May 9;8:52. doi: 10.3389/fgene.2017.00052. eCollection 2017.

Abstract

Long intergenic non-coding RNAs (lincRNAs) are an abundant and functionally diverse class of eukaryotic transcripts. Reported lincRNA repertoires in mammals vary, but are commonly in the thousands to tens of thousands of transcripts, covering ~90% of the genome. In addition to elucidating function, there is particular interest in understanding the origin and evolution of lincRNAs. Aside from mammals, lincRNA populations have been sparsely sampled, precluding evolutionary analyses focused on their emergence and persistence. Here we present Evolinc, a two-module pipeline designed to facilitate lincRNA discovery and characterize aspects of lincRNA evolution. The first module (Evolinc-I) is a lincRNA identification workflow that also facilitates downstream differential expression analysis and genome browser visualization of identified lincRNAs. The second module (Evolinc-II) is a genomic and transcriptomic comparative analysis workflow that determines the phylogenetic depth to which a lincRNA locus is conserved within a user-defined group of related species. Here we validate lincRNA catalogs generated with Evolinc-I against previously annotated Arabidopsis and human lincRNA data. Evolinc-I recapitulated earlier findings and uncovered an additional 70 Arabidopsis and 43 human lincRNAs. We demonstrate the usefulness of Evolinc-II by examining the evolutionary histories of a public dataset of 5,361 Arabidopsis lincRNAs. We used Evolinc-II to winnow this dataset to 40 lincRNAs conserved across species in Brassicaceae. Finally, we show how Evolinc-II can be used to recover the evolutionary history of a known lincRNA, the human telomerase RNA (TERC). These latter analyses revealed unexpected duplication events as well as the loss and subsequent acquisition of a novel TERC locus in the lineage leading to mice and rats. The Evolinc pipeline is currently integrated in CyVerse's Discovery Environment and is free for use by researchers.
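
The idea of "phylogenetic depth of conservation" mentioned in the abstract can be illustrated with a toy sketch. This is not Evolinc-II's implementation; it only assumes that, for each lincRNA, we already know the set of species in which a homologous locus was detected, and it defines depth as the most distantly related species with a hit. The function names, the lincRNA identifiers, and the hit sets are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def conservation_depth(
    species_by_distance: List[str], species_with_homolog: Set[str]
) -> Optional[str]:
    """Return the most distantly related species with a detected homolog.

    species_by_distance is ordered from the closest relative of the query
    species to the most distant one. Returns None if there is no hit.
    """
    deepest = None
    for species in species_by_distance:
        if species in species_with_homolog:
            deepest = species
    return deepest


def conserved_through(
    hits: Dict[str, Set[str]], species_by_distance: List[str], target: str
) -> List[str]:
    """List lincRNA IDs whose conservation depth reaches target or beyond."""
    target_rank = species_by_distance.index(target)
    kept = []
    for linc_id, found_in in hits.items():
        depth = conservation_depth(species_by_distance, found_in)
        if depth is not None and species_by_distance.index(depth) >= target_rank:
            kept.append(linc_id)
    return kept


# Illustrative species order relative to Arabidopsis thaliana (closest first).
order = ["Arabidopsis lyrata", "Capsella rubella", "Brassica rapa"]

# Made-up homolog detections for three hypothetical lincRNAs.
hits = {
    "lincRNA_A": {"Arabidopsis lyrata"},
    "lincRNA_B": {"Arabidopsis lyrata", "Capsella rubella", "Brassica rapa"},
    "lincRNA_C": set(),
}

deep = conserved_through(hits, order, "Brassica rapa")
```

Filtering by a minimum depth in this way is analogous in spirit to winnowing a large catalog down to the few loci conserved across a family, although the real pipeline's criteria are not reproduced here.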

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0cec/5422434/afcf55161b75/fgene-08-00052-g0001.jpg
