Another lesson from unmapped reads: in-depth analysis of RNA-Seq reads from various horse tissues.

Affiliations

Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland.

Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland.

Publication information

J Appl Genet. 2022 Sep;63(3):571-581. doi: 10.1007/s13353-022-00705-z. Epub 2022 Jun 7.

Abstract

In recent years, a vast amount of sequencing data has been generated and reference genome sequences have improved considerably. Despite these advances, a significant portion of reads still does not map to reference genomes, and these reads have been regarded as junk or artificial sequences. Recent studies have shown that such reads can be useful, e.g., for refining reference genomes or detecting contaminating microorganisms present in the analyzed biological samples. A special case is RNA sequencing (RNA-Seq) reads that come from tissue transcriptomes. Unmapped reads from RNA-Seq have received much less attention than those from whole-genome sequencing. In the horse in particular, unmapped RNA reads have not yet been analyzed. Thus, in this study, we analyzed the unmapped reads originating from the RNA-Seq performed through the Functional Annotation of Animal Genomes (FAANG) project in the horse, using eight different tissues from two mares. We demonstrated that unmapped reads from RNA-Seq could be easily assembled into transcripts relating to many important genes present in the sequences of other mammals. Large portions of these transcripts did not have coding potential and thus can be considered non-coding RNA. Moreover, reads that did not map to the reference genome but aligned to entries in the NCBI database of horse proteins were enriched for biological processes that largely correspond to the functions of the organ from which the RNA was isolated; they are therefore presumably true transcripts of genes associated with cell metabolism in those tissues. In addition, a portion of reads aligned to common pathogenic or neutral microbiota, of which the most common was Brucella spp. These data suggest that unmapped reads can be an important target for in-depth analysis that may substantially enrich the results of initial RNA-Seq experiments for various tissues and organs.
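The abstract does not say how the unmapped reads were separated from the mapped ones, so the following is only a minimal sketch of one common way to do it, not the authors' pipeline. It assumes the alignments are available as SAM text, where bit 0x4 of the FLAG field marks a read as unmapped. The function name `unmapped_read_names` and the sample records are hypothetical and exist only for illustration.

```python
UNMAPPED = 0x4  # SAM FLAG bit meaning "this read is unmapped"


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the 0x4 (unmapped) bit set."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # second SAM column is the bitwise FLAG
        if flag & UNMAPPED:
            names.append(fields[0])  # first SAM column is the read name
    return names


# Hypothetical records: read1 and read4 are mapped, read2 and read3 are not.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGCA\tIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCAT\tIIIII",
    "read4\t16\tchr2\t250\t60\t5M\t*\t0\t0\tCCATG\tIIIII",
]

print(unmapped_read_names(example))  # ['read2', 'read3']
```

Reads selected this way would then be the input to downstream steps such as those the abstract lists (assembly, coding-potential assessment, protein alignment, microbial classification), none of which are modeled here.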

