A Full-Length mRNA Transcriptome Generated From Hybrid-Corrected PacBio Long-Reads Improves the Transcript Annotation and Identifies Thousands of Novel Splice Variants in Atlantic Salmon.

Author information

Ramberg Sigmund, Høyheim Bjørn, Østbye Tone-Kari Knutsdatter, Andreassen Rune

Affiliations

Department of Life Sciences and Health, Faculty of Health Sciences, OsloMet - Oslo Metropolitan University, Oslo, Norway.

Department of Preclinical Sciences and Pathology, Faculty of Veterinary Medicine, Norwegian University of Life Sciences, Ås, Norway.

Publication information

Front Genet. 2021 Apr 27;12:656334. doi: 10.3389/fgene.2021.656334. eCollection 2021.

Abstract

Atlantic salmon (Salmo salar) is a major species produced in world aquaculture and an important vertebrate model organism for studying the process of rediploidization following whole genome duplication events (Ss4R, 80 mya). The current transcriptome is largely generated from genome sequence based predictions supported by ESTs and short-read sequencing data. However, recent progress in long-read sequencing technologies now allows for full-length transcript sequencing from single RNA-molecules. This study provides a full-length mRNA transcriptome from liver, head-kidney and gill materials. A pipeline was developed based on Iso-seq sequencing of long-reads on the PacBio platform (HQ reads) followed by error-correction of the HQ reads by short-reads from the Illumina platform. The pipeline successfully processed more than 1.5 million long-reads and more than 900 million short-reads into error-corrected HQ reads. A surprisingly high percentage (32%) represented expressed interspersed repeats, while the remainder were processed into 71 461 full-length mRNAs from 23 071 loci. Each transcript was supported by several single-molecule long-read sequences and at least three short-reads, assuring a high sequence accuracy. On average, each gene was represented by three isoforms. Comparisons to the current Atlantic salmon transcripts in the RefSeq database showed that the long-read transcriptome validated 25% of all known transcripts, while the remaining full-length transcripts were novel isoforms, but few were transcripts from novel genes. A comparison to the current genome assembly indicates that the long-read transcriptome may aid in improving transcript annotation as well as provide long-read linkage information useful for improving the genome assembly.
More than 80% of transcripts were assigned GO terms, and thousands of transcripts were from genes or splice-variants expressed in an organ-specific manner, demonstrating that hybrid error-corrected long-read transcriptomes may be applied to study genes and splice-variants expressed in certain organs or conditions (e.g., challenge materials). In conclusion, this is the single largest contribution of full-length mRNAs in Atlantic salmon. The results will be of great value to salmon genomics research, and the pipeline outlined may be applied to generate additional transcriptomes in Atlantic salmon or applied for similar projects in other species.
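
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the authors' pipeline. It assumes that "gene" and "locus" are used interchangeably in the abstract and that the 32% repeat share refers to the error-corrected HQ reads. The variable and function names are hypothetical.

```python
# Figures quoted in the abstract (variable names are illustrative only).
full_length_mrnas = 71_461   # full-length mRNAs reported
loci = 23_071                # loci those mRNAs map to
repeat_fraction = 0.32       # share described as expressed interspersed repeats


def mean_isoforms_per_locus(n_transcripts: int, n_loci: int) -> float:
    """Average number of transcripts (isoforms) per locus."""
    return n_transcripts / n_loci


# Assumption: one locus corresponds to one gene, as the abstract implies.
mean_isoforms = mean_isoforms_per_locus(full_length_mrnas, loci)

# Share left over after setting aside the repeat-derived portion.
non_repeat_fraction = 1.0 - repeat_fraction

print(f"{mean_isoforms:.2f} isoforms per locus")       # 3.10
print(f"{non_repeat_fraction:.0%} not repeat-derived")  # 68%
```

The ratio of about 3.1 is consistent with the abstract's statement that each gene was represented by three isoforms on average.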

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/031d/8110904/9f6c2e5d3c8a/fgene-12-656334-g001.jpg
