Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

Authors

Al-Tobasei Rafet, Paneru Bam, Salem Mohamed

Affiliations

Computational Science Program, Middle Tennessee State University, Murfreesboro, TN, 37132, United States of America.

Department of Biology and Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, 37132, United States of America.

Publication information

PLoS One. 2016 Feb 19;11(2):e0148940. doi: 10.1371/journal.pone.0148940. eCollection 2016.

Abstract

The ENCODE project revealed that ~70% of the human genome is transcribed. Only 1-2% of these RNAs encode proteins; the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes, including regulation of gene expression. LncRNAs show low expression levels and low sequence conservation, which makes their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the reference genome using the TopHat and Cufflinks software. Transcripts that were shorter than 200 nt, that contained an ORF longer than 83-100 amino acids, or that had significant homology to the NCBI nr protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency, between 31,195 and 54,503 lncRNAs were identified, of which only 421 matched known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
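
The first two filters described in the abstract (minimum transcript length and maximum ORF length) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the six-frame ATG-to-stop ORF definition, and the default thresholds (200 nt, 100 amino acids) are assumptions chosen for illustration, and the homology search and protein-coding-score steps are not modeled.

```python
# Illustrative sketch only; ORF definition and function names are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_aa(seq):
    """Length (in amino acids) of the longest ATG-to-stop ORF in six frames.

    ORFs that run off the end of the transcript without a stop codon
    are ignored in this simplified version.
    """
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, (i - start) // 3)  # codons before the stop
                    start = None
    return best


def passes_basic_filters(seq, min_len_nt=200, max_orf_aa=100):
    """Keep transcripts that are long enough and lack a long ORF."""
    return len(seq) >= min_len_nt and longest_orf_aa(seq) <= max_orf_aa
```

Lowering `max_orf_aa` toward 83 makes the filter stricter, which mirrors how the reported lncRNA count varies with filtering stringency.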

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d95f/4764514/19a8aaf3a913/pone.0148940.g001.jpg
