Comparative analysis on the expression of L1 loci using various RNA-Seq preparations.

Author information

Kaul Tiffany, Morales Maria E, Sartor Alton O, Belancio Victoria P, Deininger Prescott

Affiliations

1. Tulane Cancer Center, Tulane Health Sciences Center, 1700 Tulane Ave, New Orleans, LA 70112 USA.

2. Section of Hematology and Oncology, Department of Medicine, Tulane School of Medicine, 1430 Tulane Ave, New Orleans, LA 70112 USA.

Publication information

Mob DNA. 2020 Jan 6;11:2. doi: 10.1186/s13100-019-0194-z. eCollection 2020.

Abstract

BACKGROUND

Retrotransposons are one of the oldest evolutionary forces shaping mammalian genomes, with the ability to mobilize from one genomic location to another. This mobilization is also a significant factor in human disease. The only autonomous human retroelement, L1, has propagated to make up 17% of the human genome, accumulating over 500,000 copies. The majority of these loci are truncated or defective with only a few reported to remain capable of retrotransposition. We have previously published a strand-specific RNA-Seq bioinformatics approach to stringently identify at the locus-specific level the few expressed full-length L1s using cytoplasmic RNA. With growing repositories of RNA-Seq data, there is potential to mine these datasets to identify and study expressed L1s at single-locus resolution, although many datasets are not strand-specific or not generated from cytoplasmic RNA.

RESULTS

We generated whole-cell, cytoplasmic, and nuclear RNA-Seq datasets from 22Rv1 prostate cancer cells to test how different preparations affect the quality of L1 expression measurements and the effort needed to obtain them. We found minimal data loss in the identification of full-length expressed L1s using whole-cell, strand-specific RNA-Seq data compared to cytoplasmic, strand-specific RNA-Seq data. However, this was only possible with an increased amount of manual curation of the bioinformatics output to eliminate the increased background. About half of the data was lost when the sequenced datasets were non-strand-specific.

CONCLUSIONS

The results of these studies demonstrate that, with rigorous manual curation, stranded RNA-Seq datasets allow identification of expressed L1 loci from either cytoplasmic or whole-cell RNA preparations.
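To illustrate why strand information matters for locus-level L1 calls, here is a minimal sketch. It is not the authors' pipeline. The `Locus` and `Read` types, the `count_reads` function, and all coordinates are hypothetical, and real analyses would work from aligned reads and L1 annotations and would still need the manual curation described above. The sketch only shows that a stranded library lets antisense reads be excluded from an L1 locus, while an unstranded library cannot separate them.

```python
from collections import Counter
from typing import NamedTuple


class Locus(NamedTuple):
    """Hypothetical full-length L1 annotation (half-open coordinates)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # '+' or '-'


class Read(NamedTuple):
    """Hypothetical aligned read; strand is that of the source transcript."""
    chrom: str
    start: int
    end: int
    strand: str  # '+' or '-'


def overlaps(read: Read, locus: Locus) -> bool:
    # Half-open interval overlap on the same chromosome.
    return (read.chrom == locus.chrom
            and read.start < locus.end
            and locus.start < read.end)


def count_reads(reads, loci, stranded=True) -> Counter:
    """Count reads per locus; if stranded, keep only sense-strand reads."""
    counts = Counter({locus.name: 0 for locus in loci})
    for read in reads:
        for locus in loci:
            if not overlaps(read, locus):
                continue
            if stranded and read.strand != locus.strand:
                continue  # antisense read: likely unrelated transcription
            counts[locus.name] += 1
    return counts


# Toy data (assumed values, for illustration only).
loci = [Locus("L1_example_a", "chr1", 1000, 7000, "+")]
reads = [
    Read("chr1", 1200, 1300, "+"),  # sense, inside the locus
    Read("chr1", 2500, 2600, "-"),  # antisense, inside the locus
    Read("chr1", 6900, 7050, "+"),  # sense, overlapping the 3' end
    Read("chr2", 1200, 1300, "+"),  # different chromosome
]

stranded_counts = count_reads(reads, loci, stranded=True)
unstranded_counts = count_reads(reads, loci, stranded=False)
print(stranded_counts["L1_example_a"], unstranded_counts["L1_example_a"])
```

In this toy case the unstranded count includes the antisense read, which is the kind of background that strand-specific data allow one to filter out.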

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/022f/6945437/dd29152b48f3/13100_2019_194_Fig1_HTML.jpg